Professional Affiliations and Activities


  • Member, IEEE (2022 - Present)

All Publications


  • GRAD-NAV++: Vision-Language Model Enabled Visual Drone Navigation With Gaussian Radiance Fields and Differentiable Dynamics IEEE ROBOTICS AND AUTOMATION LETTERS Chen, Q., Gao, N., Huang, S., Low, J., Chen, T., Sun, J., Schwager, M. 2026; 11 (2): 1418-1425
  • Foundation models in robotics: Applications, challenges, and the future INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH Firoozi, R., Tucker, J., Tian, S., Majumdar, A., Sun, J., Liu, W., Zhu, Y., Song, S., Kapoor, A., Hausman, K., Ichter, B., Driess, D., Wu, J., Lu, C., Schwager, M. 2024
  • Localization and recognition of human action in 3D using transformers. COMMUNICATIONS ENGINEERING Sun, J., Huang, L., Wang, H., Zheng, C., Qiu, J., Islam, M. T., Xie, E., Zhou, B., Xing, L., Chandrasekaran, A., Black, M. J. 2024; 3 (1): 125

    Abstract

    Understanding a person's behavior from their 3D motion sequence is a fundamental problem in computer vision with many applications. An important component of this problem is 3D action localization, which involves recognizing what actions a person is performing and when those actions occur in the sequence. To promote progress in the 3D action localization community, we introduce a new, challenging, and more complex benchmark dataset, BABEL-TAL (BT), for 3D action localization. Strong baselines, evaluation metrics, and human evaluations are carefully established on this benchmark. We also propose a strong baseline model, Localizing Actions with Transformers (LocATe), that jointly localizes and recognizes actions in a 3D sequence. LocATe shows superior performance on BABEL-TAL as well as on the large-scale PKU-MMD dataset, achieving state-of-the-art performance using only 10% of the labeled training data. Our research could advance the development of more accurate and efficient systems for human behavior analysis, with potential applications in areas such as human-computer interaction and healthcare.

    DOI: 10.1038/s44172-024-00272-7

    PubMedID: 39227676

    PubMedCentralID: PMC11372174

  • Open X-Embodiment: Robotic Learning Datasets and RT-X Models O'Neill, A., Rehman, A., Gupta, A., Maddukuri, A., Gupta, A., Padalkar, A., Lee, A., Pooley, A., Gupta, A., Mandlekar, A., Jain, A., Tung, A., Bewley, A., Herzog, A., Irpan, A., Khazatsky, A., Rai, A., Gupta, A., Wang, A., Kolobov, A., Singh, A., Garg, A., Kembhavi, A., Xie, A., Brohan, A., Raffin, A., Sharma, A., Yavary, A., Jain, A., Balakrishna, A., Wahid, A., Burgess-Limerick, B., Kim, B., Scholkopf, B., Wulfe, B., Ichter, B., Lu, C., Xu, C., Le, C., Finn, C., Wang, C., Xu, C., Chi, C., Huang, C., Chan, C., Agia, C., Pan, C., Fu, C., Devin, C., Xu, D., Morton, D., Driess, D., Chen, D., Pathak, D., Shah, D., Buchler, D., Jayaraman, D., Kalashnikov, D., Sadigh, D., Johns, E., Foster, E., Liu, F., Ceola, F., Xia, F., Zhao, F., Frujeri, F., Stulp, F., Zhou, G., Sukhatme, G. S., Salhotra, G., Yan, G., Feng, G., Schiavi, G., Berseth, G., Kahn, G., Yang, G., Wang, G., Su, H., Fang, H., Shi, H., Bao, H., Ben Amor, H., Christensen, H., Furuta, H., Bharadhwaj, H., Walke, H., Fang, H., Ha, H., Mordatch, I., Radosavovic, I., Leal, I., Liang, J., Abou-Chakra, J., Kim, J., Drake, J., Peters, J., Schneider, J., Hsu, J., Vakil, J., Bohg, J., Bingham, J., Wu, J., Gao, J., Hu, J., Wu, J., Wu, J., Sun, J., Luo, J., Gu, J., Tan, J., Oh, J., Wu, J., Lu, J., Yang, J., Malik, J., Silverio, J., Hejna, J., Booher, J., Tompson, J., Yang, J., Salvador, J., Lim, J. J., Han, J., Wang, K., Rao, K., Pertsch, K., Hausman, K., Go, K., Gopalakrishnan, K., Goldberg, K., Byrne, K., Oslund, K., Kawaharazuka, K., Black, K., Lin, K., Zhang, K., Ehsani, K., Lekkala, K., Ellis, K., Rana, K., Srinivasan, K., Fang, K., Singh, K., Zeng, K., Hatch, K., Hsu, K., Itti, L., Chen, L., Pinto, L., Fei-Fei, L., Tan, L., Fan, L., Ott, L., Lee, L., Weihs, L., Chen, M., Lepert, M., Memmel, M., Tomizuka, M., Itkina, M., Castro, M., Spero, M., Du, M., Ahn, M., Yip, M. C., Zhang, M., Ding, M., Heo, M., Srirama, M., Sharma, M., Kim, M., Kanazawa, N., Hansen, N., Heess, N., Joshi, N. J., Suenderhauf, N., Liu, N., Di Palo, N., Shafiullah, N., Mees, O., Kroemer, O., Bastani, O., Sanketi, P. R., Miller, P., Yin, P., Wohlhart, P., Xu, P., Fagan, P., Mitrano, P., Sermanet, P., Abbeel, P., Sundaresan, P., Chen, Q., Vuong, Q., Rafailov, R., Tian, R., Doshi, R., Martin-Martin, R., Baijal, R., Scalise, R., Hendrix, R., Lin, R., Qian, R., Zhang, R., Mendonca, R., Shah, R., Hoque, R., Julian, R., Bustamante, S., Kirmani, S., Levine, S., Lin, S., Moore, S., Bahl, S., Dass, S., Sonawani, S., Tulsiani, S., Song, S., Xu, S., Haldar, S., Karamcheti, S., Adebola, S., Guist, S., Nasiriany, S., Schaal, S., Welker, S., Tian, S., Ramamoorthy, S., Dasari, S., Belkhale, S., Park, S., Nair, S., Mirchandani, S., Osa, T., Gupta, T., Harada, T., Matsushima, T., Xiao, T., Kollar, T., Yu, T., Ding, T., Davchev, T., Zhao, T. Z., Armstrong, T., Darrell, T., Chung, T., Jain, V., Kumar, V., Vanhoucke, V., Zhan, W., Zhou, W., Burgard, W., Chen, X., Chen, X., Wang, X., Zhu, X., Geng, X., Liu, X., Xu, L., Li, X., Pang, Y., Lu, Y., Ma, Y., Kim, Y., Chebotar, Y., Zhou, Y., Zhu, Y., Wu, Y., Xu, Y., Wang, Y., Bisk, Y., Dou, Y., Cho, Y., Lee, Y., Cui, Y., Cao, Y., Wu, Y., Tang, Y., Zhu, Y., Zhang, Y., Jiang, Y., Li, Y., Li, Y., Iwasawa, Y., Matsuo, Y., Ma, Z., Xu, Z., Cui, Z., Zhang, Z., Fu, Z., Lin, Z. IEEE. 2024: 6892-6903
  • Large AI Models in Health Informatics: Applications, Challenges, and the Future. IEEE journal of biomedical and health informatics Qiu, J., Li, L., Sun, J., Peng, J., Shi, P., Zhang, R., Dong, Y., Lam, K., Lo, F. P., Xiao, B., Yuan, W., Wang, N., Xu, D., Lo, B. 2023; 27 (12): 6074-6087

    Abstract

    Large AI models, or foundation models, are recently emerging models of massive scale both parameter-wise and data-wise, the magnitudes of which can reach beyond billions. Once pretrained, large AI models demonstrate impressive performance in various downstream tasks. A prime example is ChatGPT, whose capability has captured people's imagination about the far-reaching influence that large AI models can have and their potential to transform different domains of our lives. In health informatics, the advent of large AI models has brought new paradigms for the design of methodologies. The scale of multi-modal data in the biomedical and health domain has been ever-expanding, especially since the community embraced the era of deep learning, which provides the foundation to develop, validate, and advance large AI models for breakthroughs in health-related areas. This article presents a comprehensive review of large AI models, from their background to their applications. We identify seven key sectors in which large AI models are applicable and might have substantial influence, including: 1) bioinformatics; 2) medical diagnosis; 3) medical imaging; 4) medical informatics; 5) medical education; 6) public health; and 7) medical robotics. We examine their challenges, followed by a critical discussion about potential future directions and pitfalls of large AI models in transforming the field of health informatics.

    DOI: 10.1109/JBHI.2023.3316750

    PubMedID: 37738186

  • NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance Fields IEEE ROBOTICS AND AUTOMATION LETTERS Sun, J., Xu, Y., Ding, M., Yi, H., Wang, C., Wang, J., Zhang, L., Schwager, M. 2023; 8 (8): 5244-5250
  • Connected Autonomous Vehicle Motion Planning with Video Predictions from Smart, Self-Supervised Infrastructure Sun, J., Kousik, S., Fridovich-Keil, D., Schwager, M. IEEE. 2023: 1721-1726
  • MimicPlay: Long-Horizon Imitation Learning by Watching Human Play Wang, C., Fan, L., Sun, J., Zhang, R., Li Fei-Fei, Xu, D., Zhu, Y., Anandkumar, A. edited by Tan, J., Toussaint, M., Darvish, K. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
  • Conformal Prediction for Uncertainty-Aware Planning with Diffusion Dynamics Model Sun, J., Jiang, Y., Qiu, J., Nobel, P., Kochenderfer, M., Schwager, M. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • Egocentric Human Trajectory Forecasting With a Wearable Camera and Multi-Modal Fusion IEEE ROBOTICS AND AUTOMATION LETTERS Qiu, J., Chen, L., Gu, X., Lo, F., Tsai, Y., Sun, J., Liu, J., Lo, B. 2022; 7 (4): 8799-8806
  • PlaTe: Visually-Grounded Planning With Transformers in Procedural Tasks IEEE ROBOTICS AND AUTOMATION LETTERS Sun, J., Huang, D., Lu, B., Liu, Y., Zhou, B., Garg, A. 2022; 7 (2): 4924-4930
  • Self-Supervised Traffic Advisors: Distributed, Multi-view Traffic Prediction for Smart Cities Sun, J., Kousik, S., Fridovich-Keil, D., Schwager, M. IEEE. 2022: 917-922
  • Adversarial Inverse Reinforcement Learning With Self-Attention Dynamics Model IEEE ROBOTICS AND AUTOMATION LETTERS Sun, J., Yu, L., Dong, P., Lu, B., Zhou, B. 2021; 6 (2): 1880-1886