Jiankai Sun
Ph.D. Student in Aeronautics and Astronautics, admitted Autumn 2022
Professional Affiliations and Activities
-
Member, IEEE (2022 - Present)
All Publications
-
GRAD-NAV++: Vision-Language Model Enabled Visual Drone Navigation With Gaussian Radiance Fields and Differentiable Dynamics
IEEE ROBOTICS AND AUTOMATION LETTERS
2026; 11 (2): 1418-1425
View details for DOI 10.1109/LRA.2025.3643290
View details for Web of Science ID 001641470800007
-
Foundation models in robotics: Applications, challenges, and the future
INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH
2024
View details for DOI 10.1177/02783649241281508
View details for Web of Science ID 001319814900001
-
Localization and recognition of human action in 3D using transformers.
Communications engineering
2024; 3 (1): 125
Abstract
Understanding a person's behavior from their 3D motion sequence is a fundamental problem in computer vision with many applications. An important component of this problem is 3D action localization, which involves recognizing what actions a person is performing, and when the actions occur in the sequence. To promote the progress of the 3D action localization community, we introduce a new, challenging, and more complex benchmark dataset, BABEL-TAL (BT), for 3D action localization. Important baselines and evaluating metrics, as well as human evaluations, are carefully established on this benchmark. We also propose a strong baseline model, i.e., Localizing Actions with Transformers (LocATe), that jointly localizes and recognizes actions in a 3D sequence. The proposed LocATe shows superior performance on BABEL-TAL as well as on the large-scale PKU-MMD dataset, achieving state-of-the-art performance by using only 10% of the labeled training data. Our research could advance the development of more accurate and efficient systems for human behavior analysis, with potential applications in areas such as human-computer interaction and healthcare.
View details for DOI 10.1038/s44172-024-00272-7
View details for PubMedID 39227676
View details for PubMedCentralID PMC11372174
-
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
IEEE. 2024: 6892-6903
View details for DOI 10.1109/ICRA57147.2024.10611477
View details for Web of Science ID 001294576205021
-
Large AI Models in Health Informatics: Applications, Challenges, and the Future.
IEEE journal of biomedical and health informatics
2023; 27 (12): 6074-6087
Abstract
Large AI models, or foundation models, are models recently emerging with massive scales both parameter-wise and data-wise, the magnitudes of which can reach beyond billions. Once pretrained, large AI models demonstrate impressive performance in various downstream tasks. A prime example is ChatGPT, whose capability has compelled people's imagination about the far-reaching influence that large AI models can have and their potential to transform different domains of our lives. In health informatics, the advent of large AI models has brought new paradigms for the design of methodologies. The scale of multi-modal data in the biomedical and health domain has been ever-expanding especially since the community embraced the era of deep learning, which provides the ground to develop, validate, and advance large AI models for breakthroughs in health-related areas. This article presents a comprehensive review of large AI models, from background to their applications. We identify seven key sectors in which large AI models are applicable and might have substantial influence, including: 1) bioinformatics; 2) medical diagnosis; 3) medical imaging; 4) medical informatics; 5) medical education; 6) public health; and 7) medical robotics. We examine their challenges, followed by a critical discussion about potential future directions and pitfalls of large AI models in transforming the field of health informatics.
View details for DOI 10.1109/JBHI.2023.3316750
View details for PubMedID 37738186
-
NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance Fields
IEEE ROBOTICS AND AUTOMATION LETTERS
2023; 8 (8): 5244-5250
View details for DOI 10.1109/LRA.2023.3293308
View details for Web of Science ID 001030616500013
-
Connected Autonomous Vehicle Motion Planning with Video Predictions from Smart, Self-Supervised Infrastructure
IEEE. 2023: 1721-1726
View details for Web of Science ID 001178996701109
-
MimicPlay: Long-Horizon Imitation Learning by Watching Human Play
edited by Tan, J., Toussaint, M., Darvish, K.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
View details for Web of Science ID 001221201500009
-
Conformal Prediction for Uncertainty-Aware Planning with Diffusion Dynamics Model
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001220600004034
-
Egocentric Human Trajectory Forecasting With a Wearable Camera and Multi-Modal Fusion
IEEE ROBOTICS AND AUTOMATION LETTERS
2022; 7 (4): 8799-8806
View details for DOI 10.1109/LRA.2022.3188101
View details for Web of Science ID 000838567100031
-
PlaTe: Visually-Grounded Planning With Transformers in Procedural Tasks
IEEE ROBOTICS AND AUTOMATION LETTERS
2022; 7 (2): 4924-4930
View details for DOI 10.1109/LRA.2022.3150855
View details for Web of Science ID 000766627200007
-
Self-Supervised Traffic Advisors: Distributed, Multi-view Traffic Prediction for Smart Cities
IEEE. 2022: 917-922
View details for DOI 10.1109/ITSC55140.2022.9922340
View details for Web of Science ID 000934720600142
-
Adversarial Inverse Reinforcement Learning With Self-Attention Dynamics Model
IEEE ROBOTICS AND AUTOMATION LETTERS
2021; 6 (2): 1880-1886
View details for DOI 10.1109/LRA.2021.3061397
View details for Web of Science ID 000629028400024