Bio


Juan Carlos Niebles received an Engineering degree in Electronics from Universidad del Norte (Colombia) in 2002, an M.Sc. degree in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign in 2007, and a Ph.D. degree in Electrical Engineering from Princeton University in 2011. He has been a Senior Research Scientist at the Stanford AI Lab and Associate Director of Research at the Stanford-Toyota Center for AI Research since 2015, and an Associate Professor of Electrical and Electronic Engineering at Universidad del Norte (Colombia) since 2011. His research interests are in computer vision and machine learning, with a focus on visual recognition and understanding of human actions and activities, objects, scenes, and events. He is a recipient of a Google Faculty Research Award (2015), the Microsoft Research Faculty Fellowship (2012), a Google Research Award (2011), and a Fulbright Fellowship (2005).

Academic Appointments


  • Senior Research Engineer, Computer Science

Honors & Awards


  • Faculty Research Award, Google (2015)
  • Senior Member, IEEE (2015)
  • Faculty Fellow, Microsoft Research (2012)
  • Research Award, Google (2011)
  • Fulbright PhD Fellowship, Fulbright-Colciencias-DNP (2005)

Boards, Advisory Committees, Professional Organizations


  • Steering Committee, AI Index (2018 - Present)
  • Associate Director of Research, Stanford AI Lab-Toyota Center for AI Research (2015 - Present)
  • Senior Member, IEEE (2015 - Present)
  • Member, IEEE Computer Society (2014 - Present)
  • Member, IEEE (2007 - Present)

Professional Education


  • Ph.D., Princeton University, Electrical Engineering (2011)
  • M.A., Princeton University, Electrical Engineering (2009)
  • M.Sc., University of Illinois at Urbana-Champaign, Electrical and Computer Engineering (2007)
  • Engineer, Universidad del Norte, Electronics Engineering (2002)

Current Research and Scholarly Interests


My research is in computer vision. Its goal is to enable computers and robots to perceive the visual world by developing novel computer vision algorithms for the automatic analysis of images and videos. From a scientific point of view, we tackle fundamental open problems in computer vision related to the visual recognition and understanding of human actions and activities, objects, scenes, and events. From an application perspective, we develop systems that solve practical real-world problems by introducing cutting-edge computer vision technologies into new application domains.

Stanford Advisees


All Publications


  • Quantifying Parkinson's disease motor severity under uncertainty using MDS-UPDRS videos. Medical image analysis Lu, M., Zhao, Q., Poston, K. L., Sullivan, E. V., Pfefferbaum, A., Shahid, M., Katz, M., Kouhsari, L. M., Schulman, K., Milstein, A., Niebles, J. C., Henderson, V. W., Fei-Fei, L., Pohl, K. M., Adeli, E. 2021; 73: 102179

    Abstract

    Parkinson's disease (PD) is a brain disorder that primarily affects motor function, leading to slow movement, tremor, and stiffness, as well as postural instability and difficulty with walking/balance. The severity of PD motor impairments is clinically assessed by part III of the Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS), a universally accepted rating scale. However, experts often disagree on the exact scoring of individuals. In the presence of label noise, training a machine learning model using only scores from a single rater may introduce bias, while training models with multiple noisy ratings is a challenging task due to the inter-rater variabilities. In this paper, we introduce an ordinal focal neural network to estimate the MDS-UPDRS scores from input videos, to leverage the ordinal nature of MDS-UPDRS scores and combat class imbalance. To handle multiple noisy labels per exam, the training of the network is regularized via rater confusion estimation (RCE), which encodes the rating habits and skills of raters via a confusion matrix. We apply our pipeline to estimate MDS-UPDRS test scores from their video recordings including gait (with multiple raters, R=3) and finger tapping scores (single rater). On a sizable clinical dataset for the gait test (N=55), we obtained a classification accuracy of 72% with majority vote as ground-truth, and an accuracy of ∼84% of our model predicting at least one of the raters' scores. Our work demonstrates how computer-assisted technologies can be used to track patients and their motor impairments, even when there is uncertainty in the clinical ratings. The latest version of the code will be available at https://github.com/mlu355/PD-Motor-Severity-Estimation.

    View details for DOI 10.1016/j.media.2021.102179

    View details for PubMedID 34340101
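The rater confusion estimation (RCE) idea from the abstract above can be sketched briefly: each rater gets a row-stochastic confusion matrix, and the model's class posterior is passed through that matrix before scoring against the rater's noisy label. This is a minimal numpy illustration under assumed shapes and names (`rce_loss`, `confusions` are mine, not from the paper); the authors' actual implementation is at the GitHub link in the abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def rce_loss(model_probs, rater_labels, confusions):
    """Sketch of an RCE-style loss (assumed interface, not the authors' code).

    model_probs:  (N, K) model posterior over K ordinal scores.
    rater_labels: (R, N) integer score assigned by each of R raters.
    confusions:   (R, K, K) per-rater confusion matrix; row [true, :] is the
                  probability that the rater reports each label when the
                  true label is `true`.
    Returns the mean negative log-likelihood of the raters' labels.
    """
    R, N = rater_labels.shape
    total = 0.0
    for r in range(R):
        # Probability rater r reports label j: sum_k p(k) * C_r[k, j]
        rater_probs = model_probs @ confusions[r]  # (N, K)
        total += -np.log(
            rater_probs[np.arange(N), rater_labels[r]] + 1e-12
        ).mean()
    return total / R

# Toy usage: 2 raters, 4 exams, 3 score classes.
rng = np.random.default_rng(0)
probs = softmax(rng.normal(size=(4, 3)))
labels = rng.integers(0, 3, size=(2, 4))
C = softmax(rng.normal(size=(2, 3, 3)))  # random row-stochastic matrices
loss = rce_loss(probs, labels, C)
```

In a real training loop the confusion matrices would be learned jointly with the network, letting the model absorb systematic rater biases instead of averaging them away.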

  • Vision-based Estimation of MDS-UPDRS Gait Scores for Assessing Parkinson's Disease Motor Severity. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Lu, M., Poston, K., Pfefferbaum, A., Sullivan, E. V., Fei-Fei, L., Pohl, K. M., Niebles, J. C., Adeli, E. 2020; 12263: 637–47

    Abstract

    Parkinson's disease (PD) is a progressive neurological disorder primarily affecting motor function, resulting in tremor at rest, rigidity, bradykinesia, and postural instability. The physical severity of PD impairments can be quantified through the Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS), a widely used clinical rating scale. Accurate and quantitative assessment of disease progression is critical to developing a treatment that slows or stops further advancement of the disease. Prior work has mainly focused on dopamine transport neuroimaging for diagnosis or costly and intrusive wearables evaluating motor impairments. For the first time, we propose a computer vision-based model that observes non-intrusive video recordings of individuals, extracts their 3D body skeletons, tracks them through time, and classifies the movements according to the MDS-UPDRS gait scores. Experimental results show that our proposed method performs significantly better than chance and competing methods with an F1-score of 0.83 and a balanced accuracy of 81%. This is the first benchmark for classifying PD patients based on MDS-UPDRS gait severity and could be an objective biomarker for disease severity. Our work demonstrates how computer-assisted technologies can be used to non-intrusively monitor patients and their motor impairments. The code is available at https://github.com/mlu355/PD-Motor-Severity-Estimation.

    View details for DOI 10.1007/978-3-030-59716-0_61

    View details for PubMedID 33103164
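The balanced accuracy reported in the abstract above is the mean of per-class recalls, which avoids rewarding a classifier that only predicts the majority MDS-UPDRS score on an imbalanced clinical dataset. A minimal sketch (the function name is my own, not from the paper):

```python
import numpy as np

def balanced_accuracy(y_true, y_pred, num_classes):
    """Mean of per-class recalls; classes absent from y_true are skipped."""
    recalls = []
    for c in range(num_classes):
        mask = (y_true == c)
        if mask.any():
            # Recall for class c: fraction of true-c samples predicted as c.
            recalls.append((y_pred[mask] == c).mean())
    return float(np.mean(recalls))

# Example: perfect on class 0, half right on class 1.
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 0, 1, 0])
balanced_accuracy(y_true, y_pred, 2)  # → 0.75
```

Plain accuracy on the same example would be 0.75 as well, but on a skewed label distribution the two metrics diverge, which is why the paper reports balanced accuracy alongside the F1-score.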

  • Socially and Contextually Aware Human Motion and Pose Forecasting IEEE ROBOTICS AND AUTOMATION LETTERS Adeli, V., Adeli, E., Reid, I., Niebles, J., Rezatofighi, H. 2020; 5 (4): 6033–40
  • Explaining VQA predictions using visual grounding and a knowledge base IMAGE AND VISION COMPUTING Riquelme, F., De Goyeneche, A., Zhang, Y., Niebles, J., Soto, A. 2020; 101
  • Segmenting the Future IEEE ROBOTICS AND AUTOMATION LETTERS Chiu, H., Adeli, E., Niebles, J. 2020; 5 (3): 4202–9
  • Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction IEEE ROBOTICS AND AUTOMATION LETTERS Liu, B., Adeli, E., Cao, Z., Lee, K., Shenoi, A., Gaidon, A., Niebles, J. 2020; 5 (2): 3485–92
  • Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision Mangalam, K., Adeli, E., Lee, K., Gaidon, A., Niebles, J., IEEE Comp Soc IEEE COMPUTER SOC. 2020: 2773–82
  • Adversarial Cross-Domain Action Recognition with Co-Attention Pan, B., Cao, Z., Adeli, E., Niebles, J., Assoc Advancement Artificial Intelligence ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2020: 11815-11822
  • Action-Agnostic Human Pose Forecasting Chiu, H., Adeli, E., Wang, B., Huang, D., Niebles, J., IEEE IEEE. 2019: 1423–32
  • Learning Temporal Action Proposals With Fewer Labels Ji, J., Cao, K., Niebles, J., IEEE IEEE. 2019: 7072–81
  • Imitation Learning for Human Pose Prediction Wang, B., Adeli, E., Chiu, H., Huang, D., Niebles, J., IEEE IEEE. 2019: 7123–32
  • Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning Huang, D., Xu, D., Zhu, Y., Garg, A., Savarese, S., Fei-Fei, L., Niebles, J., IEEE IEEE. 2019: 2635–42
  • Peeking into the Future: Predicting Future Person Activities and Locations in Videos Liang, J., Jiang, L., Niebles, J., Hauptmann, A., Li Fei-Fei, IEEE Comp Soc IEEE COMPUTER SOC. 2019: 5718–27
  • D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation Chang, C., Huang, D., Sui, Y., Li Fei-Fei, Niebles, J., IEEE Comp Soc IEEE COMPUTER SOC. 2019: 3541–50
  • Peeking into the Future: Predicting Future Person Activities and Locations in Videos Liang, J., Jiang, L., Niebles, J., Hauptmann, A., Li Fei-Fei, IEEE IEEE. 2019: 2960–63
  • Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration Huang, D., Nair, S., Xu, D., Zhu, Y., Garg, A., Li Fei-Fei, Savarese, S., Niebles, J., IEEE Comp Soc IEEE. 2019: 8557–66
  • Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining Zhang, Y., Niebles, J., Soto, A., IEEE IEEE. 2019: 349–57
  • Learning to Decompose and Disentangle Representations for Video Prediction Hsieh, J., Liu, B., Huang, D., Fei-Fei, L., Niebles, J., Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos Huang, D., Buch, S., Dery, L., Garg, A., Li Fei-Fei, Niebles, J., IEEE IEEE. 2018: 5948–57
  • What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets Huang, D., Ramanathan, V., Mahajan, D., Torresani, L., Paluri, M., Li Fei-Fei, Niebles, J., IEEE IEEE. 2018: 7366–75
  • Sparse composition of body poses and atomic actions for human activity recognition in RGB-D videos IMAGE AND VISION COMPUTING Lillo, I., Niebles, J., Soto, A. 2017; 59: 63–75
  • Risky Region Localization with Point Supervision Kozuka, K., Niebles, J., IEEE IEEE. 2017: 246–53
  • Dense-Captioning Events in Videos Krishna, R., Hata, K., Ren, F., Fei-Fei, L., Niebles, J., IEEE IEEE. 2017: 706–15
  • Visual Forecasting by Imitating Dynamics in Natural Sequences Zeng, K., Shen, W. B., Huang, D., Sun, M., Niebles, J., IEEE IEEE. 2017: 3018–27
  • Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos Huang, D., Lim, J. J., Fei-Fei, L., Niebles, J., IEEE IEEE. 2017: 1032–41
  • Agent-Centric Risk Assessment: Accident Anticipation and Risky Region Localization Zeng, K., Chou, S., Chan, F., Niebles, J., Sun, M., IEEE IEEE. 2017: 1330–38
  • A Hierarchical Pose-Based Approach to Complex Action Understanding Using Dictionaries of Actionlets and Motion Poselets Lillo, I., Niebles, J., Soto, A., IEEE IEEE. 2016: 1981–90
  • Title Generation for User Generated Videos Zeng, K., Chen, T., Niebles, J., Sun, M., Leibe, B., Matas, J., Sebe, N., Welling, M. SPRINGER INTERNATIONAL PUBLISHING AG. 2016: 609–25
  • DAPs: Deep Action Proposals for Action Understanding Escorcia, V., Heilbron, F., Niebles, J., Ghanem, B., Leibe, B., Matas, J., Sebe, N., Welling, M. SPRINGER INTERNATIONAL PUBLISHING AG. 2016: 768–84
  • Connectionist Temporal Modeling for Weakly Supervised Action Labeling Huang, D., Li Fei-Fei, Niebles, J., Leibe, B., Matas, J., Sebe, N., Welling, M. SPRINGER INTERNATIONAL PUBLISHING AG. 2016: 137–53
  • Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos Heilbron, F., Niebles, J., Ghanem, B., IEEE IEEE. 2016: 1914–23
  • Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification 11th European Conference on Computer Vision Niebles, J. C., Chen, C., Fei-Fei, L. SPRINGER-VERLAG BERLIN. 2010: 392–405