Nick Haber is an Assistant Professor at the Stanford Graduate School of Education, and by courtesy, Computer Science. After receiving his PhD in mathematics on Partial Differential Equation theory, he worked on Sension, a company that applied computer vision to online education. He then co-founded the Autism Glass Project at Stanford, a research effort that employs wearable technology and computer vision in a tool for children with autism. Aside from such work on learning and therapeutic tools, he and his research group develop artificial intelligence systems meant to mimic and model the ways people learn early in life, exploring their environments through play, social interaction, and curiosity.

Administrative Appointments

  • Postdoctoral Fellow, Stanford University (2014 - 2019)
  • Postdoctoral Fellow, McGill University (2014 - 2014)
  • Postdoctoral Fellow, Mathematical Sciences Research Institute (2013 - 2013)
  • Graduate Student, Stanford University (2009 - 2013)
  • Undergraduate Research, Brown University (2007 - 2008)
  • Undergraduate Research, Brown University (2006 - 2007)
  • NSF Mathematics REU, Lafayette College (2005 - 2005)

Honors & Awards

  • Walter V. and Idun Berry Postdoctoral Fellow, Stanford University (2015)
  • Magna Cum Laude, Brown University (2008)
  • Member, Phi Beta Kappa (2006)

Boards, Advisory Committees, Professional Organizations

  • Chief Scientific Officer, Sension, Inc (2013 - Present)

Program Affiliations

  • Symbolic Systems Program

Professional Education

  • Sc.B., Brown University, Mathematics & Economics (2008)
  • Ph.D., Stanford University, Mathematics (2013)


Patents

  • Nicholas Haber, Catalin Voss. "United States Patent Application 14/275851 Systems and methods for detection of behavior correlated with outside distractions in examinations"
  • Nicholas Haber, Catalin Voss. "United States Patent Application 61/821,921 System and Method for Analysis of Visual Viewer Reactions to Video Content"

Research Interests

  • Assessment, Testing and Measurement
  • Brain and Learning Sciences
  • Child Development
  • Collaborative Learning
  • Data Sciences
  • Early Childhood
  • Motivation
  • Psychology
  • Social and Emotional Learning
  • Special Education
  • Technology and Education

Current Research and Scholarly Interests

I use AI models of exploratory and social learning to better understand early human learning and development, and conversely, I use our understanding of early human learning to make robust AI models that learn in exploratory and social ways. Based on this, I develop AI-powered learning tools for children, geared in particular towards the education of those with developmental conditions such as Autism Spectrum Disorder and Attention Deficit Hyperactivity Disorder, in the mold of my work on the Autism Glass Project. My formal graduate training in pure mathematics involved extending partial differential equation theory in cases involving the propagation of waves through complex media such as the space around a black hole. Since then, I have transitioned to the use of machine learning in developing both learning tools for children with developmental disorders and AI and cognitive models of learning.

All Publications

  • Examining the potential and pitfalls of ChatGPT in science and engineering problem-solving FRONTIERS IN EDUCATION Wang, K. D., Burkholder, E., Wieman, C., Salehi, S., Haber, N. 2024; 8
  • Discovering Players' Problem-Solving Behavioral Characteristics in a Puzzle Game through Sequence Mining Wang, K. D., Liu, H., DeLiema, D., Haber, N., Salehi, S. ASSOC COMPUTING MACHINERY. 2024: 498-506
  • Binding the Person-Specific Approach to Modern AI in the Human Screenome Project: Moving past Generalizability to Transferability. Multivariate behavioral research Ram, N., Haber, N., Robinson, T. N., Reeves, B. 2023: 1-9


    Advances in ability to comprehensively record individuals' digital lives and in AI modeling of those data facilitate new possibilities for describing, predicting, and generating a wide variety of behavioral processes. In this paper, we consider these advances from a person-specific perspective, including whether the pervasive concerns about generalizability of results might be productively reframed with respect to transferability of models, and how self-supervision and new deep neural network architectures that facilitate transfer learning can be applied in a person-specific way to the super-intensive longitudinal data arriving in the Human Screenome Project. In developing the possibilities, we suggest Molenaar add a statement to the person-specific Manifesto - "In short, the concerns about generalizability commonly leveled at the person-specific paradigm are unfounded and can be fully and completely replaced with discussion and demonstrations of transferability."

    View details for DOI 10.1080/00273171.2023.2229305

    View details for PubMedID 37439508

  • Communication Skills Training Using Remote Augmented Reality Medical Simulation: a Feasibility and Acceptability Qualitative Study. Medical science educator Hess, O., Qian, J., Bruce, J., Wang, E., Rodriguez, S., Haber, N., Caruso, T. J. 2022: 1-10


    Introduction: Augmented reality (AR) has promise as a clinical teaching tool, particularly for remote learning. The Chariot Augmented Reality Medical (CHARM) simulator integrates real-time communication into a portable medical simulator with a holographic patient and monitor. The primary aim was to analyze feedback from medical and physician assistant students regarding acceptability and feasibility of the simulator. Methods: Using the CHARM simulator, we created an advanced cardiovascular life support (ACLS) simulation scenario. After IRB approval, preclinical medical and physician assistant students volunteered to participate from August to September 2020. We delivered augmented reality headsets (Magic Leap One) to students before the study. Prior to the simulation, via video conference, we introduced students to effective communication skills during a cardiac arrest. Participants then, individually and remotely from their homes, synchronously completed an instructor-led ACLS AR simulation in groups of three. After the simulation, students participated in a structured focus group using a qualitative interview guide. Our study team coded their responses and interpreted them using team-based thematic analysis. Results: Eighteen medical and physician assistant students participated. We identified four domains that reflected trainee experiences: experiential satisfaction, learning engagement, technology learning curve, and opportunities for improvement. Students reported that the simulator was acceptable and enjoyable for teaching trainees communication skills; however, there were some technical difficulties associated with initial use. Conclusion: This study suggests that multiplayer AR is a promising and feasible approach for remote medical education of communication skills during medical crises. Supplementary Information: The online version contains supplementary material available at 10.1007/s40670-022-01598-7.

    View details for DOI 10.1007/s40670-022-01598-7

    View details for PubMedID 35966166

  • Improved Digital Therapy for Developmental Pediatrics Using Domain-Specific Artificial Intelligence: Machine Learning Study. JMIR pediatrics and parenting Washington, P., Kalantarian, H., Kent, J., Husic, A., Kline, A., Leblanc, E., Hou, C., Mutlu, O. C., Dunlap, K., Penev, Y., Varma, M., Stockham, N. T., Chrisman, B., Paskov, K., Sun, M. W., Jung, J. Y., Voss, C., Haber, N., Wall, D. P. 2022; 5 (2): e26760


    Automated emotion classification could aid those who struggle to recognize emotions, including children with developmental behavioral conditions such as autism. However, most computer vision emotion recognition models are trained on adult emotion and therefore underperform when applied to child faces. We designed a strategy to gamify the collection and labeling of child emotion-enriched images to boost the performance of automatic child emotion recognition models to a level closer to what will be needed for digital health care approaches. We leveraged our prototype therapeutic smartphone game, GuessWhat, which was designed in large part for children with developmental and behavioral conditions, to gamify the secure collection of video data of children expressing a variety of emotions prompted by the game. Independently, we created a secure web interface to gamify the human labeling effort, called HollywoodSquares, tailored for use by any qualified labeler. We gathered and labeled 2155 videos, 39,968 emotion frames, and 106,001 labels on all images. With this drastically expanded pediatric emotion-centric database (>30 times larger than existing public pediatric emotion data sets), we trained a convolutional neural network (CNN) computer vision classifier of happy, sad, surprised, fearful, angry, disgust, and neutral expressions evoked by children. The classifier achieved a 66.9% balanced accuracy and 67.4% F1-score on the entirety of the Child Affective Facial Expression (CAFE) as well as a 79.1% balanced accuracy and 78% F1-score on CAFE Subset A, a subset containing at least 60% human agreement on emotions labels. This performance is at least 10% higher than all previously developed classifiers evaluated against CAFE, the best of which reached a 56% balanced accuracy even when combining "anger" and "disgust" into a single class. This work validates that mobile games designed for pediatric therapies can generate high volumes of domain-relevant data sets to train state-of-the-art classifiers to perform tasks helpful to precision health efforts.

    View details for DOI 10.2196/26760

    View details for PubMedID 35394438

  • Training Affective Computer Vision Models by Crowdsourcing Soft-Target Labels. Cognitive computation Washington, P., Kalantarian, H., Kent, J., Husic, A., Kline, A., Leblanc, E., Hou, C., Mutlu, C., Dunlap, K., Penev, Y., Stockham, N., Chrisman, B., Paskov, K., Jung, J. Y., Voss, C., Haber, N., Wall, D. P. 2021; 13 (5): 1363-1373


    Emotion detection classifiers traditionally predict discrete emotions. However, emotion expressions are often subjective, thus requiring a method to handle compound and ambiguous labels. We explore the feasibility of using crowdsourcing to acquire reliable soft-target labels and evaluate an emotion detection classifier trained with these labels. We hypothesize that training with labels that are representative of the diversity of human interpretation of an image will result in predictions that are similarly representative on a disjoint test set. We also hypothesize that crowdsourcing can generate distributions which mirror those generated in a lab setting. We center our study on the Child Affective Facial Expression (CAFE) dataset, a gold standard collection of images depicting pediatric facial expressions along with 100 human labels per image. To test the feasibility of crowdsourcing to generate these labels, we used Microworkers to acquire labels for 207 CAFE images. We evaluate both unfiltered workers as well as workers selected through a short crowd filtration process. We then train two versions of a ResNet-152 neural network on soft-target CAFE labels using the original 100 annotations provided with the dataset: (1) a classifier trained with traditional one-hot encoded labels, and (2) a classifier trained with vector labels representing the distribution of CAFE annotator responses. We compare the resulting softmax output distributions of the two classifiers with a 2-sample independent t-test of L1 distances between the classifier's output probability distribution and the distribution of human labels. While agreement with CAFE is weak for unfiltered crowd workers, the filtered crowd agree with the CAFE labels 100% of the time for happy, neutral, sad and "fear + surprise", and 88.8% for "anger + disgust". While the F1-score for a one-hot encoded classifier is much higher (94.33% vs. 78.68%) with respect to the ground truth CAFE labels, the output probability vector of the crowd-trained classifier more closely resembles the distribution of human labels (t=3.2827, p=0.0014). For many applications of affective computing, reporting an emotion probability distribution that accounts for the subjectivity of human interpretation can be more useful than an absolute label. Crowdsourcing, including a sufficient filtering mechanism for selecting reliable crowd workers, is a feasible solution for acquiring soft-target labels.

    View details for DOI 10.1007/s12559-021-09936-4

    View details for PubMedID 35669554

    View details for PubMedCentralID PMC9165031

  • Crowdsourced privacy-preserved feature tagging of short home videos for machine learning ASD detection. Scientific reports Washington, P., Tariq, Q., Leblanc, E., Chrisman, B., Dunlap, K., Kline, A., Kalantarian, H., Penev, Y., Paskov, K., Voss, C., Stockham, N., Varma, M., Husic, A., Kent, J., Haber, N., Winograd, T., Wall, D. P. 2021; 11 (1): 7620


    Standard medical diagnosis of mental health conditions requires licensed experts who are increasingly outnumbered by those at risk, limiting reach. We test the hypothesis that a trustworthy crowd of non-experts can efficiently annotate behavioral features needed for accurate machine learning detection of the common childhood developmental disorder Autism Spectrum Disorder (ASD) for children under 8 years old. We implement a novel process for identifying and certifying a trustworthy distributed workforce for video feature extraction, selecting a workforce of 102 workers from a pool of 1,107. Two previously validated ASD logistic regression classifiers, evaluated against parent-reported diagnoses, were used to assess the accuracy of the trusted crowd's ratings of unstructured home videos. A representative balanced sample of videos (N=50) was evaluated with and without face box and pitch shift privacy alterations, with AUROC and AUPRC scores >0.98. With both privacy-preserving modifications, sensitivity is preserved (96.0%) while maintaining specificity (80.0%) and accuracy (88.0%) at levels comparable to prior classification methods without alterations. We find that machine learning classification from features extracted by a certified nonexpert crowd achieves high performance for ASD detection from natural home videos of the child at risk and maintains high sensitivity when privacy-preserving mechanisms are applied. These results suggest that privacy-safeguarded crowdsourced analysis of short home videos can help enable rapid and mobile machine-learning detection of developmental delays in children.

    View details for DOI 10.1038/s41598-021-87059-4

    View details for PubMedID 33828118

  • Integrated eye tracking on Magic Leap One during augmented reality medical simulation: a technical report. BMJ simulation & technology enhanced learning Caruso, T. J., Hess, O., Roy, K., Wang, E., Rodriguez, S., Palivathukal, C., Haber, N. 2021; 7 (5): 431-434


    Augmented reality (AR) has been studied as a clinical teaching tool; however, there is limited research on eye-tracking capabilities integrated within an AR medical simulator. The recently developed Chariot Augmented Reality Medical (CHARM) simulator integrates real-time communication into a portable medical simulator. The purpose of this project was to refine the gaze-tracking capabilities of the CHARM simulator on the Magic Leap One (ML1). Adults aged 18 years and older were recruited using convenience sampling. Participants were provided with an ML1 headset that projected a hologram of a patient, bed and monitor. They were instructed via audio recording to gaze at variables in this scenario. The participant gaze targets from the ML1 output were compared with the specified gaze points from the audio recording. A priori, the investigators planned iterative modifications of the eye-tracking software until a capture rate of 80% was achieved. Two consecutive participants with a capture rate less than 80% triggered software modifications and the project concluded after three consecutive participants' capture rates were greater than 80%. Thirteen participants were included in the study. Eye-tracking concordance was less than 80% reliable in the first 10 participants. The investigators hypothesised that the eye movement detection threshold was too sensitive, thus the algorithm was adjusted to reduce noise. The project concluded after the final three participants' gaze capture rates were 80%, 80% and 80.1%, respectively. This report suggests that eye-tracking technology can be reliably used with the ML1 enabled with CHARM simulator software.

    View details for DOI 10.1136/bmjstel-2020-000782

    View details for PubMedID 35515734

    View details for PubMedCentralID PMC8936533

  • Selection of trustworthy crowd workers for telemedical diagnosis of pediatric autism spectrum disorder. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Washington, P., Leblanc, E., Dunlap, K., Penev, Y., Varma, M., Jung, J., Chrisman, B., Sun, M. W., Stockham, N., Paskov, K. M., Kalantarian, H., Voss, C., Haber, N., Wall, D. P. 2021; 26: 14–25


    Crowd-powered telemedicine has the potential to revolutionize healthcare, especially during times that require remote access to care. However, sharing private health data with strangers from around the world is not compatible with data privacy standards, requiring a stringent filtration process to recruit reliable and trustworthy workers who can go through the proper training and security steps. The key challenge, then, is to identify capable, trustworthy, and reliable workers through high-fidelity evaluation tasks without exposing any sensitive patient data during the evaluation process. We contribute a set of experimentally validated metrics for assessing the trustworthiness and reliability of crowd workers tasked with providing behavioral feature tags to unstructured videos of children with autism and matched neurotypical controls. The workers are blinded to diagnosis and blinded to the goal of using the features to diagnose autism. These behavioral labels are fed as input to a previously validated binary logistic regression classifier for detecting autism cases using categorical feature vectors. While the metrics do not incorporate any ground truth labels of child diagnosis, linear regression using the 3 correlative metrics as input can predict the mean probability of the correct class of each worker with a mean average error of 7.51% for performance on the same set of videos and 10.93% for performance on a distinct balanced video set with different children. These results indicate that crowd workers can be recruited for performance based largely on behavioral metrics on a crowdsourced task, enabling an affordable way to filter crowd workforces into a trustworthy and reliable diagnostic workforce.

    View details for PubMedID 33691000

  • Selection of trustworthy crowd workers for telemedical diagnosis of pediatric autism spectrum disorder Washington, P., Leblanc, E., Dunlap, K., Penev, Y., Varma, M., Jung, J., Chrisman, B., Sun, M., Stockham, N., Paskov, K., Kalantarian, H., Voss, C., Haber, N., Wall, D. P., Altman, R. B., Dunker, A. K., Hunter, L., Ritchie, M. D., Murray, T., Klein, T. E. WORLD SCIENTIFIC PUBL CO PTE LTD. 2021: 14-25
  • Precision Telemedicine through Crowdsourced Machine Learning: Testing Variability of Crowd Workers for Video-Based Autism Feature Recognition. Journal of personalized medicine Washington, P., Leblanc, E., Dunlap, K., Penev, Y., Kline, A., Paskov, K., Sun, M. W., Chrisman, B., Stockham, N., Varma, M., Voss, C., Haber, N., Wall, D. P. 2020; 10 (3)


    Mobilized telemedicine is becoming a key, and even necessary, facet of both precision health and precision medicine. In this study, we evaluate the capability and potential of a crowd of virtual workers, defined as vetted members of popular crowdsourcing platforms, to aid in the task of diagnosing autism. We evaluate workers when crowdsourcing the task of providing categorical ordinal behavioral ratings to unstructured public YouTube videos of children with autism and neurotypical controls. To evaluate emerging patterns that are consistent across independent crowds, we target workers from distinct geographic loci on two crowdsourcing platforms: an international group of workers on Amazon Mechanical Turk (MTurk) (N = 15) and Microworkers from Bangladesh (N = 56), Kenya (N = 23), and the Philippines (N = 25). We feed worker responses as input to a validated diagnostic machine learning classifier trained on clinician-filled electronic health records. We find that regardless of crowd platform or targeted country, workers vary in the average confidence of the correct diagnosis predicted by the classifier. The best worker responses produce a mean probability of the correct class above 80% and over one standard deviation above 50%, accuracy and variability on par with experts according to prior studies. There is a weak correlation between mean time spent on task and mean performance (r = 0.358, p = 0.005). These results demonstrate that while the crowd can produce accurate diagnoses, there are intrinsic differences in crowdworker ability to rate behavioral features. We propose a novel strategy for recruitment of crowdsourced workers to ensure high quality diagnostic evaluations of autism, and potentially many other pediatric behavioral health conditions. Our approach represents a viable step in the direction of crowd-based approaches for more scalable and affordable precision medicine.

    View details for DOI 10.3390/jpm10030086

    View details for PubMedID 32823538

  • Toward Continuous Social Phenotyping: Analyzing Gaze Patterns in an Emotion Recognition Task for Children With Autism Through Wearable Smart Glasses. Journal of medical Internet research Nag, A., Haber, N., Voss, C., Tamura, S., Daniels, J., Ma, J., Chiang, B., Ramachandran, S., Schwartz, J., Winograd, T., Feinstein, C., Wall, D. P. 2020; 22 (4): e13810


    BACKGROUND: Several studies have shown that facial attention differs in children with autism. Measuring eye gaze and emotion recognition in children with autism is challenging, as standard clinical assessments must be delivered in clinical settings by a trained clinician. Wearable technologies may be able to bring eye gaze and emotion recognition into natural social interactions and settings. OBJECTIVE: This study aimed to test: (1) the feasibility of tracking gaze using wearable smart glasses during a facial expression recognition task and (2) the ability of these gaze-tracking data, together with facial expression recognition responses, to distinguish children with autism from neurotypical controls (NCs). METHODS: We compared the eye gaze and emotion recognition patterns of 16 children with autism spectrum disorder (ASD) and 17 children without ASD via wearable smart glasses fitted with a custom eye tracker. Children identified static facial expressions of images presented on a computer screen along with nonsocial distractors while wearing Google Glass and the eye tracker. Faces were presented in three trials, during one of which children received feedback in the form of the correct classification. We employed hybrid human-labeling and computer vision-enabled methods for pupil tracking and world-gaze translation calibration. We analyzed the impact of gaze and emotion recognition features in a prediction task aiming to distinguish children with ASD from NC participants. RESULTS: Gaze and emotion recognition patterns enabled the training of a classifier that distinguished ASD and NC groups. However, it was unable to significantly outperform other classifiers that used only age and gender features, suggesting that further work is necessary to disentangle these effects. CONCLUSIONS: Although wearable smart glasses show promise in identifying subtle differences in gaze tracking and emotion recognition patterns in children with and without ASD, the present form factor and data do not allow for these differences to be reliably exploited by machine learning systems. Resolving these challenges will be an important step toward continuous tracking of the ASD phenotype.

    View details for DOI 10.2196/13810

    View details for PubMedID 32319961

  • Feature Selection and Dimension Reduction of Social Autism Data. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Washington, P., Paskov, K. M., Kalantarian, H., Stockham, N., Voss, C., Kline, A., Patnaik, R., Chrisman, B., Varma, M., Tariq, Q., Dunlap, K., Schwartz, J., Haber, N., Wall, D. P. 2020; 25: 707–18


    Autism Spectrum Disorder (ASD) is a complex neuropsychiatric condition with a highly heterogeneous phenotype. Following the work of Duda et al., which uses a reduced feature set from the Social Responsiveness Scale, Second Edition (SRS) to distinguish ASD from ADHD, we performed item-level question selection on answers to the SRS to determine whether ASD can be distinguished from non-ASD using a similarly small subset of questions. To explore feature redundancies between the SRS questions, we performed filter, wrapper, and embedded feature selection analyses. To explore the linearity of the SRS-related ASD phenotype, we then compressed the 65-question SRS into low-dimension representations using PCA, t-SNE, and a denoising autoencoder. We measured the performance of a multilayer perceptron (MLP) classifier with the top-ranking questions as input. Classification using only the top-rated question resulted in an AUC of over 92% for SRS-derived diagnoses and an AUC of over 83% for dataset-specific diagnoses. High redundancy of features has implications towards replacing the social behaviors that are targeted in behavioral diagnostics and interventions, where digital quantification of certain features may be obfuscated due to privacy concerns. We similarly evaluated the performance of an MLP classifier trained on the low-dimension representations of the SRS, finding that the denoising autoencoder achieved slightly higher performance than the PCA and t-SNE representations.

    View details for PubMedID 31797640

  • Feature Selection and Dimension Reduction of Social Autism Data Washington, P., Paskov, K., Kalantarian, H., Stockham, N., Voss, C., Kline, A., Patnaik, R., Chrisman, B., Varma, M., Tariq, Q., Dunlap, K., Schwartz, J., Haber, N., Wall, D. P., Altman, R. B., Dunker, A. K., Hunter, L., Ritchie, M. D., Murray, T., Klein, T. E. WORLD SCIENTIFIC PUBL CO PTE LTD. 2020: 707-718
  • Data-Driven Diagnostics and the Potential of Mobile Artificial Intelligence for Digital Therapeutic Phenotyping in Computational Psychiatry. Biological psychiatry. Cognitive neuroscience and neuroimaging Washington, P., Park, N., Srivastava, P., Voss, C., Kline, A., Varma, M., Tariq, Q., Kalantarian, H., Schwartz, J., Patnaik, R., Chrisman, B., Stockham, N., Paskov, K., Haber, N., Wall, D. P. 2019


    Data science and digital technologies have the potential to transform diagnostic classification. Digital technologies enable the collection of big data, and advances in machine learning and artificial intelligence enable scalable, rapid, and automated classification of medical conditions. In this review, we summarize and categorize various data-driven methods for diagnostic classification. In particular, we focus on autism as an example of a challenging disorder due to its highly heterogeneous nature. We begin by describing the frontier of data science methods for the neuropsychiatry of autism. We discuss early signs of autism as defined by existing pen-and-paper-based diagnostic instruments and describe data-driven feature selection techniques for determining the behaviors that are most salient for distinguishing children with autism from neurologically typical children. We then describe data-driven detection techniques, particularly computer vision and eye tracking, that provide a means of quantifying behavioral differences between cases and controls. We also describe methods of preserving the privacy of collected videos and prior efforts of incorporating humans in the diagnostic loop. Finally, we summarize existing digital therapeutic interventions that allow for data capture and longitudinal outcome tracking as the diagnosis moves along a positive trajectory. Digital phenotyping of autism is paving the way for quantitative psychiatry more broadly and will set the stage for more scalable, accessible, and precise diagnostic techniques in the field.

    View details for DOI 10.1016/j.bpsc.2019.11.015

    View details for PubMedID 32085921

  • SUPERPOWER GLASS MOBILE COMPUTING AND COMMUNICATIONS REVIEW Kline, A., Voss, C., Washington, P., Haber, N., Schwartz, J., Tariq, Q., Winograd, T., Feinstein, C., Wall, D. P. 2019; 23 (2): 35–38
  • Validity of Online Screening for Autism: Crowdsourcing Study Comparing Paid and Unpaid Diagnostic Tasks. Journal of medical Internet research Washington, P., Kalantarian, H., Tariq, Q., Schwartz, J., Dunlap, K., Chrisman, B., Varma, M., Ning, M., Kline, A., Stockham, N., Paskov, K., Voss, C., Haber, N., Wall, D. P. 2019; 21 (5): e13668


    BACKGROUND: Obtaining a diagnosis of neuropsychiatric disorders such as autism requires long waiting times that can exceed a year and can be prohibitively expensive. Crowdsourcing approaches may provide a scalable alternative that can accelerate general access to care and permit underserved populations to obtain an accurate diagnosis. OBJECTIVE: We aimed to perform a series of studies to explore whether paid crowd workers on Amazon Mechanical Turk (AMT) and citizen crowd workers on a public website shared on social media can provide accurate online detection of autism, conducted via crowdsourced ratings of short home video clips. METHODS: Three online studies were performed: (1) a paid crowdsourcing task on AMT (N=54) where crowd workers were asked to classify 10 short video clips of children as "Autism" or "Not autism," (2) a more complex paid crowdsourcing task (N=27) with only those raters who correctly rated ≥8 of the 10 videos during the first study, and (3) a public unpaid study (N=115) identical to the first study. RESULTS: For Study 1, the mean score of the participants who completed all questions was 7.50/10 (SD 1.46). When only analyzing the workers who scored ≥8/10 (n=27/54), there was a weak negative correlation between the time spent rating the videos and the sensitivity (rho=-0.44, P=.02). For Study 2, the mean score of the participants rating new videos was 6.76/10 (SD 0.59). The average deviation between the crowdsourced answers and gold standard ratings provided by two expert clinical research coordinators was 0.56, with an SD of 0.51 (maximum possible SD is 3). All paid crowd workers who scored 8/10 in Study 1 either expressed enjoyment in performing the task in Study 2 or provided no negative comments. For Study 3, the mean score of the participants who completed all questions was 6.67/10 (SD 1.61). There were weak correlations between age and score (r=0.22, P=.014), age and sensitivity (r=-0.19, P=.04), number of family members with autism and sensitivity (r=-0.195, P=.04), and number of family members with autism and precision (r=-0.203, P=.03). A two-tailed t test between the scores of the paid workers in Study 1 and the unpaid workers in Study 3 showed a significant difference (P<.001). CONCLUSIONS: Many paid crowd workers on AMT enjoyed answering screening questions from videos, suggesting higher intrinsic motivation to make quality assessments. Paid crowdsourcing provides promising screening assessments of pediatric autism with an average deviation <20% from professional gold standard raters, which is potentially a clinically informative estimate for parents. Parents of children with autism likely overfit their intuition to their own affected child. This work provides preliminary demographic data on raters who may have higher ability to recognize and measure features of autism across its wide range of phenotypic manifestations.

    View details for DOI 10.2196/13668

    View details for PubMedID 31124463

  • Effect of Wearable Digital Intervention for Improving Socialization in Children With Autism Spectrum Disorder A Randomized Clinical Trial JAMA PEDIATRICS Voss, C., Schwartz, J., Daniels, J., Kline, A., Haber, N., Washington, P., Tariq, Q., Robinson, T. N., Desai, M., Phillips, J. M., Feinstein, C., Winograd, T., Wall, D. P. 2019; 173 (5): 446–54


    Importance: Autism behavioral therapy is effective but expensive and difficult to access. While mobile technology-based therapy can alleviate wait-lists and scale for increasing demand, few clinical trials exist to support its use for autism spectrum disorder (ASD) care. Objective: To evaluate the efficacy of Superpower Glass, an artificial intelligence-driven wearable behavioral intervention for improving social outcomes of children with ASD. Design, Setting, and Participants: A randomized clinical trial in which participants received the Superpower Glass intervention plus standard of care applied behavioral analysis therapy and control participants received only applied behavioral analysis therapy. Assessments were completed at the Stanford University Medical School, and enrolled participants used the Superpower Glass intervention in their homes. Children aged 6 to 12 years with a formal ASD diagnosis who were currently receiving applied behavioral analysis therapy were included. Families were recruited between June 2016 and December 2017. The first participant was enrolled on November 1, 2016, and the last appointment was completed on April 11, 2018. Data analysis was conducted between April and October 2018. Interventions: The Superpower Glass intervention, deployed via Google Glass (worn by the child) and a smartphone app, promotes facial engagement and emotion recognition by detecting facial expressions and providing reinforcing social cues. Families were asked to conduct 20-minute sessions at home 4 times per week for 6 weeks. Main Outcomes and Measures: Four socialization measures were assessed using an intention-to-treat analysis with a Bonferroni test correction. Results: Overall, 71 children (63 boys [89%]; mean [SD] age, 8.38 [2.46] years) diagnosed with ASD were enrolled (40 [56.3%] were randomized to treatment, and 31 [43.7%] were randomized to control). Children receiving the intervention showed significant improvements on the Vineland Adaptive Behaviors Scale socialization subscale compared with treatment as usual controls (mean [SD] treatment impact, 4.58 [1.62]; P=.005). Positive mean treatment effects were also found for the other 3 primary measures but not to a significance threshold of P=.0125. Conclusions and Relevance: The observed 4.58-point average gain on the Vineland Adaptive Behaviors Scale socialization subscale is comparable with gains observed with standard of care therapy. To our knowledge, this is the first randomized clinical trial to demonstrate efficacy of a wearable digital intervention to improve social behavior of children with ASD. The intervention reinforces facial engagement and emotion recognition, suggesting either or both could be a mechanism of action driving the observed improvement. This study underscores the potential of digital home therapy to augment the standard of care. Trial Registration: ClinicalTrials.gov identifier: NCT03569176.

    View details for PubMedID 30907929
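    The P=.0125 threshold quoted in the abstract above is what a Bonferroni correction gives for four measures if the family-wise significance level is 0.05. That 0.05 level is an assumption here, since the abstract does not state it. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: family-wise alpha of 0.05 (not stated in the abstract).
# The abstract reports four socialization measures.
threshold = bonferroni_threshold(0.05, 4)

# P value reported in the abstract for the Vineland socialization subscale.
vineland_p = 0.005

print(f"per-test threshold: {threshold:.4f}")
print(f"Vineland result below threshold: {vineland_p < threshold}")
```

    Under that assumption the reported Vineland result (P=.005) falls below the per-test threshold, which matches the abstract's description of it as significant.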

  • Addendum to the Acknowledgements: Validity of Online Screening for Autism: Crowdsourcing Study Comparing Paid and Unpaid Diagnostic Tasks. Journal of medical Internet research Washington, P., Kalantarian, H., Tariq, Q., Schwartz, J., Dunlap, K., Chrisman, B., Varma, M., Ning, M., Kline, A., Stockham, N., Paskov, K., Voss, C., Haber, N., Wall, D. P. 2019; 21 (6): e14950


    [This corrects the article DOI: 10.2196/13668.]

    View details for DOI 10.2196/14950

    View details for PubMedID 31250828

  • The Potential for Machine Learning-Based Wearables to Improve Socialization in Teenagers and Adults With Autism Spectrum Disorder-Reply. JAMA pediatrics Voss, C., Haber, N., Wall, D. P. 2019

    View details for DOI 10.1001/jamapediatrics.2019.2969

    View details for PubMedID 31498377

  • Guess What?: Towards Understanding Autism from Structured Video Using Facial Affect. Journal of healthcare informatics research Kalantarian, H., Washington, P., Schwartz, J., Daniels, J., Haber, N., Wall, D. P. 2019; 3: 43–66


    Autism Spectrum Disorder (ASD) is a condition affecting an estimated 1 in 59 children in the United States. Due to delays in diagnosis and imbalances in coverage, it is necessary to develop new methods of care delivery that can appropriately empower children and caregivers by capitalizing on mobile tools and wearable devices for use outside of clinical settings. In this paper, we present a mobile charades-style game, Guess What?, used for the acquisition of structured video from children with ASD for behavioral disease research. We then apply face tracking and emotion recognition algorithms to videos acquired through Guess What? game play. By analyzing facial affect in response to various prompts, we demonstrate that engagement and facial affect can be quantified and measured using real-time image processing algorithms: an important first step for future therapies, at-home screenings, and outcome measures based on home video. Our study of eight subjects demonstrates the efficacy of this system for deriving highly emotive structured video from children with ASD through an engaging gamified mobile platform, while revealing the most efficacious prompts and categories for producing diverse emotion in participants.

    View details for DOI 10.1007/s41666-018-0034-9

    View details for PubMedID 33313475

  • Exploratory study examining the at-home feasibility of a wearable tool for social-affective learning in children with autism. NPJ digital medicine Daniels, J., Schwartz, J. N., Voss, C., Haber, N., Fazel, A., Kline, A., Washington, P., Feinstein, C., Winograd, T., Wall, D. P. 2018; 1: 32


    Although standard behavioral interventions for autism spectrum disorder (ASD) are effective therapies for social deficits, they face criticism for being time-intensive and overdependent on specialists. Earlier starting age of therapy is a strong predictor of later success, but waitlists for therapies can be 18 months long. To address these complications, we developed Superpower Glass, a machine-learning-assisted software system that runs on Google Glass and an Android smartphone, designed for use during social interactions. This pilot exploratory study examines our prototype tool's potential for social-affective learning for children with autism. We sent our tool home with 14 families and assessed changes from intake to conclusion through the Social Responsiveness Scale (SRS-2), a facial affect recognition task (EGG), and qualitative parent reports. A repeated-measures one-way ANOVA demonstrated a decrease in SRS-2 total scores by an average 7.14 points (F(1,13) = 33.20, p < .001; higher scores indicate higher ASD severity). EGG scores also increased by an average 9.55 correct responses (F(1,10) = 11.89, p < .01). Parents reported increased eye contact and greater social acuity. This feasibility study supports using mobile technologies for potential therapeutic purposes.

    View details for DOI 10.1038/s41746-018-0035-3

    View details for PubMedID 31304314

    View details for PubMedCentralID PMC6550272

  • Feasibility Testing of a Wearable Behavioral Aid for Social Learning in Children with Autism APPLIED CLINICAL INFORMATICS Daniels, J., Haber, N., Voss, C., Schwartz, J., Tamura, S., Fazel, A., Kline, A., Washington, P., Phillips, J., Winograd, T., Feinstein, C., Wall, D. P. 2018; 9 (1): 129–40


    Recent advances in computer vision and wearable technology have created an opportunity to introduce mobile therapy systems for autism spectrum disorders (ASD) that can respond to the increasing demand for therapeutic interventions; however, feasibility questions must be answered first. We studied the feasibility of a prototype therapeutic tool for children with ASD using Google Glass, examining whether children with ASD would wear such a device, if providing the emotion classification will improve emotion recognition, and how emotion recognition differs between ASD participants and neurotypical controls (NC). We ran a controlled laboratory experiment with 43 children: 23 with ASD and 20 NC. Children identified static facial images on a computer screen with one of 7 emotions in 3 successive batches: the first with no information about emotion provided to the child, the second with the correct classification from the Glass labeling the emotion, and the third again without emotion information. We then trained a logistic regression classifier on the emotion confusion matrices generated by the two information-free batches to predict ASD versus NC. All 43 children were comfortable wearing the Glass. ASD and NC participants who completed the computer task with Glass providing audible emotion labeling (n = 33) showed increased accuracies in emotion labeling, and the logistic regression classifier achieved an accuracy of 72.7%. Further analysis suggests that the ability to recognize surprise, fear, and neutrality may distinguish ASD cases from NC. This feasibility study supports the utility of a wearable device for social affective learning in ASD children and demonstrates subtle differences in how ASD and NC children perform on an emotion recognition task.

    View details for DOI 10.1055/s-0038-1626727

    View details for Web of Science ID 000428690000006

    View details for PubMedID 29466819

    View details for PubMedCentralID PMC5821509

  • Learning to Play With Intrinsically-Motivated, Self-Aware Agents Haber, N., Mrowca, D., Wang, S., Li Fei-Fei, Yamins, D. K., Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • Flexible Neural Representation for Physics Prediction Mrowca, D., Zhuang, C., Wang, E., Haber, N., Li Fei-Fei, Tenenbaum, J. B., Yamins, D. K., Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • Sparsifying machine learning models identify stable subsets of predictive features for behavioral detection of autism MOLECULAR AUTISM Levy, S., Duda, M., Haber, N., Wall, D. P. 2017; 8: 65


    Autism spectrum disorder (ASD) diagnosis can be delayed due in part to the time required for administration of standard exams, such as the Autism Diagnostic Observation Schedule (ADOS). Shorter and potentially mobilized approaches would help to alleviate bottlenecks in the healthcare system. Previous work using machine learning suggested that a subset of the behaviors measured by ADOS can achieve clinically acceptable levels of accuracy. Here we expand on this initial work to build sparse models that have higher potential to generalize to the clinical population. We assembled a collection of score sheets for two ADOS modules, one for children with phrased speech (Module 2; 1319 ASD cases, 70 controls) and the other for children with verbal fluency (Module 3; 2870 ASD cases, 273 controls). We used sparsity/parsimony enforcing regularization techniques in a nested cross validation grid search to select features for 17 unique supervised learning models, encoding missing values as additional indicator features. We augmented our feature sets with gender and age to train minimal and interpretable classifiers capable of robust detection of ASD from non-ASD. By applying 17 unique supervised learning methods across 5 classification families tuned for sparse use of features and to be within 1 standard error of the optimal model, we find reduced sets of 10 and 5 features used in a majority of models. We tested the performance of the most interpretable of these sparse models, including Logistic Regression with L2 regularization or Linear SVM with L1 regularization. We obtained an area under the ROC curve of 0.95 for ADOS Module 3 and 0.93 for ADOS Module 2 with less than or equal to 10 features. The resulting models provide improved stability over previous machine learning efforts to minimize the time complexity of autism detection due to regularization and a small parameter space. These robustness techniques yield classifiers that are sparse, interpretable and that have potential to generalize to alternative modes of autism screening, diagnosis and monitoring, possibly including analysis of short home videos.

    View details for PubMedID 29270283

  • Crowdsourced validation of a machine-learning classification system for autism and ADHD. Translational psychiatry Duda, M., Haber, N., Daniels, J., Wall, D. P. 2017; 7 (5)


    Autism spectrum disorder (ASD) and attention deficit hyperactivity disorder (ADHD) together affect >10% of the children in the United States, but considerable behavioral overlaps between the two disorders can often complicate differential diagnosis. Currently, there is no screening test designed to differentiate between the two disorders, and with waiting times from initial suspicion to diagnosis upwards of a year, methods to quickly and accurately assess risk for these and other developmental disorders are desperately needed. In a previous study, we found that four machine-learning algorithms were able to accurately (area under the curve (AUC)>0.96) distinguish ASD from ADHD using only a small subset of items from the Social Responsiveness Scale (SRS). Here, we expand upon our prior work by including a novel crowdsourced data set of responses to our predefined top 15 SRS-derived questions from parents of children with ASD (n=248) or ADHD (n=174) to improve our model's capability to generalize to new, 'real-world' data. By mixing these novel survey data with our initial archival sample (n=3417) and performing repeated cross-validation with subsampling, we created a classification algorithm that performs with AUC=0.89±0.01 using only 15 questions.

    View details for DOI 10.1038/tp.2017.86

    View details for PubMedID 28509905

  • The Feynman Propagator on Perturbations of Minkowski Space COMMUNICATIONS IN MATHEMATICAL PHYSICS Gell-Redman, J., Haber, N., Vasy, A. 2016; 342 (1): 333-384
  • Use of machine learning for behavioral distinction of autism and ADHD. Translational psychiatry Duda, M., Ma, R., Haber, N., Wall, D. P. 2016; 6


    Although autism spectrum disorder (ASD) and attention deficit hyperactivity disorder (ADHD) continue to rise in prevalence, together affecting >10% of today's pediatric population, the methods of diagnosis remain subjective, cumbersome and time intensive. With gaps upward of a year between initial suspicion and diagnosis, valuable time where treatments and behavioral interventions could be applied is lost as these disorders remain undetected. Methods to quickly and accurately assess risk for these, and other, developmental disorders are necessary to streamline the process of diagnosis and provide families access to much-needed therapies sooner. Using forward feature selection, as well as undersampling and 10-fold cross-validation, we trained and tested six machine learning models on complete 65-item Social Responsiveness Scale score sheets from 2925 individuals with either ASD (n=2775) or ADHD (n=150). We found that five of the 65 behaviors measured by this screening tool were sufficient to distinguish ASD from ADHD with high accuracy (area under the curve=0.965). These results support the hypotheses that (1) machine learning can be used to discern between autism and ADHD with high accuracy and (2) this distinction can be made using a small number of commonly measured behaviors. Our findings show promise for use as an electronically administered, caregiver-directed resource for preliminary risk evaluation and/or pre-clinical screening and triage that could help to speed the diagnosis of these disorders.

    View details for DOI 10.1038/tp.2015.221

    View details for PubMedID 26859815

    View details for PubMedCentralID PMC4872425
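    The abstract above mentions undersampling to handle the imbalance between ASD (n=2775) and ADHD (n=150) score sheets. A minimal sketch of random undersampling follows. It assumes a balanced 1:1 target ratio, which the abstract does not specify, and the function name `undersample` is hypothetical; it is not the authors' implementation.

```python
import random
from collections import Counter


def undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset.

    Every class is randomly reduced to the size of the smallest class.
    The 1:1 ratio is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    keep = []
    for idx in by_class.values():
        # Sample without replacement from each class.
        keep.extend(rng.sample(idx, n_min))
    return sorted(keep)


# Class counts taken from the abstract; the records themselves are placeholders.
labels = ["ASD"] * 2775 + ["ADHD"] * 150
kept = undersample(labels)
counts = Counter(labels[i] for i in kept)
print(counts)
```

    In a cross-validated setting, a step like this would typically be applied to the training portion of each fold only, so that held-out data keep their original class proportions.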

  • A Practical Approach to Real-Time Neutral Feature Subtraction for Facial Expression Recognition Haber, N., Voss, C., Fazel, A., Winograd, T., Wall, D. P. IEEE. 2016
  • Propagation of singularities around a Lagrangian submanifold of radial points Bulletin de la SMF Haber, N., Vasy, A. 2015
  • A Normal Form Around a Lagrangian Submanifold of Radial Points INTERNATIONAL MATHEMATICS RESEARCH NOTICES Haber, N. 2014: 4804-4821
  • The Feynman propagator on perturbations of Minkowski space. Gell-Redman, J., Haber, N., Vasy, A. 2014
  • Microlocal analysis of Lagrangian submanifolds of radial points Stanford University Thesis Haber, N. 2013
  • Color-Permuting Automorphisms of Cayley Graphs Congressus Numerantium Albert, M., Bratz, J., Cahn, P., Fargus, T., Haber, N., McMahon, E., Smith, J., Tekansik, S. 2008; 190: 161-177