Bio
Jiajun Wu is an Assistant Professor of Computer Science and, by courtesy, of Psychology at Stanford University, working on computer vision, machine learning, robotics, and computational cognitive science. Before joining Stanford, he was a Visiting Faculty Researcher at Google Research. He received his PhD in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology. Wu's research has been recognized through the Young Investigator Programs (YIP) by ONR and by AFOSR, the NSF CAREER award, the Okawa research grant, the AI's 10 to Watch by IEEE Intelligent Systems, paper awards and finalists at ICCV, CVPR, SIGGRAPH Asia, ICRA, CoRL, and IROS, dissertation awards from ACM, AAAI, and MIT, the 2020 Samsung AI Researcher of the Year, and faculty research awards from Google, J.P. Morgan, Samsung, Amazon, and Meta.
Academic Appointments
-
Assistant Professor, Computer Science
-
Member, Bio-X
-
Faculty Affiliate, Institute for Human-Centered Artificial Intelligence (HAI)
-
Member, Wu Tsai Neurosciences Institute
Honors & Awards
-
Research Scholar Award, Google (2025)
-
Best Paper Award Finalist, ICRA, IEEE (2025)
-
Research Grant, Okawa Foundation (2024)
-
CAREER Award, NSF (2024)
-
Young Investigator Program (YIP), ONR (2024)
-
AI's 10 to Watch, IEEE Intelligent Systems (2024)
-
Best Paper Award, ICRA, IEEE (2024)
-
Innovators Under 35 Asia Pacific, MIT Technology Review (2024)
-
Young Investigator Program (YIP), AFOSR (2023)
-
Best Paper Award, SIGGRAPH Asia, ACM (2023)
-
Best Systems Paper Award, CoRL (2023)
-
Best Paper Award Finalist, ICCV, IEEE/CVF (2023)
-
Best Paper Award Candidate, CVPR, IEEE/CVF (2023)
-
Global Research Outreach (GRO) Award, Samsung (2023)
-
New Faculty Highlights, AAAI (2023)
-
Best Paper Award Nominee, CoRL (2022)
-
Faculty Research Award, J.P. Morgan (2022)
-
30 Under 30, Science, Forbes (2022)
-
Early Career Professor Award Finalist, Agilent (2022)
-
Research Award, Meta (2021)
-
Research Award, Amazon (2021)
-
AI Researcher of the Year, Samsung (2020)
-
Global Research Outreach (GRO) Award, Samsung (2020)
-
George M. Sprowls PhD Thesis Award in Artificial Intelligence and Decision-Making, MIT (2020)
-
Doctoral Dissertation Award Honorable Mention, ACM (2019)
-
Dissertation Award, AAAI/ACM SIGAI (2019)
-
PhD Fellowship, Facebook (2017-2019)
-
Best Paper Award on Cognitive Robotics, IROS, IEEE/RSJ (2018)
-
PhD Fellowship, Samsung (2016-2017)
-
Graduate Fellowship, Nvidia (2016-2017)
-
Research Fellowship, Adobe (2015)
-
Edwin S. Webster Fellowship, MIT (2014)
Program Affiliations
-
Stanford SystemX Alliance
-
Symbolic Systems Program
Professional Education
-
Ph.D., MIT, EECS
-
S.M., MIT, EECS
2025-26 Courses
-
Computer Graphics in the Era of AI
CS 348I (Spr)
-
Minds and Machines
CS 24, LINGUIST 35, PHIL 99, PSYCH 35, SYMSYS 1, SYMSYS 200 (Spr)
Independent Studies (16)
-
Advanced Reading and Research
CS 499 (Aut, Win, Spr)
-
Advanced Reading and Research
CS 499P (Aut, Win, Spr)
-
Curricular Practical Training
CS 390A (Aut, Win, Spr)
-
Curricular Practical Training
CS 390B (Aut, Win, Spr)
-
Curricular Practical Training
CS 390C (Aut, Win, Spr)
-
Graduate Research
NEPR 399 (Aut, Win, Spr, Sum)
-
Independent Project
CS 399 (Aut, Win, Spr)
-
Independent Project
CS 399P (Aut, Win, Spr)
-
Independent Study
SYMSYS 296 (Aut, Win, Spr)
-
Independent Work
CS 199 (Aut, Win, Spr)
-
Independent Work
CS 199P (Aut, Win, Spr)
-
Part-time Curricular Practical Training
CS 390D (Aut, Win, Spr)
-
Programming Service Project
CS 192 (Aut, Win, Spr)
-
Senior Project
CS 191 (Aut, Win, Spr)
-
Supervised Undergraduate Research
CS 195 (Aut, Win, Spr)
-
Writing Intensive Senior Research Project
CS 191W (Aut, Win, Spr)
Prior Year Courses
2024-25 Courses
-
Minds and Machines
CS 24, LINGUIST 35, PHIL 99, PSYCH 35, SYMSYS 1, SYMSYS 200 (Win)
2023-24 Courses
-
Computer Graphics in the Era of AI
CS 348I (Win)
-
Minds and Machines
CS 24, LINGUIST 35, PHIL 99, PSYCH 35, SYMSYS 1, SYMSYS 200 (Win)
2022-23 Courses
-
Minds and Machines
CS 24, LINGUIST 35, PHIL 99, PSYCH 35, SYMSYS 1, SYMSYS 200 (Aut)
Stanford Advisees
-
Doctoral Dissertation Reader (AC)
Joao Araujo, Hyunwoo Gu, Wanhee Lee, Keenon Werling, Josiah Wong
-
Postdoctoral Faculty Sponsor
Huang Huang, Yuhe Zhang
-
Doctoral Dissertation Advisor (AC)
Joy Hsu, Yunzhi Zhang
-
Master's Program Advisor
Aditya Bora, Willy Chan, Tim Chen, Yudong Chen, Abhinav Chinta, Vijay Daita, Koren Gilbai, Abel John, Akaash Kolluri, Anubha Mahajan, Ashley Raigosa, Eleanor Sigrest, Bhavna Sud, Tia Vasudeva, Matthew Vilaysack, Chuyi Zhang, Fangjun Zhou
-
Doctoral Dissertation Co-Advisor (AC)
Eric Chan, Cristobal Eyzaguirre, Chaitanya Patel, Kyle Sargent, Alexa Tartaglini
-
Doctoral (Program)
Chen Geng, Joy Hsu, Zizhang Li, Stephen Tian, Koven Yu, Yanjie Ze, Yunzhi Zhang
All Publications
-
A review of learning-based dynamics models for robotic manipulation.
Science Robotics
2025; 10 (106): eadt1497
Abstract
Dynamics models that predict the effects of physical interactions are essential for planning and control in robotic manipulation. Although models based on physical principles often generalize well, they typically require full-state information, which can be difficult or impossible to extract from perception data in complex, real-world scenarios. Learning-based dynamics models provide an alternative by deriving state transition functions purely from perceived interaction data, enabling the capture of complex, hard-to-model factors and predictive uncertainty and accelerating simulations that are often too slow for real-time control. Recent successes in this field have demonstrated notable advancements in robot capabilities, including long-horizon manipulation of deformable objects, granular materials, and complex multiobject interactions such as stowing and packing. A crucial aspect of these investigations is the choice of state representation, which determines the inductive biases in the learning system for reduced-order modeling of scene dynamics. This article provides a timely and comprehensive review of current techniques and trade-offs in designing learned dynamics models, highlighting their role in advancing robot capabilities through integration with state estimation and control and identifying critical research gaps for future exploration.
View details for DOI 10.1126/scirobotics.adt1497
View details for PubMedID 40961212
-
Physical scene understanding
AI MAGAZINE
2024
View details for DOI 10.1002/aaai.12148
View details for Web of Science ID 001158170300001
-
Neurosymbolic Models for Computer Graphics
COMPUTER GRAPHICS FORUM
2023; 42 (2): 545-568
View details for DOI 10.1111/cgf.14775
View details for Web of Science ID 001000062600040
-
REALIMPACT: A Dataset of Impact Sound Fields for Real Objects
IEEE COMPUTER SOC. 2023: 1516-1525
View details for DOI 10.1109/CVPR52729.2023.00152
View details for Web of Science ID 001058542601079
-
RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools
edited by Tan, J., Toussaint, M., Darvish, K.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
View details for Web of Science ID 001221201500029
-
Seeing a Rose in Five Thousand Ways
IEEE COMPUTER SOC. 2023: 962-971
View details for DOI 10.1109/CVPR52729.2023.00099
View details for Web of Science ID 001058542601026
-
OBJECTFOLDER 2.0: A Multisensory Object Dataset for Sim2Real Transfer
IEEE COMPUTER SOC. 2022: 10588-10598
View details for DOI 10.1109/CVPR52688.2022.01034
View details for Web of Science ID 000870759103065
-
3D Shape Generation and Completion through Point-Voxel Diffusion
IEEE. 2021: 5806-5815
View details for DOI 10.1109/ICCV48922.2021.00577
View details for Web of Science ID 000797698906004
-
Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
2019; 41 (9): 2236–50
Abstract
We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods that have tackled this problem in a deterministic or non-parametric way, we propose to model future frames in a probabilistic manner. Our probabilistic model makes it possible for us to sample and synthesize many possible future frames from a single input image. To synthesize realistic movement of objects, we propose a novel network structure, namely a Cross Convolutional Network; this network encodes image and motion information as feature maps and convolutional kernels, respectively. In experiments, our model performs well on synthetic data, such as 2D shapes and animated game sprites, and on real-world video frames. We present analyses of the learned network representations, showing it is implicitly learning a compact encoding of object appearance and motion. We also demonstrate a few of its applications, including visual analogy-making and video extrapolation.
View details for DOI 10.1109/TPAMI.2018.2854726
View details for Web of Science ID 000480343900014
View details for PubMedID 30004870
-
Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling
edited by Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2016
View details for Web of Science ID 000458973700060
-
Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning
edited by Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2015
View details for Web of Science ID 000450913101040
-
3D Congealing: 3D-Aware Image Alignment in the Wild
edited by Roth, S., Russakovsky, O., Sattler, T., Varol, G., Leonardis, A., Ricci, E.
SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 387-404
View details for DOI 10.1007/978-3-031-73232-4_22
View details for Web of Science ID 001346378300022
-
Controllable Human-Object Interaction Synthesis
edited by Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G.
SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 54-72
View details for DOI 10.1007/978-3-031-72940-9_4
View details for Web of Science ID 001410968100004
-
CRAFT: Designing Creative and Functional 3D Objects
IEEE COMPUTER SOC. 2025: 7215-7224
View details for DOI 10.1109/WACV61041.2025.00701
View details for Web of Science ID 001521272600212
-
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation
edited by Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G.
SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 388-406
View details for DOI 10.1007/978-3-031-72627-9_22
View details for Web of Science ID 001352783900022
-
Reconstruction and Simulation of Elastic Objects with Spring-Mass 3D Gaussians
edited by Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G.
SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 407-423
View details for DOI 10.1007/978-3-031-72627-9_23
View details for Web of Science ID 001352783900023
-
Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos
edited by Roth, S., Russakovsky, O., Sattler, T., Varol, G., Leonardis, A., Ricci, E.
SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 100-119
View details for DOI 10.1007/978-3-031-73232-4_6
View details for Web of Science ID 001346378300006
-
MOTIVATING INFORMATION-SEEKING BEHAVIORS FOR NEW TECHNOLOGY ACROSS THE LIFE SPAN
OXFORD UNIV PRESS. 2024: 647
View details for DOI 10.1093/geroni/igae098.2118
View details for Web of Science ID 001388133800001
-
DAILY AND TECHNOLOGICAL CHALLENGES AND NEEDS IN OLDER AGES: A MIXED METHODS STUDY
OXFORD UNIV PRESS. 2024
View details for DOI 10.1093/geroni/igae098.0561
View details for Web of Science ID 001394198700218
-
An Eulerian Vortex Method on Flow Maps
ACM TRANSACTIONS ON GRAPHICS
2024; 43 (6)
View details for DOI 10.1145/3687996
View details for Web of Science ID 001368359800001
-
Foundation models in robotics: Applications, challenges, and the future
INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH
2024
View details for DOI 10.1177/02783649241281508
View details for Web of Science ID 001319814900001
-
Partial-View Object View Synthesis via Filtering Inversion
IEEE COMPUTER SOC. 2024: 453-463
View details for DOI 10.1109/3DV62453.2024.00105
View details for Web of Science ID 001250581700033
-
Evaluating Real-World Robot Manipulation Policies in Simulation
edited by Kroemer, O., Agrawal, P., Burgard, W.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
View details for Web of Science ID 001483833800174
-
View-Invariant Policy Learning via Zero-Shot Novel View Synthesis
edited by Kroemer, O., Agrawal, P., Burgard, W.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
View details for Web of Science ID 001483833800057
-
Efficient Imitation Learning with Conservative World Models
edited by Abate, A., Cannon, M., Margellos, K., Papachristodoulou, A.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024: 1776-1789
View details for Web of Science ID 001347137500136
-
WonderJourney: Going from Anywhere to Everywhere
IEEE COMPUTER SOC. 2024: 6658-6667
View details for DOI 10.1109/CVPR52733.2024.00636
View details for Web of Science ID 001322555907006
-
Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners
IEEE COMPUTER SOC. 2024: 13269-13278
View details for DOI 10.1109/CVPR52733.2024.01260
View details for Web of Science ID 001342442404060
-
HOLODECK: Language Guided Generation of 3D Embodied AI Environments
IEEE COMPUTER SOC. 2024: 16227-16237
View details for DOI 10.1109/CVPR52733.2024.01536
View details for Web of Science ID 001342442407059
-
Physically Grounded Vision-Language Models for Robotic Manipulation
IEEE. 2024: 12462-12469
View details for DOI 10.1109/ICRA57147.2024.10610090
View details for Web of Science ID 001369728002120
-
CityPulse: Fine-Grained Assessment of Urban Change with Street View Time Series
edited by Wooldridge, M., Dy, J., Natarajan, S.
ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2024: 22123-22131
View details for Web of Science ID 001239985800028
-
DiffSound: Differentiable Modal Sound Rendering and Inverse Rendering for Diverse Inference Tasks
ASSOC COMPUTING MACHINERY. 2024
View details for DOI 10.1145/3641519.3657493
View details for Web of Science ID 001282218200099
-
Learning Compositional Behaviors from Demonstration and Language
edited by Kroemer, O., Agrawal, P., Burgard, W.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
View details for Web of Science ID 001483833800095
-
D³Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Rearrangement
edited by Kroemer, O., Agrawal, P., Burgard, W.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
View details for Web of Science ID 001483833800015
-
D³Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Rearrangement
edited by Kroemer, O., Agrawal, P., Burgard, W.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
View details for Web of Science ID 001483833800014
-
TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction
edited by Kroemer, O., Agrawal, P., Burgard, W.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
View details for Web of Science ID 001483833800080
-
Learning to Design 3D Printable Adaptations on Everyday Objects for Robot Manipulation
IEEE. 2024: 824-830
View details for DOI 10.1109/ICRA57147.2024.10610268
View details for Web of Science ID 001294576201003
-
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
IEEE COMPUTER SOC. 2024: 22401-22412
View details for DOI 10.1109/CVPR52733.2024.02114
View details for Web of Science ID 001342515505071
-
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
IEEE. 2024: 6892-6903
View details for DOI 10.1109/ICRA57147.2024.10611477
View details for Web of Science ID 001294576205021
-
Hearing Anything Anywhere
IEEE COMPUTER SOC. 2024: 11790-11799
View details for DOI 10.1109/CVPR52733.2024.01120
View details for Web of Science ID 001342442403015
-
Learning the 3D Fauna of the Web
IEEE COMPUTER SOC. 2024: 9752-9762
View details for DOI 10.1109/CVPR52733.2024.00931
View details for Web of Science ID 001342442401010
-
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image
IEEE COMPUTER SOC. 2024: 9420-9429
View details for DOI 10.1109/CVPR52733.2024.00900
View details for Web of Science ID 001342442400041
-
SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing
edited by Dy, J., Natarajan, S., Wooldridge, M.
ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2024: 5805-5813
View details for Web of Science ID 001239936300079
-
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding
IEEE COMPUTER SOC. 2024: 27081-27091
View details for DOI 10.1109/CVPR52733.2024.02558
View details for Web of Science ID 001344387503044
-
RoboCraft: Learning to see, simulate, and shape elasto-plastic objects in 3D with graph networks
INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH
2023
View details for DOI 10.1177/02783649231219020
View details for Web of Science ID 001126259900001
-
Editing Motion Graphics Video via Motion Vectorization and Transformation
ACM TRANSACTIONS ON GRAPHICS
2023; 42 (6)
View details for DOI 10.1145/3618316
View details for Web of Science ID 001139790400057
-
Fluid Simulation on Neural Flow Maps
ACM TRANSACTIONS ON GRAPHICS
2023; 42 (6)
View details for DOI 10.1145/3618392
View details for Web of Science ID 001139790400076
-
Object Motion Guided Human Motion Synthesis
ACM TRANSACTIONS ON GRAPHICS
2023; 42 (6)
View details for DOI 10.1145/3618333
View details for Web of Science ID 001139790400025
-
Differentiable Physics Simulation of Dynamics-Augmented Neural Objects
IEEE ROBOTICS AND AUTOMATION LETTERS
2023; 8 (5): 2780-2787
View details for DOI 10.1109/LRA.2023.3257707
View details for Web of Science ID 000964797800002
-
Rendering Humans from Object-Occluded Monocular Videos
IEEE COMPUTER SOC. 2023: 3216-3227
View details for DOI 10.1109/ICCV51070.2023.00300
View details for Web of Science ID 001159644303043
-
Learning Rational Subgoals from Demonstrations and Instructions
edited by Williams, B., Chen, Y., Neville, J.
ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2023: 12068-12078
View details for Web of Science ID 001243749200063
-
Benchmarking Rigid Body Contact Models
edited by Pappas, G. J., Matni, N., Morari, M.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
View details for Web of Science ID 001221742900113
-
Can Visual Scratchpads With Diagrammatic Abstractions Augment LLM Reasoning?
edited by Antoran, J., Blaas, A., Buchanan, K., Feng, F., Fortuin, Ghalebikesabi, S., Kriegler, A., Mason, Rohde, D., Ruiz, F. J., Uelwer, T., Xie, Y., Yang, R.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023: 21-28
View details for Web of Science ID 001347141700002
-
Compositional Diffusion-Based Continuous Constraint Solvers
edited by Tan, J., Toussaint, M., Darvish, K.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
View details for Web of Science ID 001221201503016
-
Composable Part-Based Manipulation
edited by Tan, J., Toussaint, M., Darvish, K.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
View details for Web of Science ID 001221201501019
-
Inferring Hybrid Neural Fluid Fields from Videos
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001224281507019
-
Disentanglement via Latent Quantization
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001220818801008
-
Ego-Body Pose Estimation via Ego-Head Pose Estimation
IEEE COMPUTER SOC. 2023: 17142-17151
View details for DOI 10.1109/CVPR52729.2023.01644
View details for Web of Science ID 001062531301043
-
Stanford-ORB: A Real-World 3D Object Inverse Rendering Benchmark
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001229751906005
-
What's Left? Concept Grounding with Logic-Enhanced Foundation Models
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001230083405016
-
NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations
IEEE COMPUTER SOC. 2023: 2614-2623
View details for DOI 10.1109/CVPR52729.2023.00257
View details for Web of Science ID 001058542602090
-
ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding
IEEE COMPUTER SOC. 2023: 1179-1189
View details for DOI 10.1109/CVPR52729.2023.00120
View details for Web of Science ID 001058542601047
-
3D Copy-Paste: Physically Plausible Object Insertion for Monocular 3D Detection
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001202273400003
-
Accidental Light Probes
IEEE COMPUTER SOC. 2023: 12521-12530
View details for DOI 10.1109/CVPR52729.2023.01205
View details for Web of Science ID 001062522104081
-
Multi-Object Manipulation via Object-Centric Neural Scattering Functions
IEEE COMPUTER SOC. 2023: 9021-9031
View details for DOI 10.1109/CVPR52729.2023.00871
View details for Web of Science ID 001062522101031
-
PyPose: A Library for Robot Learning with Physics-based Optimization
IEEE COMPUTER SOC. 2023: 22024-22034
View details for DOI 10.1109/CVPR52729.2023.02109
View details for Web of Science ID 001062531306035
-
Tree-Structured Shading Decomposition
IEEE COMPUTER SOC. 2023: 488-498
View details for DOI 10.1109/ICCV51070.2023.00051
View details for Web of Science ID 001159644300045
-
Model-Based Control with Sparse Neural Dynamics
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001226352807027
-
Learning to Design and Use Tools for Robotic Manipulation
edited by Tan, J., Toussaint, M., Darvish, K.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
View details for Web of Science ID 001221201500042
-
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
edited by Tan, J., Toussaint, M., Darvish, K.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
View details for Web of Science ID 001221201500025
-
NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities
edited by Tan, J., Toussaint, M., Darvish, K.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
View details for Web of Science ID 001221201501042
-
Siamese Masked Autoencoders
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001228825108029
-
Learning Sequential Acquisition Policies for Robot-Assisted Feeding
edited by Tan, J., Toussaint, M., Darvish, K.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
View details for Web of Science ID 001221201501018
-
Holistic Evaluation of Text-to-Image Models
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001224281504033
-
3D Neural Field Generation using Triplane Diffusion
IEEE COMPUTER SOC. 2023: 20875-20886
View details for DOI 10.1109/CVPR52729.2023.02000
View details for Web of Science ID 001062531305021
-
CIRCLE: Capture In Rich Contextual Environments
IEEE COMPUTER SOC. 2023: 21211-21221
View details for DOI 10.1109/CVPR52729.2023.02032
View details for Web of Science ID 001062531305053
-
Primitive Skill-based Robot Learning from Human Evaluative Feedback
IEEE. 2023: 7817-7824
View details for DOI 10.1109/IROS55552.2023.10341912
View details for Web of Science ID 001136907802021
-
SOUNDCAM: A Dataset for Finding Humans Using Room Acoustics
edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001229751904038
-
STAP: Sequencing Task-Agnostic Policies
IEEE. 2023: 7951-7958
View details for DOI 10.1109/ICRA48891.2023.10160220
View details for Web of Science ID 001048371101041
-
The OBJECTFOLDER BENCHMARK: Multisensory Learning with Neural and Real Objects
IEEE COMPUTER SOC. 2023: 17276-17286
View details for DOI 10.1109/CVPR52729.2023.01657
View details for Web of Science ID 001062531301056
-
Task-Driven Graph Attention for Hierarchical Relational Object Navigation
IEEE. 2023: 886-893
View details for DOI 10.1109/ICRA48891.2023.10161157
View details for Web of Science ID 001036713000053
-
SONICVERSE: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
IEEE. 2023: 704-711
View details for DOI 10.1109/ICRA48891.2023.10160461
View details for Web of Science ID 001036713000028
-
Putting People in Their Place: Affordance-Aware Human Insertion into Scenes
IEEE COMPUTER SOC. 2023: 17089-17099
View details for DOI 10.1109/CVPR52729.2023.01639
View details for Web of Science ID 001062531301038
-
VQ3D: Learning a 3D-Aware Generative Model on ImageNet
IEEE COMPUTER SOC. 2023: 4217-4227
View details for DOI 10.1109/ICCV51070.2023.00391
View details for Web of Science ID 001159644304044
-
Scene Synthesis from Human Motion
edited by Spencer, S. N.
ASSOC COMPUTING MACHINERY. 2022
View details for DOI 10.1145/3550469.3555426
View details for Web of Science ID 001074614400051
-
Unsupervised Segmentation in Real-World Images via Spelke Object Inference
edited by Avidan, S., Brostow, G., Cisse, M., Farinella, G. M., Hassner, T.
SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 719-735
View details for DOI 10.1007/978-3-031-19818-2_41
View details for Web of Science ID 000903735000041
-
Rotationally Equivariant 3D Object Detection
IEEE COMPUTER SOC. 2022: 1446-1454
View details for DOI 10.1109/CVPR52688.2022.00151
View details for Web of Science ID 000867754201068
-
Translating a Visual LEGO Manual to a Machine-Executable Plan
edited by Avidan, S., Brostow, G., Cisse, M., Farinella, G. M., Hassner, T.
SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 677-694
View details for DOI 10.1007/978-3-031-19836-6_38
View details for Web of Science ID 000903756400038
-
Video Extrapolation in Space and Time
edited by Avidan, S., Brostow, G., Cisse, M., Farinella, G. M., Hassner, T.
SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 313-333
View details for DOI 10.1007/978-3-031-19787-1_18
View details for Web of Science ID 000904102900018
-
Programmatic Concept Learning for Human Motion Description and Synthesis
IEEE COMPUTER SOC. 2022: 13833-13842
View details for DOI 10.1109/CVPR52688.2022.01347
View details for Web of Science ID 000870759106090
-
RoboCraft: Learning to See, Simulate, and Shape Elasto-Plastic Objects with Graph Networks
edited by Hauser, K., Shell, D., Huang, S.
RSS FOUNDATION-ROBOTICS SCIENCE & SYSTEMS FOUNDATION. 2022
View details for Web of Science ID 000827625700008
-
Revisiting the "Video" in Video-Language Understanding
IEEE COMPUTER SOC. 2022: 2907-2917
View details for DOI 10.1109/CVPR52688.2022.00293
View details for Web of Science ID 000867754203017
-
Repopulating Street Scenes
IEEE COMPUTER SOC. 2021: 5106-5115
View details for DOI 10.1109/CVPR46437.2021.00507
View details for Web of Science ID 000739917305031
-
Neural Radiance Flow for 4D View Synthesis and Video Processing
IEEE. 2021: 14304-14314
View details for DOI 10.1109/ICCV48922.2021.01406
View details for Web of Science ID 000798743204050
-
Learning Temporal Dynamics from Cycles in Narrated Video
IEEE. 2021: 1460-1469
View details for DOI 10.1109/ICCV48922.2021.00151
View details for Web of Science ID 000797698901064
-
Augmenting Policy Learning with Routines Discovered from a Single Demonstration
ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2021: 11024-11032
View details for Web of Science ID 000681269802081
-
KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control
IEEE COMPUTER SOC. 2021: 12778-12787
View details for DOI 10.1109/CVPR46437.2021.01259
View details for Web of Science ID 000742075002096
-
De-rendering the World's Revolutionary Artefacts
IEEE COMPUTER SOC. 2021: 6334-6343
View details for DOI 10.1109/CVPR46437.2021.00627
View details for Web of Science ID 000739917306054
-
Hierarchical Motion Understanding via Motion Programs
IEEE COMPUTER SOC. 2021: 6564-6572
View details for DOI 10.1109/CVPR46437.2021.00650
View details for Web of Science ID 000739917306077
-
pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
IEEE COMPUTER SOC. 2021: 5795-5805
View details for DOI 10.1109/CVPR46437.2021.00574
View details for Web of Science ID 000739917306001
-
Learning Generative Models of 3D Structures
WILEY. 2020: 643–66
View details for DOI 10.1111/cgf.14020
View details for Web of Science ID 000548709600052
-
End-to-End Optimization of Scene Layout
IEEE. 2020: 3753–62
View details for DOI 10.1109/CVPR42600.2020.00381
View details for Web of Science ID 000620679504003
-
DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs
edited by Bessiere, C.
IJCAI-INT JOINT CONF ARTIF INTELL. 2020: 4190-4198
View details for Web of Science ID 000764196704043
-
Accurate Vision-based Manipulation through Contact Reasoning
IEEE. 2020: 6738-6744
View details for Web of Science ID 000712319504073
-
Visual Grounding of Learned Physical Models
edited by Daume, H., Singh, A.
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2020
View details for Web of Science ID 000683178506006
-
Perspective Plane Program Induction from a Single Image
IEEE. 2020: 4433–42
View details for DOI 10.1109/CVPR42600.2020.00449
View details for Web of Science ID 000620679504071
-
Video Enhancement with Task-Oriented Flow
INTERNATIONAL JOURNAL OF COMPUTER VISION
2019; 127 (8): 1106–25
View details for DOI 10.1007/s11263-018-01144-2
View details for Web of Science ID 000474559000008
-
An integrative computational architecture for object-driven cortex
CURRENT OPINION IN NEUROBIOLOGY
2019; 55: 73–81
Abstract
Objects in motion activate multiple cortical regions in every lobe of the human brain. Do these regions represent a collection of independent systems, or is there an overarching functional architecture spanning all of object-driven cortex? Inspired by recent work in artificial intelligence (AI), machine learning, and cognitive science, we consider the hypothesis that these regions can be understood as a coherent network implementing an integrative computational system that unifies the functions needed to perceive, predict, reason about, and plan with physical objects, as in the paradigmatic case of using or making tools. Our proposal draws on a modeling framework that combines multiple AI methods, including causal generative models, hybrid symbolic-continuous planning algorithms, and neural recognition networks, with object-centric, physics-based representations. We review evidence relating specific components of our proposal to the specific regions that comprise object-driven cortex, and lay out future research directions with the goal of building a complete functional and mechanistic account of this system.
View details for DOI 10.1016/j.conb.2019.01.010
View details for Web of Science ID 000472127600011
View details for PubMedID 30825704
View details for PubMedCentralID PMC6548583
-
See, feel, act: Hierarchical learning for complex manipulation skills with multisensory fusion
SCIENCE ROBOTICS
2019; 4 (26)
View details for DOI 10.1126/scirobotics.aav3123
View details for Web of Science ID 000458560100005
-
Visual Concept-Metaconcept Learning
edited by Wallach, H., Larochelle, H., Beygelzimer, A., d'Alche-Buc, F., Fox, E., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
View details for Web of Science ID 000534424305005
-
Combining Physical Simulators and Object-Based Networks for Control
edited by Howard, A., Althoefer, K., Arai, F., Arrichiello, F., Caputo, B., Castellanos, J., Hauser, K., Isler, Kim, J., Liu, H., Oh, P., Santos, Scaramuzza, D., Ude, A., Voyles, R., Yamane, K., Okamura, A.
IEEE. 2019: 3217–23
View details for Web of Science ID 000494942302054
-
Propagation Networks for Model-Based Control Under Partial Observation
edited by Howard, A., Althoefer, K., Arai, F., Arrichiello, F., Caputo, B., Castellanos, J., Hauser, K., Isler, Kim, J., Liu, H., Oh, P., Santos, Scaramuzza, D., Ude, A., Voyles, R., Yamane, K., Okamura, A.
IEEE. 2019: 1205–11
View details for Web of Science ID 000494942300127
-
ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics
edited by Howard, A., Althoefer, K., Arai, F., Arrichiello, F., Caputo, B., Castellanos, J., Hauser, K., Isler, Kim, J., Liu, H., Oh, P., Santos, Scaramuzza, D., Ude, A., Voyles, R., Yamane, K., Okamura, A.
IEEE. 2019: 6265–71
View details for Web of Science ID 000494942304086
-
Program-Guided Image Manipulators
IEEE COMPUTER SOC. 2019: 4029–38
View details for DOI 10.1109/ICCV.2019.00413
View details for Web of Science ID 000531438104018
-
Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations
edited by Wallach, H., Larochelle, H., Beygelzimer, A., d'Alche-Buc, F., Fox, E., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
View details for Web of Science ID 000535866900056
-
Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning
INTERNATIONAL JOURNAL OF COMPUTER VISION
2018; 126 (10): 1120–37
View details for DOI 10.1007/s11263-018-1083-5
View details for Web of Science ID 000443018400004
-
3D Interpreter Networks for Viewer-Centered Wireframe Modeling
INTERNATIONAL JOURNAL OF COMPUTER VISION
2018; 126 (9): 1009–26
View details for DOI 10.1007/s11263-018-1074-6
View details for Web of Science ID 000441553300008
-
Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing
edited by Maciejewski, A. A., Okamura, A., Bicchi, A., Stachniss, C., Song, D. Z., Lee, D. H., Chaumette, F., Ding, H., Li, J. S., Wen, J., Roberts, J., Masamune, K., Chong, N. Y., Amato, N., Tsagwarakis, N., Rocco, P., Asfour, T., Chung, W. K., Yasuyoshi, Y., Sun, Y., Maciekeski, T., Althoefer, K., AndradeCetto, J., Chung, W. K., Demircan, E., Dias, J., Fraisse, P., Gross, R., Harada, H., Hasegawa, Y., Hayashibe, M., Kiguchi, K., Kim, K., Kroeger, T., Li, Y., Ma, S., Mochiyama, H., Monje, C. A., Rekleitis, Roberts, R., Stulp, F., Tsai, C. H., Zollo, L.
IEEE. 2018: 3066–73
View details for Web of Science ID 000458872702129
-
Unsupervised Learning of Latent Physical Properties Using Perception-Prediction Networks
edited by Globerson, A., Silva, R.
AUAI PRESS. 2018: 497–507
View details for Web of Science ID 000493119200049
-
Physical Primitive Decomposition
edited by Ferrari, Hebert, M., Sminchisescu, C., Weiss, Y.
SPRINGER INTERNATIONAL PUBLISHING AG. 2018: 3–20
View details for DOI 10.1007/978-3-030-01258-8_1
View details for Web of Science ID 000604449400001
-
Seeing Tree Structure from Vibration
edited by Ferrari, Hebert, M., Sminchisescu, C., Weiss, Y.
SPRINGER INTERNATIONAL PUBLISHING AG. 2018: 762–79
View details for DOI 10.1007/978-3-030-01240-3_46
View details for Web of Science ID 000594233000046
-
MoSculp: Interactive Visualization of Shape and Time
ASSOC COMPUTING MACHINERY. 2018: 275–85
View details for DOI 10.1145/3242587.3242592
View details for Web of Science ID 000494260500025
-
Learning to Reconstruct Shapes from Unseen Classes
edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
View details for Web of Science ID 000461823302028
-
Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
IEEE. 2018: 7834–43
View details for DOI 10.1109/CVPR.2018.00817
View details for Web of Science ID 000457843607102
-
Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling
IEEE. 2018: 2974–83
View details for DOI 10.1109/CVPR.2018.00314
View details for Web of Science ID 000457843603012
-
3D Shape Perception from Monocular Vision, Touch, and Shape Priors
edited by Maciejewski, A. A., Okamura, A., Bicchi, A., Stachniss, C., Song, D. Z., Lee, D. H., Chaumette, F., Ding, H., Li, J. S., Wen, J., Roberts, J., Masamune, K., Chong, N. Y., Amato, N., Tsagwarakis, N., Rocco, P., Asfour, T., Chung, W. K., Yasuyoshi, Y., Sun, Y., Maciekeski, T., Althoefer, K., AndradeCetto, J., Chung, W. K., Demircan, E., Dias, J., Fraisse, P., Gross, R., Harada, H., Hasegawa, Y., Hayashibe, M., Kiguchi, K., Kim, K., Kroeger, T., Li, Y., Ma, S., Mochiyama, H., Monje, C. A., Rekleitis, Roberts, R., Stulp, F., Tsai, C. H., Zollo, L.
IEEE. 2018: 1606–13
View details for Web of Science ID 000458872701105
-
3D-Aware Scene Manipulation via Inverse Graphics
edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
View details for Web of Science ID 000461823301084
-
Visual Object Networks: Image Generation with Disentangled 3D Representation
edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
View details for Web of Science ID 000461823300012
-
Learning to Exploit Stability for 3D Scene Parsing
edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
View details for Web of Science ID 000461823301069
-
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
View details for Web of Science ID 000461823301006
-
Shape and Material from Sound
edited by Guyon, Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
View details for Web of Science ID 000452649401031
-
MarrNet: 3D Shape Reconstruction via 2.5D Sketches
edited by Guyon, Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
View details for Web of Science ID 000452649400052
-
Synthesizing 3D Shapes via Modeling Multi-View Depth Maps and Silhouettes with Deep Generative Networks
IEEE. 2017: 2511–19
View details for DOI 10.1109/CVPR.2017.269
View details for Web of Science ID 000418371402061
-
Neural Scene De-rendering
IEEE. 2017: 7035–43
View details for DOI 10.1109/CVPR.2017.744
View details for Web of Science ID 000418371407015
-
Raster-to-Vector: Revisiting Floorplan Transformation
IEEE. 2017: 2214–22
View details for DOI 10.1109/ICCV.2017.241
View details for Web of Science ID 000425498402029
-
Generative Modeling of Audible Shapes for Object Perception
IEEE. 2017: 1260–69
View details for DOI 10.1109/ICCV.2017.141
View details for Web of Science ID 000425498401034
-
Self-Supervised Intrinsic Image Decomposition
edited by Guyon, Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
View details for Web of Science ID 000452649406002
-
Learning to See Physics via Visual De-animation
edited by Guyon, Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
View details for Web of Science ID 000452649400015
-
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
edited by Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, Garnett, R.
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2016
View details for Web of Science ID 000458973704076
-
Ambient Sound Provides Supervision for Visual Learning
edited by Leibe, B., Matas, J., Sebe, N., Welling, M.
SPRINGER INTERNATIONAL PUBLISHING AG. 2016: 801–16
View details for DOI 10.1007/978-3-319-46448-0_48
View details for Web of Science ID 000389382700048
-
Single Image 3D Interpreter Network
edited by Leibe, B., Matas, J., Sebe, N., Welling, M.
SPRINGER INTERNATIONAL PUBLISHING AG. 2016: 365–82
View details for DOI 10.1007/978-3-319-46466-4_22
View details for Web of Science ID 000389499900022
-
Unsupervised Object Class Discovery via Saliency-Guided Multiple Class Learning
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
2015; 37 (4): 862–75
Abstract
In this paper, we tackle the problem of common object (multiple classes) discovery from a set of input images, where we assume the presence of one object class in each image. This problem is, loosely speaking, unsupervised since we do not know a priori the object type, location, and scale in each image. We observe that the general task of object class discovery in a fully unsupervised manner is intrinsically ambiguous; here we adopt saliency detection to propose candidate image windows/patches to turn an unsupervised learning problem into a weakly-supervised learning problem. In the paper, we propose an algorithm for simultaneously localizing objects and discovering object classes via bottom-up (saliency-guided) multiple class learning (bMCL). Our contributions are three-fold: (1) we adopt saliency detection to convert unsupervised learning into multiple instance learning, formulated as bottom-up multiple class learning (bMCL); (2) we propose an integrated framework that simultaneously performs object localization, object class discovery, and object detector training; (3) we demonstrate that our framework yields significant improvements over existing methods for multi-class object discovery and possesses evident advantages over competing methods in computer vision. In addition, although saliency detection has recently attracted much attention, its practical usage for high-level vision tasks has yet to be justified. Our method validates the usefulness of saliency detection to output "noisy input" for a top-down method to extract common patterns.
View details for DOI 10.1109/TPAMI.2014.2353617
View details for Web of Science ID 000351213400012
View details for PubMedID 26353299
-
Deep Multiple Instance Learning for Image Classification and Auto-Annotation
IEEE. 2015: 3460–69
View details for Web of Science ID 000387959203053
-
MILCut: A Sweeping Line Multiple Instance Learning Paradigm for Interactive Image Segmentation
IEEE. 2014: 256–63
View details for DOI 10.1109/CVPR.2014.40
View details for Web of Science ID 000361555600033
-
Harvesting Mid-level Visual Concepts from Large-scale Internet Images
IEEE. 2013: 851–58
View details for DOI 10.1109/CVPR.2013.115
View details for Web of Science ID 000331094300108
-
A classification approach to coreference in discharge summaries: 2011 i2b2 challenge
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION
2012; 19 (5): 897–905
Abstract
To create a highly accurate coreference system in discharge summaries for the 2011 i2b2 challenge. The coreference categories include Person, Problem, Treatment, and Test. An integrated coreference resolution system was developed by exploiting Person attributes, contextual semantic clues, and world knowledge. It includes three subsystems: Person coreference system based on three Person attributes, Problem/Treatment/Test system based on numerous contextual semantic extractors and world knowledge, and Pronoun system based on a multi-class support vector machine classifier. The three Person attributes are patient, relative and hospital personnel. Contextual semantic extractors include anatomy, position, medication, indicator, temporal, spatial, section, modifier, equipment, operation, and assertion. The world knowledge is extracted from external resources such as Wikipedia. Micro-averaged precision, recall and F-measure in MUC, BCubed and CEAF were used to evaluate results. The system achieved an overall micro-averaged precision, recall and F-measure of 0.906, 0.925, and 0.915, respectively, on test data (from four hospitals) released by the challenge organizers. It achieved a precision, recall and F-measure of 0.905, 0.920 and 0.913, respectively, on test data without Pittsburgh data. We ranked first out of 20 competing teams. Among the four sub-tasks on Person, Problem, Treatment, and Test, the highest F-measure was seen for Person coreference. This system achieved encouraging results. The Person system can determine whether personal pronouns and proper names are coreferent or not. The Problem/Treatment/Test system benefits from both world knowledge in evaluating the similarity of two mentions and contextual semantic extractors in identifying semantic clues. The Pronoun system can automatically detect whether a Pronoun mention is coreferent to that of the other four types. This study demonstrates that it is feasible to accomplish the coreference task in discharge summaries.
View details for DOI 10.1136/amiajnl-2011-000734
View details for Web of Science ID 000307934600030
View details for PubMedID 22505762
View details for PubMedCentralID PMC3422828
-
Unsupervised Object Class Discovery via Saliency-Guided Multiple Class Learning
IEEE. 2012: 3218–25
View details for Web of Science ID 000309166203049
https://orcid.org/0000-0002-4176-343X