Bio


Jiajun Wu is an Assistant Professor of Computer Science and, by courtesy, of Psychology at Stanford University, working on computer vision, machine learning, robotics, and computational cognitive science. Before joining Stanford, he was a Visiting Faculty Researcher at Google Research. He received his PhD in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology. Wu's research has been recognized through Young Investigator Program (YIP) awards from ONR and AFOSR, the NSF CAREER Award, a research grant from the Okawa Foundation, selection as one of AI's 10 to Watch by IEEE Intelligent Systems, paper awards and finalists at ICCV, CVPR, SIGGRAPH Asia, ICRA, CoRL, and IROS, dissertation awards from ACM, AAAI, and MIT, the 2020 Samsung AI Researcher of the Year award, and faculty research awards from Google, J.P. Morgan, Samsung, Amazon, and Meta.

Honors & Awards


  • Research Scholar Award, Google (2025)
  • Best Paper Award Finalist, ICRA, IEEE (2025)
  • Research Grant, Okawa Foundation (2024)
  • CAREER Award, NSF (2024)
  • Young Investigator Program (YIP), ONR (2024)
  • AI's 10 to Watch, IEEE Intelligent Systems (2024)
  • Best Paper Award, ICRA, IEEE (2024)
  • Innovators Under 35 Asia Pacific, MIT Technology Review (2024)
  • Young Investigator Program (YIP), AFOSR (2023)
  • Best Paper Award, SIGGRAPH Asia, ACM (2023)
  • Best Systems Paper Award, CoRL (2023)
  • Best Paper Award Finalist, ICCV, IEEE/CVF (2023)
  • Best Paper Award Candidate, CVPR, IEEE/CVF (2023)
  • Global Research Outreach (GRO) Award, Samsung (2023)
  • New Faculty Highlights, AAAI (2023)
  • Best Paper Award Nominee, CoRL (2022)
  • Faculty Research Award, J.P. Morgan (2022)
  • 30 Under 30, Science, Forbes (2022)
  • Early Career Professor Award Finalist, Agilent (2022)
  • Research Award, Meta (2021)
  • Research Award, Amazon (2021)
  • AI Researcher of the Year, Samsung (2020)
  • Global Research Outreach (GRO) Award, Samsung (2020)
  • George M. Sprowls PhD Thesis Award in Artificial Intelligence and Decision-Making, MIT (2020)
  • Doctoral Dissertation Award Honorable Mention, ACM (2019)
  • Dissertation Award, AAAI/ACM SIGAI (2019)
  • PhD Fellowship, Facebook (2017–2019)
  • Best Paper Award on Cognitive Robotics, IROS, IEEE/RSJ (2018)
  • PhD Fellowship, Samsung (2016–2017)
  • Graduate Fellowship, Nvidia (2016–2017)
  • Research Fellowship, Adobe (2015)
  • Edwin S. Webster Fellowship, MIT (2014)

Program Affiliations


  • Stanford SystemX Alliance
  • Symbolic Systems Program

Professional Education


  • Ph.D., MIT, EECS
  • S.M., MIT, EECS

All Publications


  • A review of learning-based dynamics models for robotic manipulation SCIENCE ROBOTICS Ai, B., Tian, S., Shi, H., Wang, Y., Pfaff, T., Tan, C., Christensen, H. I., Su, H., Wu, J., Li, Y. 2025; 10 (106): eadt1497

    Abstract

    Dynamics models that predict the effects of physical interactions are essential for planning and control in robotic manipulation. Although models based on physical principles often generalize well, they typically require full-state information, which can be difficult or impossible to extract from perception data in complex, real-world scenarios. Learning-based dynamics models provide an alternative by deriving state transition functions purely from perceived interaction data, enabling the capture of complex, hard-to-model factors and predictive uncertainty and accelerating simulations that are often too slow for real-time control. Recent successes in this field have demonstrated notable advancements in robot capabilities, including long-horizon manipulation of deformable objects, granular materials, and complex multiobject interactions such as stowing and packing. A crucial aspect of these investigations is the choice of state representation, which determines the inductive biases in the learning system for reduced-order modeling of scene dynamics. This article provides a timely and comprehensive review of current techniques and trade-offs in designing learned dynamics models, highlighting their role in advancing robot capabilities through integration with state estimation and control and identifying critical research gaps for future exploration.

    DOI: 10.1126/scirobotics.adt1497

    PubMedID: 40961212

  • Physical scene understanding AI MAGAZINE Wu, J. 2024

    DOI: 10.1002/aaai.12148

    Web of Science ID: 001158170300001

  • Neurosymbolic Models for Computer Graphics COMPUTER GRAPHICS FORUM Ritchie, D., Guerrero, P., Jones, R., Mitra, N. J., Schulz, A., Willis, K. D. D., Wu, J. 2023; 42 (2): 545-568

    DOI: 10.1111/cgf.14775

    Web of Science ID: 001000062600040

  • Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE Xue, T., Wu, J., Bouman, K. L., Freeman, W. T. 2019; 41 (9): 2236–50

    Abstract

    We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods that have tackled this problem in a deterministic or non-parametric way, we propose to model future frames in a probabilistic manner. Our probabilistic model makes it possible for us to sample and synthesize many possible future frames from a single input image. To synthesize realistic movement of objects, we propose a novel network structure, namely a Cross Convolutional Network; this network encodes image and motion information as feature maps and convolutional kernels, respectively. In experiments, our model performs well on synthetic data, such as 2D shapes and animated game sprites, and on real-world video frames. We present analyses of the learned network representations, showing it is implicitly learning a compact encoding of object appearance and motion. We also demonstrate a few of its applications, including visual analogy-making and video extrapolation.

    DOI: 10.1109/TPAMI.2018.2854726

    Web of Science ID: 000480343900014

    PubMedID: 30004870

  • 3D Congealing: 3D-Aware Image Alignment in the Wild Zhang, Y., Li, Z., Raj, A., Engelhardt, A., Li, Y., Hou, T., Wu, J., Jampani, V. edited by Roth, S., Russakovsky, O., Sattler, T., Varol, G., Leonardis, A., Ricci, E. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 387-404
  • Controllable Human-Object Interaction Synthesis Li, J., Clegg, A., Mottaghi, R., Wu, J., Puig, X., Liu, C. edited by Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 54-72
  • CRAFT: Designing Creative and Functional 3D Objects Guo, M., Tang, M., Cha, H., Zhang, R., Liu, C., Wu, J., IEEE COMPUTER SOC IEEE COMPUTER SOC. 2025: 7215-7224
  • PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation Zhang, T., Yu, H., Wu, R., Feng, B. Y., Zheng, C., Snavely, N., Wu, J., Freeman, W. T. edited by Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 388-406
  • Reconstruction and Simulation of Elastic Objects with Spring-Mass 3D Gaussians Zhong, L., Yu, H., Wu, J., Li, Y. edited by Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 407-423
  • Ponymation: Learning Articulated 3D Animal Motions from Unlabeled Online Videos Sun, K., Litvak, D., Zhang, Y., Li, H., Wu, J., Wu, S. edited by Roth, S., Russakovsky, O., Sattler, T., Varol, G., Leonardis, A., Ricci, E. SPRINGER INTERNATIONAL PUBLISHING AG. 2025: 100-119
  • Motivating Information-Seeking Behaviors for New Technology Across the Life Span Chu, L., Patterson, K., Kim, T., Srivastava, S., Zhang, R., Wu, J., Li, F., Carstensen, L. OXFORD UNIV PRESS. 2024: 647
  • Daily and Technological Challenges and Needs in Older Ages: A Mixed Methods Study Cruz, M., Chu, L., Gomezjurado Gonzalez, L., Zhang, R., Wu, J., Fei-Fei, L., Carstensen, L. OXFORD UNIV PRESS. 2024
  • An Eulerian Vortex Method on Flow Maps ACM TRANSACTIONS ON GRAPHICS Wang, S., Deng, Y., Deng, M., Yu, H., Zhou, J., Chen, D., Komura, T., Wu, J., Zhu, B. 2024; 43 (6)

    DOI: 10.1145/3687996

    Web of Science ID: 001368359800001

  • Foundation models in robotics: Applications, challenges, and the future INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH Firoozi, R., Tucker, J., Tian, S., Majumdar, A., Sun, J., Liu, W., Zhu, Y., Song, S., Kapoor, A., Hausman, K., Ichter, B., Driess, D., Wu, J., Lu, C., Schwager, M. 2024
  • Partial-View Object View Synthesis via Filtering Inversion Sun, F., Tremblay, J., Blukis, V., Lin, K., Xu, D., Ivanovic, B., Karkus, P., Birchfield, S., Fox, D., Zhang, R., Li, Y., Wu, J., Pavone, M., Haber, N., IEEE COMPUTER SOC IEEE COMPUTER SOC. 2024: 453-463
  • Evaluating Real-World Robot Manipulation Policies in Simulation Li, X., Hsu, K., Gu, J., Pertsch, K., Mees, O., Walke, H., Fu, C., Lunawat, I., Sieh, I., Kirmani, S., Levine, S., Wu, J., Finn, C., Su, H., Vuong, Q., Xiao, T. edited by Kroemer, O., Agrawal, P., Burgard, W. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
  • View-Invariant Policy Learning via Zero-Shot Novel View Synthesis Tian, S., Wulfe, B., Sargent, K., Liu, K., Zakharov, S., Guizilini, V., Wu, J. edited by Kroemer, O., Agrawal, P., Burgard, W. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
  • Efficient Imitation Learning with Conservative World Models Kolev, V., Rafailov, R., Hatch, K., Wu, J., Finn, C. edited by Abate, A., Cannon, M., Margellos, K., Papachristodoulou, A. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024: 1776-1789
  • WonderJourney: Going from Anywhere to Everywhere Yu, H., Duan, H., Hur, J., Sargent, K., Rubinstein, M., Freeman, W. T., Cole, F., Sun, D., Snavely, N., Wu, J., Herrmann, C., IEEE COMPUTER SOC IEEE COMPUTER SOC. 2024: 6658-6667
  • Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners Feng, C., Hsu, J., Liu, W., Wu, J., IEEE IEEE COMPUTER SOC. 2024: 13269-13278
  • HOLODECK: Language Guided Generation of 3D Embodied AI Environments Yang, Y., Sun, F., Weihs, L., Vanderbilt, E., Herrasti, A., Han, W., Wu, J., Haber, N., Krishna, R., Liu, L., Callison-Burch, C., Yatskar, M., Kembhavi, A., Clark, C., IEEE IEEE COMPUTER SOC. 2024: 16227-16237
  • Physically Grounded Vision-Language Models for Robotic Manipulation Gao, J., Sarkar, B., Xia, F., Xiao, T., Wu, J., Ichter, B., Majumdar, A., Sadigh, D., IEEE IEEE. 2024: 12462-12469
  • CityPulse: Fine-Grained Assessment of Urban Change with Street View Time Series Huang, T., Wu, Z., Wu, J., Hwang, J., Rajagopal, R. edited by Wooldridge, M., Dy, J., Natarajan, S. ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2024: 22123-22131
  • DiffSound: Differentiable Modal Sound Rendering and Inverse Rendering for Diverse Inference Tasks Jin, X., Xu, C., Gao, R., Wu, J., Wang, G., Li, S., Spencer, S. ASSOC COMPUTING MACHINERY. 2024
  • Learning Compositional Behaviors from Demonstration and Language Liu, W., Nie, N., Zhang, R., Mao, J., Wu, J. edited by Kroemer, O., Agrawal, P., Burgard, W. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
  • D³Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Rearrangement Wang, Y., Zhang, M., Li, Z., Kelestemur, T., Driggs-Campbell, K., Wu, J., Fei-Fei, L., Li, Y. edited by Kroemer, O., Agrawal, P., Burgard, W. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
  • TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction Jiang, Y., Wang, C., Zhang, R., Wu, J., Fei-Fei, L. edited by Kroemer, O., Agrawal, P., Burgard, W. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2024
  • Learning to Design 3D Printable Adaptations on Everyday Objects for Robot Manipulation Guo, M., Liu, Z., Tian, S., Xie, Z., Wu, J., Liu, C., IEEE IEEE. 2024: 824-830
  • BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation Ge, Y., Tang, Y., Xu, J., Gokmen, C., Li, C., Ai, W., Martinez, B., Aydin, A., Anvari, M., Chakravarthy, A. K., Yu, H., Wong, J., Srivastava, S., Lee, S., Zhang, S., Itti, L., Li, Y., Martin-Martin, R., Liu, M., Zhang, P., Zhang, R., Fei-Fei, L., Wu, J., IEEE IEEE COMPUTER SOC. 2024: 22401-22412
  • Open X-Embodiment: Robotic Learning Datasets and RT-X Models O'Neill, A., Rehman, A., Gupta, A., Maddukuri, A., Gupta, A., Padalkar, A., Lee, A., Pooley, A., Gupta, A., Mandlekar, A., Jain, A., Tung, A., Bewley, A., Herzog, A., Irpan, A., Khazatsky, A., Rai, A., Gupta, A., Wang, A., Kolobov, A., Singh, A., Garg, A., Kembhavi, A., Xie, A., Brohan, A., Raffin, A., Sharma, A., Yavary, A., Jain, A., Balakrishna, A., Wahid, A., Burgess-Limerick, B., Kim, B., Scholkopf, B., Wulfe, B., Ichter, B., Lu, C., Xu, C., Le, C., Finn, C., Wang, C., Xu, C., Chi, C., Huang, C., Chan, C., Agia, C., Pan, C., Fu, C., Devin, C., Xu, D., Morton, D., Driess, D., Chen, D., Pathak, D., Shah, D., Buchler, D., Jayaraman, D., Kalashnikov, D., Sadigh, D., Johns, E., Foster, E., Liu, F., Ceola, F., Xia, F., Zhao, F., Frujeri, F., Stulp, F., Zhou, G., Sukhatme, G. S., Salhotra, G., Yan, G., Feng, G., Schiavi, G., Berseth, G., Kahn, G., Yang, G., Wang, G., Su, H., Fang, H., Shi, H., Bao, H., Ben Amor, H., Christensen, H., Furuta, H., Bharadhwaj, H., Walke, H., Fang, H., Ha, H., Mordatch, I., Radosavovic, I., Leal, I., Liang, J., Abou-Chakra, J., Kim, J., Drake, J., Peters, J., Schneider, J., Hsu, J., Vakil, J., Bohg, J., Bingham, J., Wu, J., Gao, J., Hu, J., Wu, J., Wu, J., Sun, J., Luo, J., Gu, J., Tan, J., Oh, J., Wu, J., Lu, J., Yang, J., Malik, J., Silverio, J., Hejna, J., Booher, J., Tompson, J., Yang, J., Salvador, J., Lim, J. J., Han, J., Wang, K., Rao, K., Pertsch, K., Hausman, K., Go, K., Gopalakrishnan, K., Goldberg, K., Byrne, K., Oslund, K., Kawaharazuka, K., Black, K., Lin, K., Zhang, K., Ehsani, K., Lekkala, K., Ellis, K., Rana, K., Srinivasan, K., Fang, K., Singh, K., Zeng, K., Hatch, K., Hsu, K., Itti, L., Chen, L., Pinto, L., Li Fei-Fei, Tan, L., Fan, L., Ott, L., Lee, L., Weihs, L., Chen, M., Lepert, M., Memmel, M., Tomizuka, M., Itkina, M., Castro, M., Spero, M., Du, M., Ahn, M., Yip, M. C., Zhang, M., Ding, M., Heo, M., Srirama, M., Sharma, M., Kim, M., Kanazawa, N., Hansen, N., Heess, N., Joshi, N. J., Suenderhauf, N., Liu, N., Di Palo, N., Shafiullah, N., Mees, O., Kroemer, O., Bastani, O., Sanketi, P. R., Miller, P., Yin, P., Wohlhart, P., Xu, P., Fagan, P., Mitrano, P., Sermanet, P., Abbeel, P., Sundaresan, P., Chen, Q., Vuong, Q., Rafailov, R., Tian, R., Doshi, R., Martin-Martin, R., Baijal, R., Scalise, R., Hendrix, R., Lin, R., Qian, R., Zhang, R., Mendonca, R., Shah, R., Hoque, R., Julian, R., Bustamante, S., Kirmani, S., Levine, S., Lin, S., Moore, S., Bahl, S., Dass, S., Sonawani, S., Tulsiani, S., Song, S., Xu, S., Haldar, S., Karamcheti, S., Adebola, S., Guist, S., Nasiriany, S., Schaal, S., Welker, S., Tian, S., Ramamoorthy, S., Dasari, S., Belkhale, S., Park, S., Nair, S., Mirchandani, S., Osa, T., Gupta, T., Harada, T., Matsushima, T., Xiao, T., Kollar, T., Yu, T., Ding, T., Davchev, T., Zhao, T. Z., Armstrong, T., Darrell, T., Chung, T., Jain, V., Kumar, V., Vanhoucke, V., Zhan, W., Zhou, W., Burgard, W., Chen, X., Chen, X., Wang, X., Zhu, X., Geng, X., Liu, X., Xu Liangwei, Li, X., Pang, Y., Lu, Y., Ma, Y., Kim, Y., Chebotar, Y., Zhou, Y., Zhu, Y., Wu, Y., Xu, Y., Wang, Y., Bisk, Y., Dou, Y., Cho, Y., Lee, Y., Cui, Y., Cao, Y., Wu, Y., Tang, Y., Zhu, Y., Zhang, Y., Jiang, Y., Li, Y., Li, Y., Iwasawa, Y., Matsuo, Y., Ma, Z., Xu, Z., Cui, Z., Zhang, Z., Fu, Z., Lin, Z., IEEE IEEE. 2024: 6892-6903
  • Hearing Anything Anywhere Wang, M., Sawata, R., Clark, S., Gao, R., Wu, S., Wu, J., IEEE IEEE COMPUTER SOC. 2024: 11790-11799
  • Learning the 3D Fauna of the Web Li, Z., Litvak, D., Li, R., Zhang, Y., Jakab, T., Rupprecht, C., Wu, S., Vedaldi, A., Wu, J., IEEE IEEE COMPUTER SOC. 2024: 9752-9762
  • ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image Sargent, K., Li, Z., Shah, T., Herrmann, C., Yu, H., Zhang, Y., Chan, E., Lagun, D., Fei-Fei, L., Sun, D., Wu, J., IEEE IEEE COMPUTER SOC. 2024: 9420-9429
  • SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing Wang, Z., Prabha, R., Huang, T., Wu, J., Rajagopal, R. edited by Dy, J., Natarajan, S., Wooldridge, M. ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2024: 5805-5813
  • ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding Xue, L., Yu, N., Zhang, S., Panagopoulou, A., Li, J., Martin-Martin, R., Wu, J., Xiong, C., Xu, R., Niebles, J., Savarese, S., IEEE Comp Soc IEEE COMPUTER SOC. 2024: 27081-27091
  • RoboCraft: Learning to see, simulate, and shape elasto-plastic objects in 3D with graph networks INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH Shi, H., Xu, H., Huang, Z., Li, Y., Wu, J. 2023
  • Editing Motion Graphics Video via Motion Vectorization and Transformation ACM TRANSACTIONS ON GRAPHICS Zhang, S., Ma, J., Wu, J., Ritchie, D., Agrawala, M. 2023; 42 (6)

    DOI: 10.1145/3618316

    Web of Science ID: 001139790400057

  • Fluid Simulation on Neural Flow Maps ACM TRANSACTIONS ON GRAPHICS Deng, Y., Yu, H., Zhang, D., Wu, J., Zhu, B. 2023; 42 (6)

    DOI: 10.1145/3618392

    Web of Science ID: 001139790400076

  • Object Motion Guided Human Motion Synthesis ACM TRANSACTIONS ON GRAPHICS Li, J., Wu, J., Liu, C. 2023; 42 (6)

    DOI: 10.1145/3618333

    Web of Science ID: 001139790400025

  • Differentiable Physics Simulation of Dynamics-Augmented Neural Objects IEEE ROBOTICS AND AUTOMATION LETTERS Le Cleac'h, S., Yu, H., Guo, M., Howell, T., Gao, R., Wu, J., Manchester, Z., Schwager, M. 2023; 8 (5): 2780-2787
  • Rendering Humans from Object-Occluded Monocular Videos Xiang, T., Sun, A., Wu, J., Adeli, E., Fei-Fei, L., IEEE IEEE COMPUTER SOC. 2023: 3216-3227
  • Learning Rational Subgoals from Demonstrations and Instructions Luo, Z., Mao, J., Wu, J., Lozano-Perez, T., Tenenbaum, J. B., Kaelbling, L. edited by Williams, B., Chen, Y., Neville, J. ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2023: 12068-12078
  • Benchmarking Rigid Body Contact Models Guo, M., Jiang, Y., Spielberg, A., Wu, J., Liu, K. edited by Pappas, G. J., Matni, N., Morari, M. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
  • Can Visual Scratchpads With Diagrammatic Abstractions Augment LLM Reasoning? Hsu, J., Poesia, G., Wu, J., Goodman, N. D. edited by Antoran, J., Blaas, A., Buchanan, K., Feng, F., Fortuin, Ghalebikesabi, S., Kriegler, A., Mason, Rohde, D., Ruiz, F. J., Uelwer, T., Xie, Y., Yang, R. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023: 21-28
  • Compositional Diffusion-Based Continuous Constraint Solvers Yang, Z., Mao, J., Du, Y., Wu, J., Tenenbaum, J. B., Lozano-Perez, T., Kaelbling, L. edited by Tan, J., Toussaint, M., Darvish, K. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
  • Composable Part-Based Manipulation Liu, W., Mao, J., Hsu, J., Hermans, T., Garg, A., Wu, J. edited by Tan, J., Toussaint, M., Darvish, K. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
  • Inferring Hybrid Neural Fluid Fields from Videos Yu, H., Zheng, Y., Gao, Y., Deng, Y., Zhu, B., Wu, J. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • Disentanglement via Latent Quantization Hsu, K., Dorrell, W., Whittington, J. C. R., Wu, J., Finn, C. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • Ego-Body Pose Estimation via Ego-Head Pose Estimation Li, J., Liu, C., Wu, J., IEEE IEEE COMPUTER SOC. 2023: 17142-17151
  • Stanford-ORB: A Real-World 3D Object Inverse Rendering Benchmark Kuang, Z., Zhang, Y., Yu, H., Agarwala, S., Wu, S., Wu, J. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • What's Left? Concept Grounding with Logic-Enhanced Foundation Models Hsu, J., Mao, J., Tenenbaum, J. B., Wu, J. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations Hsu, J., Mao, J., Wu, J., IEEE IEEE COMPUTER SOC. 2023: 2614-2623
  • ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding Xue, L., Gao, M., Xing, C., Martin-Martin, R., Wu, J., Xiong, C., Xu, R., Niebles, J., Savarese, S., IEEE IEEE COMPUTER SOC. 2023: 1179-1189
  • 3D Copy-Paste: Physically Plausible Object Insertion for Monocular 3D Detection Ge, Y., Yu, H., Zhao, C., Guo, Y., Huang, X., Ren, L., Itti, L., Wu, J. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • Accidental Light Probes Yu, H., Agarwala, S., Herrmann, C., Szeliski, R., Snavely, N., Wu, J., Sun, D., IEEE IEEE COMPUTER SOC. 2023: 12521-12530
  • Multi-Object Manipulation via Object-Centric Neural Scattering Functions Tian, S., Cai, Y., Yu, H., Zakharov, S., Liu, K., Gaidon, A., Li, Y., Wu, J., IEEE IEEE COMPUTER SOC. 2023: 9021-9031
  • PyPose: A Library for Robot Learning with Physics-based Optimization Wang, C., Gao, D., Xu, K., Geng, J., Hu, Y., Qiu, Y., Li, B., Yang, F., Moon, B., Pandey, A., Aryan, Xu, J., Wu, T., He, H., Huang, D., Ren, Z., Zhao, S., Fu, T., Reddy, P., Lin, X., Wang, W., Shi, J., Talak, R., Cao, K., Du, Y., Wang, H., Yu, H., Wang, S., Chen, S., Kashyap, A., Bandaru, R., Dantu, K., Wu, J., Xie, L., Carlone, L., Hutter, M., Scherer, S., IEEE IEEE COMPUTER SOC. 2023: 22024-22034
  • Tree-Structured Shading Decomposition Geng, C., Yu, H., Zhang, S., Agrawala, M., Wu, J., IEEE IEEE COMPUTER SOC. 2023: 488-498
  • Model-Based Control with Sparse Neural Dynamics Liu, Z., Zhou, G., He, J., Marcucci, T., Fei-Fei, L., Wu, J., Li, Y. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • Learning to Design and Use Tools for Robotic Manipulation Liu, Z., Tian, S., Guo, M., Liu, C., Wu, J. edited by Tan, J., Toussaint, M., Darvish, K. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
  • VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models Huang, W., Wang, C., Zhang, R., Li, Y., Wu, J., Fei-Fei, L. edited by Tan, J., Toussaint, M., Darvish, K. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
  • NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities Zhang, R., Lee, S., Hwang, M., Hiranaka, A., Wang, C., Ai, W., Tan, J., Gupta, S., Hao, Y., Levine, G., Gao, R., Norcia, A., Fei-Fei, L., Wu, J. edited by Tan, J., Toussaint, M., Darvish, K. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
  • Siamese Masked Autoencoders Gupta, A., Wu, J., Deng, J., Fei-Fei, L. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • Learning Sequential Acquisition Policies for Robot-Assisted Feeding Sundaresan, P., Wu, J., Sadigh, D. edited by Tan, J., Toussaint, M., Darvish, K. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2023
  • Holistic Evaluation of Text-to-Image Models Lee, T., Yasunaga, M., Meng, C., Mai, Y., Park, J., Gupta, A., Zhang, Y., Narayanan, D., Teufel, H., Bellagente, M., Kang, M., Park, T., Leskovec, J., Zhu, J., Fei-Fei, L., Wu, J., Ermon, S., Liang, P. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • 3D Neural Field Generation using Triplane Diffusion Shue, J., Chan, E., Po, R., Ankner, Z., Wu, J., Wetzstein, G., IEEE IEEE COMPUTER SOC. 2023: 20875-20886
  • CIRCLE: Capture In Rich Contextual Environments Araujo, J., Li, J., Vetrivel, K., Agarwal, R., Wu, J., Gopinath, D., Clegg, A., Liu, C., IEEE IEEE COMPUTER SOC. 2023: 21211-21221
  • Primitive Skill-based Robot Learning from Human Evaluative Feedback Hiranaka, A., Hwang, M., Lee, S., Wang, C., Fei-Fei, L., Wu, J., Zhang, R., IEEE IEEE. 2023: 7817-7824
  • SOUNDCAM: A Dataset for Finding Humans Using Room Acoustics Wang, M., Clarke, S., Wang, J., Gao, R., Wu, J. edited by Oh, A., Neumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
  • STAP: Sequencing Task-Agnostic Policies Agia, C., Migimatsu, T., Wu, J., Bohg, J., IEEE IEEE. 2023: 7951-7958
  • The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects Gao, R., Dou, Y., Li, H., Agarwal, T., Bohg, J., Li, Y., Fei-Fei, L., Wu, J., IEEE IEEE COMPUTER SOC. 2023: 17276-17286
  • Task-Driven Graph Attention for Hierarchical Relational Object Navigation Lingelbach, M., Li, C., Hwang, M., Kurenkov, A., Lou, A., Martin-Martin, R., Zhang, R., Fei-Fei, L., Wu, J., IEEE IEEE. 2023: 886-893
  • SONICVERSE: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear Gao, R., Li, H., Dharan, G., Wang, Z., Li, C., Xia, F., Savarese, S., Fei-Fei, L., Wu, J., IEEE IEEE. 2023: 704-711
  • Putting People in Their Place: Affordance-Aware Human Insertion into Scenes Kulal, S., Brooks, T., Aiken, A., Wu, J., Yang, J., Lu, J., Efros, A. A., Singh, K., IEEE IEEE COMPUTER SOC. 2023: 17089-17099
  • VQ3D: Learning a 3D-Aware Generative Model on ImageNet Sargent, K., Koh, J., Zhang, H., Chang, H., Herrmann, C., Srinivasan, P., Wu, J., Sun, D., IEEE IEEE COMPUTER SOC. 2023: 4217-4227
  • Scene Synthesis from Human Motion Ye, S., Wang, Y., Li, J., Park, D., Liu, C., Xu, H., Wu, J. edited by Spencer, S. N. ASSOC COMPUTING MACHINERY. 2022
  • Unsupervised Segmentation in Real-World Images via Spelke Object Inference Chen, H., Venkatesh, R., Friedman, Y., Wu, J., Tenenbaum, J. B., Yamins, D. L. K., Bear, D. M. edited by Avidan, S., Brostow, G., Cisse, M., Farinella, G. M., Hassner, T. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 719-735
  • Rotationally Equivariant 3D Object Detection Yu, H., Wu, J., Yi, L., IEEE COMP SOC IEEE COMPUTER SOC. 2022: 1446-1454
  • Translating a Visual LEGO Manual to a Machine-Executable Plan Wang, R., Zhang, Y., Mao, J., Cheng, C., Wu, J. edited by Avidan, S., Brostow, G., Cisse, M., Farinella, G. M., Hassner, T. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 677-694
  • Video Extrapolation in Space and Time Zhang, Y., Wu, J. edited by Avidan, S., Brostow, G., Cisse, M., Farinella, G. M., Hassner, T. SPRINGER INTERNATIONAL PUBLISHING AG. 2022: 313-333
  • Programmatic Concept Learning for Human Motion Description and Synthesis Kulal, S., Mao, J., Aiken, A., Wu, J., IEEE COMP SOC IEEE COMPUTER SOC. 2022: 13833-13842
  • RoboCraft: Learning to See, Simulate, and Shape Elasto-Plastic Objects with Graph Networks Shi, H., Xu, H., Huang, Z., Li, Y., Wu, J. edited by Hauser, K., Shell, D., Huang, S. RSS FOUNDATION-ROBOTICS SCIENCE & SYSTEMS FOUNDATION. 2022
  • Revisiting the "Video" in Video-Language Understanding Buch, S., Eyzaguirre, C., Gaidon, A., Wu, J., Fei-Fei, L., Niebles, J., IEEE COMP SOC IEEE COMPUTER SOC. 2022: 2907-2917
  • Repopulating Street Scenes Wang, Y., Liu, A., Tucker, R., Wu, J., Curless, B. L., Seitz, S. M., Snavely, N., IEEE COMP SOC IEEE COMPUTER SOC. 2021: 5106-5115
  • Neural Radiance Flow for 4D View Synthesis and Video Processing Du, Y., Zhang, Y., Yu, H., Tenenbaum, J. B., Wu, J., IEEE IEEE. 2021: 14304-14314
  • Learning Temporal Dynamics from Cycles in Narrated Video Epstein, D., Wu, J., Schmid, C., Sun, C., IEEE IEEE. 2021: 1460-1469
  • Augmenting Policy Learning with Routines Discovered from a Single Demonstration Zhao, Z., Gan, C., Wu, J., Guo, X., Tenenbaum, J., Assoc Advancement Artificial Intelligence ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2021: 11024-11032
  • KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control Jakab, T., Tucker, R., Makadia, A., Wu, J., Snavely, N., Kanazawa, A., IEEE COMP SOC IEEE COMPUTER SOC. 2021: 12778-12787
  • De-rendering the World's Revolutionary Artefacts Wu, S., Makadia, A., Wu, J., Snavely, N., Tucker, R., Kanazawa, A., IEEE COMP SOC IEEE COMPUTER SOC. 2021: 6334-6343
  • Hierarchical Motion Understanding via Motion Programs Kulal, S., Mao, J., Aiken, A., Wu, J., IEEE COMP SOC IEEE COMPUTER SOC. 2021: 6564-6572
  • pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis Chan, E. R., Monteiro, M., Kellnhofer, P., Wu, J., Wetzstein, G., IEEE COMP SOC IEEE COMPUTER SOC. 2021: 5795-5805
  • Learning Generative Models of 3D Structures Chaudhuri, S., Ritchie, D., Wu, J., Xu, K., Zhang, H. WILEY. 2020: 643–66

    DOI: 10.1111/cgf.14020

    Web of Science ID: 000548709600052

  • End-to-End Optimization of Scene Layout Luo, A., Zhang, Z., Wu, J., Tenenbaum, J. B., IEEE IEEE. 2020: 3753–62
  • DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs Wang, Y., Liu, B., Wu, J., Zhu, Y., Du, S. S., Fei-Fei, L., Tenenbaum, J. B. edited by Bessiere, C. IJCAI-INT JOINT CONF ARTIF INTELL. 2020: 4190-4198
  • Accurate Vision-based Manipulation through Contact Reasoning Kloss, A., Bauza, M., Wu, J., Tenenbaum, J. B., Rodriguez, A., Bohg, J., IEEE IEEE. 2020: 6738-6744
  • Visual Grounding of Learned Physical Models Li, Y., Lin, T., Yi, K., Bear, D. M., Yamins, D. L. K., Wu, J., Tenenbaum, J. B., Torralba, A. edited by Daume, H., Singh, A. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2020
  • Perspective Plane Program Induction from a Single Image Li, Y., Mao, J., Zhang, X., Freeman, W. T., Tenenbaum, J. B., Wu, J., IEEE IEEE. 2020: 4433–42
  • Video Enhancement with Task-Oriented Flow INTERNATIONAL JOURNAL OF COMPUTER VISION Xue, T., Chen, B., Wu, J., Wei, D., Freeman, W. T. 2019; 127 (8): 1106–25
  • An integrative computational architecture for object-driven cortex CURRENT OPINION IN NEUROBIOLOGY Yildirim, I., Wu, J., Kanwisher, N., Tenenbaum, J. 2019; 55: 73–81

    Abstract

    Objects in motion activate multiple cortical regions in every lobe of the human brain. Do these regions represent a collection of independent systems, or is there an overarching functional architecture spanning all of object-driven cortex? Inspired by recent work in artificial intelligence (AI), machine learning, and cognitive science, we consider the hypothesis that these regions can be understood as a coherent network implementing an integrative computational system that unifies the functions needed to perceive, predict, reason about, and plan with physical objects, as in the paradigmatic case of using or making tools. Our proposal draws on a modeling framework that combines multiple AI methods, including causal generative models, hybrid symbolic-continuous planning algorithms, and neural recognition networks, with object-centric, physics-based representations. We review evidence relating specific components of our proposal to the specific regions that comprise object-driven cortex, and lay out future research directions with the goal of building a complete functional and mechanistic account of this system.

    DOI: 10.1016/j.conb.2019.01.010

    Web of Science ID: 000472127600011

    PubMedID: 30825704

    PubMedCentralID: PMC6548583

  • See, feel, act: Hierarchical learning for complex manipulation skills with multisensory fusion SCIENCE ROBOTICS Fazeli, N., Oller, M., Wu, J., Wu, Z., Tenenbaum, J. B., Rodriguez, A. 2019; 4 (26)
  • Visual Concept-Metaconcept Learning Han, C., Mao, J., Gan, C., Tenenbaum, J. B., Wu, J. edited by Wallach, H., Larochelle, H., Beygelzimer, A., d'Alche-Buc, F., Fox, E., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
  • Combining Physical Simulators and Object-Based Networks for Control Ajay, A., Bauza, M., Wu, J., Fazeli, N., Tenenbaum, J. B., Rodriguez, A., Kaelbling, L. P., IEEE edited by Howard, A., Althoefer, K., Arai, F., Arrichiello, F., Caputo, B., Castellanos, J., Hauser, K., Isler, Kim, J., Liu, H., Oh, P., Santos, Scaramuzza, D., Ude, A., Voyles, R., Yamane, K., Okamura, A. IEEE. 2019: 3217–23
  • Propagation Networks for Model-Based Control Under Partial Observation Li, Y., Wu, J., Zhu, J., Tenenbaum, J. B., Torralba, A., Tedrake, R., IEEE edited by Howard, A., Althoefer, K., Arai, F., Arrichiello, F., Caputo, B., Castellanos, J., Hauser, K., Isler, Kim, J., Liu, H., Oh, P., Santos, Scaramuzza, D., Ude, A., Voyles, R., Yamane, K., Okamura, A. IEEE. 2019: 1205–11
  • ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics Hu, Y., Liu, J., Spielberg, A., Tenenbaum, J. B., Freeman, W. T., Wu, J., Rus, D., Matusik, W., IEEE edited by Howard, A., Althoefer, K., Arai, F., Arrichiello, F., Caputo, B., Castellanos, J., Hauser, K., Isler, Kim, J., Liu, H., Oh, P., Santos, Scaramuzza, D., Ude, A., Voyles, R., Yamane, K., Okamura, A. IEEE. 2019: 6265–71
  • Program-Guided Image Manipulators Mao, J., Zhang, X., Li, Y., Freeman, W. T., Tenenbaum, J. B., Wu, J., IEEE IEEE COMPUTER SOC. 2019: 4029–38
  • Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations Smith, K. A., Mei, L., Yao, S., Wu, J., Spelke, E., Tenenbaum, J. B., Ullman, T. D. edited by Wallach, H., Larochelle, H., Beygelzimer, A., d'Alche-Buc, F., Fox, E., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
  • Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning INTERNATIONAL JOURNAL OF COMPUTER VISION Owens, A., Wu, J., McDermott, J. H., Freeman, W. T., Torralba, A. 2018; 126 (10): 1120–37
  • 3D Interpreter Networks for Viewer-Centered Wireframe Modeling INTERNATIONAL JOURNAL OF COMPUTER VISION Wu, J., Xue, T., Lim, J. J., Tian, Y., Tenenbaum, J. B., Torralba, A., Freeman, W. T. 2018; 126 (9): 1009–26
  • Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing Ajay, A., Wu, J., Fazeli, N., Bauza, M., Kaelbling, L. P., Tenenbaum, J. B., Rodriguez, A., Kosecka, J. edited by Maciejewski, A. A., Okamura, A., Bicchi, A., Stachniss, C., Song, D. Z., Lee, D. H., Chaumette, F., Ding, H., Li, J. S., Wen, J., Roberts, J., Masamune, K., Chong, N. Y., Amato, N., Tsagwarakis, N., Rocco, P., Asfour, T., Chung, W. K., Yasuyoshi, Y., Sun, Y., Maciekeski, T., Althoefer, K., AndradeCetto, J., Chung, W. K., Demircan, E., Dias, J., Fraisse, P., Gross, R., Harada, H., Hasegawa, Y., Hayashibe, M., Kiguchi, K., Kim, K., Kroeger, T., Li, Y., Ma, S., Mochiyama, H., Monje, C. A., Rekleitis, Roberts, R., Stulp, F., Tsai, C. H., Zollo, L. IEEE. 2018: 3066–73
  • Unsupervised Learning of Latent Physical Properties Using Perception-Prediction Networks Zheng, D., Luo, V., Wu, J., Tenenbaum, J. B. edited by Globerson, A., Silva, R. AUAI PRESS. 2018: 497–507
  • Physical Primitive Decomposition Liu, Z., Freeman, W. T., Tenenbaum, J. B., Wu, J. edited by Ferrari, Hebert, M., Sminchisescu, C., Weiss, Y. SPRINGER INTERNATIONAL PUBLISHING AG. 2018: 3–20
  • Seeing Tree Structure from Vibration Xue, T., Wu, J., Zhang, Z., Zhang, C., Tenenbaum, J. B., Freeman, W. T. edited by Ferrari, Hebert, M., Sminchisescu, C., Weiss, Y. SPRINGER INTERNATIONAL PUBLISHING AG. 2018: 762–79
  • MoSculp: Interactive Visualization of Shape and Time Zhang, X., Dekel, T., Xue, T., Owens, A., He, Q., Wu, J., Mueller, S., Freeman, W. T., Assoc Comp Machinery ASSOC COMPUTING MACHINERY. 2018: 275–85
  • Learning to Reconstruct Shapes from Unseen Classes Zhang, X., Zhang, Z., Zhang, C., Tenenbaum, J. B., Freeman, W. T., Wu, J. edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification Long, X., Gan, C., de Melo, G., Wu, J., Liu, X., Wen, S., IEEE IEEE. 2018: 7834–43
  • Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling Sun, X., Wu, J., Zhang, X., Zhang, Z., Zhang, C., Xue, T., Tenenbaum, J. B., Freeman, W. T., IEEE IEEE. 2018: 2974–83
  • 3D Shape Perception from Monocular Vision, Touch, and Shape Priors Wang, S., Wu, J., Sun, X., Yuan, W., Freeman, W. T., Tenenbaum, J. B., Adelson, E. H., Kosecka, J. edited by Maciejewski, A. A., Okamura, A., Bicchi, A., Stachniss, C., Song, D. Z., Lee, D. H., Chaumette, F., Ding, H., Li, J. S., Wen, J., Roberts, J., Masamune, K., Chong, N. Y., Amato, N., Tsagwarakis, N., Rocco, P., Asfour, T., Chung, W. K., Yasuyoshi, Y., Sun, Y., Maciekeski, T., Althoefer, K., AndradeCetto, J., Chung, W. K., Demircan, E., Dias, J., Fraisse, P., Gross, R., Harada, H., Hasegawa, Y., Hayashibe, M., Kiguchi, K., Kim, K., Kroeger, T., Li, Y., Ma, S., Mochiyama, H., Monje, C. A., Rekleitis, Roberts, R., Stulp, F., Tsai, C. H., Zollo, L. IEEE. 2018: 1606–13
  • 3D-Aware Scene Manipulation via Inverse Graphics Yao, S., Hsu, T., Zhu, J., Wu, J., Torralba, A., Freeman, W. T., Tenenbaum, J. B. edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • Visual Object Networks: Image Generation with Disentangled 3D Representation Zhu, J., Zhang, Z., Zhang, C., Wu, J., Torralba, A., Tenenbaum, J. B., Freeman, W. T. edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • Learning to Exploit Stability for 3D Scene Parsing Du, Y., Liu, Z., Basevi, H., Leonardis, A., Freeman, W. T., Tenenbaum, J. B., Wu, J. edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding Yi, K., Wu, J., Gan, C., Torralba, A., Kohli, P., Tenenbaum, J. B. edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., CesaBianchi, N., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
  • Shape and Material from Sound Zhang, Z., Li, Q., Huang, Z., Wu, J., Tenenbaum, J. B., Freeman, W. T. edited by Guyon, Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
  • MarrNet: 3D Shape Reconstruction via 2.5D Sketches Wu, J., Wang, Y., Xue, T., Sun, X., Freeman, W. T., Tenenbaum, J. B. edited by Guyon, Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
  • Synthesizing 3D Shapes via Modeling Multi-View Depth Maps and Silhouettes with Deep Generative Networks Soltani, A., Huang, H., Wu, J., Kulkarni, T. D., Tenenbaum, J. B., IEEE IEEE. 2017: 2511–19
  • Neural Scene De-rendering Wu, J., Tenenbaum, J. B., Kohli, P., IEEE IEEE. 2017: 7035–43
  • Raster-to-Vector: Revisiting Floorplan Transformation Liu, C., Wu, J., Kohli, P., Furukawa, Y., IEEE IEEE. 2017: 2214–22
  • Generative Modeling of Audible Shapes for Object Perception Zhang, Z., Wu, J., Li, Q., Huang, Z., Traer, J., McDermott, J. H., Tenenbaum, J. B., Freeman, W. T., IEEE IEEE. 2017: 1260–69
  • Self-Supervised Intrinsic Image Decomposition Janner, M., Wu, J., Kulkarni, T. D., Yildirim, I., Tenenbaum, J. B. edited by Guyon, Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
  • Learning to See Physics via Visual De-animation Wu, J., Lu, E., Kohli, P., Freeman, W. T., Tenenbaum, J. B. edited by Guyon, Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
  • Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks Xue, T., Wu, J., Bouman, K. L., Freeman, W. T. edited by Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, Garnett, R. NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2016
  • Ambient Sound Provides Supervision for Visual Learning Owens, A., Wu, J., McDermott, J. H., Freeman, W. T., Torralba, A. edited by Leibe, B., Matas, J., Sebe, N., Welling, M. SPRINGER INTERNATIONAL PUBLISHING AG. 2016: 801–16
  • Single Image 3D Interpreter Network Wu, J., Xue, T., Lim, J. J., Tian, Y., Tenenbaum, J. B., Torralba, A., Freeman, W. T. edited by Leibe, B., Matas, J., Sebe, N., Welling, M. SPRINGER INT PUBLISHING AG. 2016: 365–82
  • Unsupervised Object Class Discovery via Saliency-Guided Multiple Class Learning IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE Zhu, J., Wu, J., Xu, Y., Chang, E., Tu, Z. 2015; 37 (4): 862–75

    Abstract

    In this paper, we tackle the problem of common object (multiple classes) discovery from a set of input images, where we assume the presence of one object class in each image. This problem is, loosely speaking, unsupervised since we do not know a priori the object type, location, or scale in each image. We observe that the general task of object class discovery in a fully unsupervised manner is intrinsically ambiguous; here we adopt saliency detection to propose candidate image windows/patches to turn an unsupervised learning problem into a weakly-supervised learning problem. In the paper, we propose an algorithm for simultaneously localizing objects and discovering object classes via bottom-up (saliency-guided) multiple class learning (bMCL). Our contributions are three-fold: (1) we adopt saliency detection to convert unsupervised learning into multiple instance learning, formulated as bottom-up multiple class learning (bMCL); (2) we propose an integrated framework that simultaneously performs object localization, object class discovery, and object detector training; (3) we demonstrate that our framework yields significant improvements over existing methods for multi-class object discovery and possesses evident advantages over competing methods in computer vision. In addition, although saliency detection has recently attracted much attention, its practical usage for high-level vision tasks has yet to be justified. Our method validates the usefulness of saliency detection to output "noisy input" for a top-down method to extract common patterns.

    DOI: 10.1109/TPAMI.2014.2353617

    Web of Science ID: 000351213400012

    PubMedID: 26353299

  • Deep Multiple Instance Learning for Image Classification and Auto-Annotation Wu, J., Yu, Y., Huang, C., Yu, K., IEEE IEEE. 2015: 3460–69
  • MILCut: A Sweeping Line Multiple Instance Learning Paradigm for Interactive Image Segmentation Wu, J., Zhao, Y., Zhu, J., Luo, S., Tu, Z., IEEE IEEE. 2014: 256–63
  • Harvesting Mid-level Visual Concepts from Large-scale Internet Images Li, Q., Wu, J., Tu, Z., IEEE IEEE. 2013: 851–58
  • A classification approach to coreference in discharge summaries: 2011 i2b2 challenge JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION Xu, Y., Liu, J., Wu, J., Wang, Y., Tu, Z., Sun, J., Tsujii, J., Chang, E. 2012; 19 (5): 897–905

    Abstract

    To create a highly accurate coreference system in discharge summaries for the 2011 i2b2 challenge. The coreference categories include Person, Problem, Treatment, and Test. An integrated coreference resolution system was developed by exploiting Person attributes, contextual semantic clues, and world knowledge. It includes three subsystems: a Person coreference system based on three Person attributes, a Problem/Treatment/Test system based on numerous contextual semantic extractors and world knowledge, and a Pronoun system based on a multi-class support vector machine classifier. The three Person attributes are patient, relative, and hospital personnel. Contextual semantic extractors include anatomy, position, medication, indicator, temporal, spatial, section, modifier, equipment, operation, and assertion. The world knowledge is extracted from external resources such as Wikipedia. Micro-averaged precision, recall, and F-measure in MUC, BCubed, and CEAF were used to evaluate results. The system achieved an overall micro-averaged precision, recall, and F-measure of 0.906, 0.925, and 0.915, respectively, on test data (from four hospitals) released by the challenge organizers. It achieved a precision, recall, and F-measure of 0.905, 0.920, and 0.913, respectively, on test data without Pittsburgh data. We ranked first out of 20 competing teams. Among the four sub-tasks on Person, Problem, Treatment, and Test, the highest F-measure was seen for Person coreference. This system achieved encouraging results. The Person system can determine whether personal pronouns and proper names are coreferent or not. The Problem/Treatment/Test system benefits from both world knowledge in evaluating the similarity of two mentions and contextual semantic extractors in identifying semantic clues. The Pronoun system can automatically detect whether a Pronoun mention is coreferent to that of the other four types. This study demonstrates that it is feasible to accomplish the coreference task in discharge summaries.

    DOI: 10.1136/amiajnl-2011-000734

    Web of Science ID: 000307934600030

    PubMedID: 22505762

    PubMedCentralID: PMC3422828

  • Unsupervised Object Class Discovery via Saliency-Guided Multiple Class Learning Zhu, J., Wu, J., Wei, Y., Chang, E., Tu, Z., IEEE IEEE. 2012: 3218–25