Anyi Rao is a Postdoctoral Scholar at Stanford, working with Maneesh Agrawala. He has research experience at Meta Reality Lab, Vector Institute, the University of Toronto, and Hong Kong University. He received his Ph.D. from MMLab at the Chinese University of Hong Kong in 2022, advised by Dahua Lin and Bolei Zhou. He studies human-centered AI for creativity, multimodality, and film, focusing on content generation, intelligent media editing and creation, and semantic and cinematic analysis. His aim is to build connections between AI and humans for collaborative intelligence and to unleash human creativity and productivity. His works include ControlNet, AnimateDiff, MovieNet, Virtual Dynamic Storyboard, Shoot360, and CityNeRF.

Honors & Awards

  • Marr Prize (Best Paper Award), ICCV (2023)
  • Magic Grant, Brown Institute (2023)
  • Research Funding by Prime Video, Amazon (2023)
  • Grant for Organizing ICCV23 Creative Video Editing and Understanding Workshop, Pika, KAUST (2023)
  • Grant for Organizing ECCV22 Creative Video Editing and Understanding Workshop, KAUST (2022)
  • Grant for Organizing ICCV21 Creative Video Editing and Understanding Workshop, Adobe (2021)
  • Most Influential Papers, Paper Digest (2021)

Boards, Advisory Committees, Professional Organizations

  • Program Committee Member and Reviewer, CVPR, ICCV, ECCV, ACCV, SIGGRAPH, SIGGRAPH Asia, CHI, UIST, MM, NeurIPS, ICML, ICLR, AAAI, IJCAI (2021 - Present)
  • Leading/Key Organizer, CVPR2024/ICCV2023/ECCV2022/ICCV2021 Workshop AI for Creative Video Editing and Understanding (2021 - Present)
  • Founder, Virtual Film Studio (2023 - Present)
  • Co-Founder, City-Super (2021 - Present)
  • Co-Founder, MovieNet (2020 - Present)
  • Journal Reviewer, IEEE Transactions on Multimedia, IEEE Transactions on Visualization and Computer Graphics, IEEE Transactions on Circuits and Systems for Video Technology, International Journal of Computer Vision (2021 - Present)

Current Research and Scholarly Interests

Human AI for Creativity, Computer Vision, Graphics, Human-Computer Interaction

All Publications

  • HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and Regime-Switch VAE. International Joint Conference on Artificial Intelligence (IJCAI). Wei, Z., Rao, A., Dai, B., Lin, D. 2023
  • Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences. The AAAI Conference on Artificial Intelligence. Zhou, Y., Duan, H., Rao, A., Su, B., Wang, J. 2023

    View details for DOI 10.1609/aaai.v37i3.25495

  • A Coarse-to-Fine Framework for Automatic Video Unscreen. IEEE Transactions on Multimedia (TMM). Rao, A., Xu, L., Li, Z., Huang, Q., Kuang, Z., Zhang, W., Lin, D. 2022

    View details for DOI 10.1109/TMM.2022.3150177

  • AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Liu, X., Xu, X., Rao, A., Gan, C., Yi, L. 2022
  • BlockPlanner: City Block Generation with Vectorized Graph Representation. IEEE/CVF International Conference on Computer Vision (ICCV). Xu, L., Xiangli, Y., Rao, A., Zhao, N., Dai, B., Liu, Z., Lin, D. 2021
  • Jointly Learning the Attributes and Composition of Shots for Boundary Detection in Videos. IEEE Transactions on Multimedia (TMM). Jiang, X., Jin, L., Rao, A., Xu, L., Lin, D. 2021

    View details for DOI 10.1109/tmm.2021.3092143

  • A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Rao, A., Xu, L., Xiong, Y., Xu, G., Huang, Q., Zhou, B., Lin, D. 2020
  • A Unified Framework for Shot Type Classification Based on Subject Centric Lens. European Conference on Computer Vision (ECCV). Rao, A., Wang, J., Xu, L., Jiang, X., Huang, Q., Zhou, B., Lin, D. 2020
  • Online Multi-modal Person Search in Videos. European Conference on Computer Vision (ECCV). Xia, J., Rao, A., Huang, Q., Wen, J., Lin, D. 2020