Bio


Anyi Rao is a Postdoctoral Scholar at Stanford. He studies reliable human-centered AI for creativity and film, focusing on intelligent media editing and creation and on semantic and cinematic analysis. His aim is to connect AI and humans for collaborative intelligence and to unleash human creativity and productivity. His works include ControlNet, AnimateDiff, MovieNet, Virtual Studio, Shoot360, and CityNeRF, and his research has received a Marr Prize (ICCV best paper award). He has played a leading role in organizing the Creative Video Editing and Understanding Workshop at CVPR24 and ICCV23, the Generative Models Course at SIGGRAPH24, and the 2023 Paris AI Short Film Festival. He has research experience at Meta Reality Lab, Vector Institute, University of Toronto, and Hong Kong University. He received his Ph.D. from MMLab at the Chinese University of Hong Kong in 2022.

Honors & Awards


  • Marr Prize (Best Paper Award), ICCV (2023)
  • Magic Grant, Brown Institute (2023)
  • Research Funding by Prime Video, Amazon (2023)
  • Grant for Organizing ICCV23 Creative Video Editing and Understanding Workshop, Pika, KAUST (2023)
  • Grant for Organizing ECCV22 Creative Video Editing and Understanding Workshop, KAUST (2022)
  • Grant for Organizing ICCV21 Creative Video Editing and Understanding Workshop, Adobe (2021)
  • Most Influential Papers, Paper Digest (2021)

Boards, Advisory Committees, Professional Organizations


  • Leading Organizer, SIGGRAPH Course on Generative Models for Visual Content Editing and Creation (2024)
  • Leading/Key Organizer, CVPR 2024/ICCV 2023/ECCV 2022/ICCV 2021 Workshop on AI for Creative Video Editing and Understanding (2021 - Present)
  • Founder, Virtual Film Studio https://virtualfilmstudio.github.io/ (2023 - Present)
  • Co-Founder, City-Super https://city-super.github.io/ (2021 - Present)
  • Co-Founder, MovieNet https://movienet.github.io/ (2020 - Present)
  • Program Committee Member and Reviewer, CVPR, ICCV, ECCV, ACCV, SIGGRAPH, SIGGRAPH Asia, CHI, UIST, MM, NeurIPS, ICML, ICLR, AAAI, IJCAI (2021 - Present)
  • Journal Reviewer, TPAMI, TVCG, TMM, TCSVT, IJCV (2021 - Present)

Current Research and Scholarly Interests


Human AI for Creativity, Computer Vision, Graphics, Human-Computer Interaction

All Publications


  • HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and Regime-Switch VAE. International Joint Conference on Artificial Intelligence (IJCAI). Wei, Z., Rao, A., Dai, B., Lin, D. 2023
  • Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences. The AAAI Conference on Artificial Intelligence. Zhou, Y., Duan, H., Rao, A., Su, B., Wang, J. 2023

    DOI: 10.1609/aaai.v37i3.25495

  • A Coarse-to-Fine Framework for Automatic Video Unscreen. IEEE Transactions on Multimedia (TMM). Rao, A., Xu, L., Li, Z., Huang, Q., Kuang, Z., Zhang, W., Lin, D. 2022

    DOI: 10.1109/TMM.2022.3150177

  • AutoGPart: Intermediate Supervision Search for Generalizable 3D Part Segmentation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Liu, X., Xu, X., Rao, A., Gan, C., Yi, L. 2022
  • BlockPlanner: City Block Generation with Vectorized Graph Representation. IEEE/CVF International Conference on Computer Vision (ICCV). Xu, L., Xiangli, Y., Rao, A., Zhao, N., Dai, B., Liu, Z., Lin, D. 2021
  • Jointly Learning the Attributes and Composition of Shots for Boundary Detection in Videos. IEEE Transactions on Multimedia (TMM). Jiang, X., Jin, L., Rao, A., Xu, L., Lin, D. 2021

    DOI: 10.1109/tmm.2021.3092143

  • A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Rao, A., Xu, L., Xiong, Y., Xu, G., Huang, Q., Zhou, B., Lin, D. 2020
  • A Unified Framework for Shot Type Classification Based on Subject Centric Lens. European Conference on Computer Vision (ECCV). Rao, A., Wang, J., Xu, L., Jiang, X., Huang, Q., Zhou, B., Lin, D. 2020
  • Online Multi-modal Person Search in Videos. European Conference on Computer Vision (ECCV). Xia, J., Rao, A., Huang, Q., Wen, J., Lin, D. 2020