All Publications


  • BossNAS Family: Block-Wisely Self-Supervised Neural Architecture Search IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE Li, C., Lin, S., Tang, T., Wang, G., Li, M., Liang, X., Chang, X. 2025; 47 (5): 3500-3514

    Abstract

    Recent advances in hand-crafted neural architectures for visual recognition underscore the pressing need to explore architecture designs comprising diverse building blocks. Concurrently, neural architecture search (NAS) methods have gained traction as a means to alleviate human effort. Nevertheless, whether NAS methods can efficiently and effectively manage diversified search spaces featuring disparate candidates, such as Convolutional Neural Networks (CNNs) and transformers, remains an open question. In this work, we introduce a novel unsupervised NAS approach called BossNAS (Block-wisely Self-supervised Neural Architecture Search), which aims to address the problem of inaccurate predictive architecture ranking caused by a large weight-sharing space while mitigating potential ranking issues caused by biased supervision. To achieve this, we factorize the search space into blocks and introduce a novel self-supervised training scheme, called Ensemble Bootstrapping, to train each block separately in an unsupervised manner. In the search phase, we propose an unsupervised Population-Centric Search that optimizes candidate architectures towards the population center. Additionally, we enhance our NAS method by integrating masked image modeling and present BossNAS++ to overcome the lack of dense supervision in our block-wise self-supervised NAS. In BossNAS++, we introduce a training technique named Masked Ensemble Bootstrapping for the block-wise supernet, accompanied by a Masked Population-Centric Search scheme to promote fairer architecture selection. Our family of models, discovered through BossNAS and BossNAS++, delivers impressive results across various search spaces and datasets. Our transformer model discovered by BossNAS++ attains a remarkable accuracy of 83.2% on ImageNet with only 10.5B MAdds, surpassing DeiT-B by 1.4% while maintaining a lower computation cost. Moreover, our approach excels in architecture rating accuracy, achieving Spearman correlations of 0.78 and 0.76 on the canonical MBConv search space with ImageNet and the NATS-Bench size search space with CIFAR-100, respectively, outperforming state-of-the-art NAS methods.
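
    (A rough, hypothetical code sketch of the block-wise ensemble-bootstrapping training and population-centric search described above appears after the publication list below.)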

    DOI: 10.1109/TPAMI.2025.3529517

    Web of Science ID: 001465416300004

    PubMedID: 40031006

  • No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling Xu, X., Li, C., Chen, Y., Chang, X., Liu, J., Wang, S. edited by Liu, T., Yue, L., Webb, G., Wang, D. SPRINGER-VERLAG SINGAPORE PTE LTD. 2024: 28-41
  • Automated Progressive Learning for Efficient Training of Vision Transformers Li, C., Zhuang, B., Wang, G., Liang, X., Chang, X., Yang, Y. IEEE COMPUTER SOC. 2022: 12476-12486
  • Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift Peng, J., Zhang, J., Li, C., Wang, G., Liang, X., Lin, L. IEEE. 2021: 12334-12344
  • BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search Li, C., Tang, T., Wang, G., Peng, J., Wang, B., Liang, X., Chang, X. IEEE. 2021: 12261-12271
  • Dynamic Slimmable Network Li, C., Wang, G., Wang, B., Liang, X., Li, Z., Chang, X. IEEE COMPUTER SOC. 2021: 8603-8613
  • Block-wisely Supervised Neural Architecture Search with Knowledge Distillation Li, C., Peng, J., Yuan, L., Wang, G., Liang, X., Lin, L., Chang, X. IEEE COMPUTER SOC. 2020: 1986-1995
  • Knowledge driven temporal activity localization JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION Li, C., Li, Z., Ge, Z., Li, M. 2019; 64
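

Illustrative Code Sketch

The BossNAS abstract above describes two mechanisms: block-wise self-supervised supernet training with an ensemble-bootstrapping teacher, and a population-centric search that ranks candidate architectures by their closeness to the population center. The PyTorch sketch below is a rough, hypothetical illustration of those two ideas, not the authors' released implementation: the block callables (online_block, target_block), the sampled candidate paths, the BYOL-style cosine loss, the EMA update, and the scoring rule are all simplifying assumptions chosen for readability.

import torch
import torch.nn.functional as F


def ensemble_bootstrapping_step(online_block, target_block, paths, x, optimizer):
    """One hypothetical training step for a single supernet block.

    online_block / target_block: assumed weight-sharing blocks mapping (x, path)
    to a feature tensor; target_block is treated as an EMA copy of online_block.
    paths: a list of sampled candidate sub-architectures (the "ensemble").
    """
    with torch.no_grad():
        # Ensemble teacher: average the target network's features over all
        # sampled candidates, then L2-normalize.
        teacher = torch.stack([target_block(x, p) for p in paths]).mean(dim=0)
        teacher = F.normalize(teacher.flatten(1), dim=-1)

    loss = 0.0
    for p in paths:
        student = F.normalize(online_block(x, p).flatten(1), dim=-1)
        # Each candidate regresses to the shared ensemble target with a
        # BYOL-style cosine loss (2 - 2 * cosine similarity).
        loss = loss + (2.0 - 2.0 * (student * teacher).sum(dim=-1)).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)


def ema_update(online_block, target_block, momentum=0.99):
    # Slowly move the teacher's weights towards the student's weights.
    with torch.no_grad():
        for p_o, p_t in zip(online_block.parameters(), target_block.parameters()):
            p_t.mul_(momentum).add_(p_o, alpha=1.0 - momentum)


def population_centric_score(block, paths, x):
    """Score candidates by cosine similarity to the population center of their features."""
    with torch.no_grad():
        feats = torch.stack(
            [F.normalize(block(x, p).flatten(1), dim=-1) for p in paths])
        center = F.normalize(feats.mean(dim=0), dim=-1)
        # Higher similarity to the center suggests a better-ranked candidate.
        return [(feats[i] * center).sum(dim=-1).mean().item()
                for i in range(len(paths))]

In a full block-wise pipeline, a step like this would be run separately for each block of the factorized search space, with per-block candidate scores aggregated to select the final architecture; that orchestration, the data pipeline, and the masked-image-modeling variants introduced in BossNAS++ are omitted here.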