All Publications
-
BossNAS Family: Block-Wisely Self-Supervised Neural Architecture Search
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
2025; 47 (5): 3500-3514
Abstract
Recent advances in hand-crafted neural architectures for visual recognition underscore the pressing need to explore architecture designs comprising diverse building blocks. Concurrently, neural architecture search (NAS) methods have gained traction as a means to reduce human effort. Nevertheless, whether NAS methods can efficiently and effectively handle diversified search spaces with disparate candidates, such as Convolutional Neural Networks (CNNs) and transformers, remains an open question. In this work, we introduce a novel unsupervised NAS approach called BossNAS (Block-wisely Self-supervised Neural Architecture Search), which addresses the problem of inaccurate predictive architecture ranking caused by a large weight-sharing space while mitigating potential ranking issues caused by biased supervision. To achieve this, we factorize the search space into blocks and introduce a novel self-supervised training scheme, Ensemble Bootstrapping, to train each block separately in an unsupervised manner. In the search phase, we propose an unsupervised Population-Centric Search, which optimizes candidate architectures toward the population center. Additionally, we enhance our NAS method by integrating masked image modeling and present BossNAS++ to overcome the lack of dense supervision in our block-wise self-supervised NAS. In BossNAS++, we introduce a training technique named Masked Ensemble Bootstrapping for the block-wise supernet, accompanied by a Masked Population-Centric Search scheme to promote fairer architecture selection. Our family of models, discovered through BossNAS and BossNAS++, delivers impressive results across various search spaces and datasets. Our transformer model discovered by BossNAS++ attains a remarkable accuracy of 83.2% on ImageNet with only 10.5B MAdds, surpassing DeiT-B by 1.4% while maintaining a lower computation cost.
Moreover, our approach excels in architecture rating accuracy, achieving Spearman correlations of 0.78 and 0.76 on the canonical MBConv search space with ImageNet and the NATS-Bench size search space with CIFAR-100, respectively, outperforming state-of-the-art NAS methods.
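The Spearman correlations reported above measure how well a NAS method's predicted architecture ranking agrees with the ranking by true accuracy. A minimal stdlib-only sketch of the metric, using hypothetical scores (not data from the paper) and assuming no tied values:

```python
def spearman(xs, ys):
    """Spearman rank correlation for lists with distinct values (no tie handling)."""
    def ranks(vs):
        # Rank of each element: position it would take in sorted order.
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # Classic formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical predicted scores vs. measured accuracies for 5 architectures.
pred = [0.61, 0.72, 0.55, 0.80, 0.66]
true = [70.1, 72.3, 69.5, 74.0, 71.8]
print(spearman(pred, true))  # 1.0: the two rankings are identical
```

A correlation of 1.0 means the predicted ordering matches the ground-truth ordering exactly; values such as 0.78 indicate a strong but imperfect agreement.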
DOI: 10.1109/TPAMI.2025.3529517
Web of Science ID: 001465416300004
PubMed ID: 40031006
-
No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling
edited by Liu, T., Yue, L., Webb, G., Wang, D.
SPRINGER-VERLAG SINGAPORE PTE LTD. 2024: 28-41
DOI: 10.1007/978-981-99-8388-9_3
Web of Science ID: 001148047100003
-
Automated Progressive Learning for Efficient Training of Vision Transformers
IEEE COMPUTER SOC. 2022: 12476-12486
DOI: 10.1109/CVPR52688.2022.01216
Web of Science ID: 000870759105055
-
Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift
IEEE. 2021: 12334-12344
DOI: 10.1109/ICCV48922.2021.01213
Web of Science ID: 000798743202051
-
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
IEEE. 2021: 12261-12271
DOI: 10.1109/ICCV48922.2021.01206
Web of Science ID: 000798743202044
-
Dynamic Slimmable Network
IEEE COMPUTER SOC. 2021: 8603-8613
DOI: 10.1109/CVPR46437.2021.00850
Web of Science ID: 000739917308082
-
Block-wisely Supervised Neural Architecture Search with Knowledge Distillation
IEEE COMPUTER SOC. 2020: 1986-1995
DOI: 10.1109/CVPR42600.2020.00206
Web of Science ID: 000620679502025
-
Knowledge-driven temporal activity localization
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION
2019; 64
DOI: 10.1016/j.jvcir.2019.102628
Web of Science ID: 000492798600035