Chenghan Xie
Ph.D. Student in Management Science and Engineering, admitted Autumn 2024
Bio
An idiot with hope
Current Research and Scholarly Interests
Optimization, theory & practice. Energy-aware AI, neural-network structure & data center management.
All Publications
-
PLMSearch: Protein language model powers accurate and fast sequence search for remote homology.
Nature Communications
2024; 15 (1): 2775
Abstract
Homologous protein search is one of the most commonly used methods for protein annotation and analysis. Compared to structure search, detecting distant evolutionary relationships from sequences alone remains challenging. Here we propose PLMSearch (Protein Language Model), a homologous protein search method with only sequences as input. PLMSearch uses deep representations from a pre-trained protein language model and trains the similarity prediction model with a large number of real structure similarities. This enables PLMSearch to capture the remote homology information concealed behind the sequences. Extensive experimental results show that PLMSearch can search millions of query-target protein pairs in seconds like MMseqs2 while increasing the sensitivity by more than threefold, and is comparable to state-of-the-art structure search methods. In particular, unlike traditional sequence search methods, PLMSearch can recall most remote homology pairs with dissimilar sequences but similar structures. PLMSearch is freely available at https://dmiip.sjtu.edu.cn/PLMSearch .
DOI: 10.1038/s41467-024-46808-5
PubMed ID: 38555371
PubMed Central ID: PMC10981738
-
Sketched Newton Value Iteration for Large-Scale Markov Decision Processes
Association for the Advancement of Artificial Intelligence. 2024: 13936-13944
Web of Science ID: 001241515300107
-
Trust Region Methods for Nonconvex Stochastic Optimization beyond Lipschitz Smoothness
Association for the Advancement of Artificial Intelligence. 2024: 16049-16057
Web of Science ID: 001239983500097