Honors & Awards


  • CEHG Fellow, CEHG, Stanford (October 2023)

All Publications


  • Scoring information integration with statistical quality control enhanced cross-run analysis of data-independent acquisition proteomics data. Communications chemistry Gao, M., Gupta, S., Yang, W., Yu, R., Röst, H. L. 2025; 8 (1): 364

    Abstract

    The peptide-centric strategy is widely applied in data-independent acquisition (DIA) proteomics to analyze multiplexed MS2 spectra. However, current software tools often rely on single-run data for peptide peak identification, leading to inconsistent quantification across heterogeneous datasets. Match-between-runs (MBR) algorithms address this by aligning peaks or elution profiles post-analysis, but they are often ad hoc and lack statistical frameworks for controlling peak quality, causing false positives and reduced quantitative reproducibility. Here we present DreamDIAlignR, a cross-run peptide-centric tool that integrates peptide elution behavior across runs with a deep learning peak identifier and alignment algorithm for consistent peak picking and FDR-controlled scoring. DreamDIAlignR outperformed state-of-the-art MBR methods, identifying up to 21.2% more quantitatively changing proteins in a benchmark dataset and 36.6% more in a cancer dataset. Additionally, DreamDIAlignR establishes an improved methodology for performing MBR compatible with existing DIA analysis tools, thereby enhancing the overall quality of DIA analysis.

    View details for DOI 10.1038/s42004-025-01734-5

    View details for PubMedID 41266840

    View details for PubMedCentralID 2818771

  • DIAlignR provides precise retention time alignment across distant runs in DIA and targeted proteomics. Molecular & cellular proteomics : MCP Gupta, S., Ahadi, S., Zhou, W., Rost, H. 2019; 18 (4): 806–17

    Abstract

    SWATH-MS has been widely used for proteomics analysis given its high throughput and reproducibility, but ensuring consistent quantification of analytes across large-scale studies of heterogeneous samples such as human plasma remains challenging. Heterogeneity in large-scale studies can be caused by large time intervals between data acquisitions, acquisition by different operators or instruments, and intermittent repair or replacement of parts, such as the liquid chromatography column, all of which affect retention time (RT) reproducibility and, in turn, the performance of SWATH-MS data analysis. Here, we present a novel algorithm for retention time alignment of SWATH-MS data based on direct alignment of raw MS2 chromatograms using a hybrid dynamic programming approach. The algorithm does not impose a chronological order of elution and allows for alignment of elution-order swapped peaks. Furthermore, allowing RT mapping in a certain window around a coarse global fit makes it robust against noise. On a manually validated dataset, this strategy outperforms the current state-of-the-art approaches. In addition, on real-world clinical data, our approach outperforms global alignment methods by mapping 98% of peaks compared to 67% cumulatively, and DIAlignR can reduce alignment error up to 30-fold for extremely distant runs. The robustness of technical parameters used in this pairwise alignment strategy has also been demonstrated. The source code is released under the BSD license at https://github.com/Roestlab/DIAlignR.

    View details for PubMedID 30705124
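
    The abstract above describes restricting a dynamic programming alignment to a window around a coarse global fit. The following is a minimal sketch of that general idea only, not the DIAlignR implementation: it swaps in plain dynamic time warping on a single intensity trace with an absolute-difference cost, which (unlike the published algorithm) forces a monotone alignment and so cannot handle elution-order swaps. The function name `constrained_align` and the parameters `global_fit` and `half_width` are hypothetical and chosen for illustration.

```python
import math

def constrained_align(x, y, global_fit, half_width):
    """Align trace x to trace y with DTW, using only cells (i, j) that lie
    within half_width of the coarse mapping j ~ global_fit(i)."""
    n, m = len(x), len(y)
    cost = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Skip cells outside the window around the global fit.
            if abs(j - global_fit(i)) > half_width:
                continue
            if i == 0 and j == 0:
                best = 0.0
            else:
                best = min(
                    cost[i - 1][j - 1] if i > 0 and j > 0 else math.inf,
                    cost[i - 1][j] if i > 0 else math.inf,
                    cost[i][j - 1] if j > 0 else math.inf,
                )
            cost[i][j] = abs(x[i] - y[j]) + best
    if math.isinf(cost[n - 1][m - 1]):
        raise ValueError("window too narrow: no path connects the end points")

    # Trace back from the last cell, always stepping to the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path

# Toy traces (made-up values): the same peak, shifted by two samples.
x = [0, 0, 1, 5, 1, 0, 0, 0]
y = [0, 0, 0, 0, 1, 5, 1, 0]
path = constrained_align(x, y, global_fit=lambda i: i + 2, half_width=2)
```

    In this toy case the coarse fit is a constant shift of two samples, and the window keeps the search from wandering far from it. A narrower window is cheaper but fails outright (here with a `ValueError`) if it excludes the end points.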