I develop computational methods to discover insights across the biological hierarchy, from molecules, cells, tissues, and organs to whole organisms and populations.
In my current role, I lead the bioinformatics efforts for two groups that are pioneers in epigenomics research.
We study tissue development, cancer evolution, and autoimmunity using multiomics, with a focus on the non-coding genome.
I have over four years of professional experience in industry and academia, and over eight years of academic research experience.
I received my Ph.D. in Data Science, specializing in the development of machine learning methods for diverse biomedical problems.
My research interests include artificial intelligence for identifying novel patterns in multiomics, psychometric, and neuroimaging data; structural bioinformatics; and computational epidemiology. I have also conducted research in anomaly detection and in community detection in biological networks.
I am the co-inventor of a patented anomaly detection method for real-time streaming data.
Honors & Awards
Academic Excellence Award, Data Science Program, Worcester Polytechnic Institute (2020)
2nd Place Award in CS, DS and Cybersecurity, Graduate Research Innovation Exchange, Worcester Polytechnic Institute (2019)
1st Place Award in CS, DS and Cybersecurity, Graduate Research Innovation Exchange, Worcester Polytechnic Institute (2018)
Conference Scholarship, Gordon Research Conferences (2018)
Graduate Student Travel Scholarship, Worcester Polytechnic Institute (2018)
Exceeded Expectations Commendation, Oracle (2013-15)
Graduate Merit Scholarship, Birla Institute of Technology and Science, Pilani (2011-13)
Education & Certifications
Ph.D., Data Science, Worcester Polytechnic Institute, MA, USA (2022)
M.E., Software Systems, Birla Institute of Technology and Science, Pilani, India (2013)
B.E., Computer Science, Visvesvaraya Technological University, Bangalore, India (2010)
Patents
Lou Zhang, Suhas Srinivasan. "System and method for monitoring machine anomalies via control data", United States Patent 11,181,899, MachineMetrics Inc., Nov 23, 2021
Work Experience
Research Assistant, Worcester Polytechnic Institute (2016 - 2022)
Worcester, MA, USA
Teaching Assistant, Worcester Polytechnic Institute (2015 - 2018)
Worcester, MA, USA
Data Scientist Intern, MachineMetrics (2018)
Northampton, MA, USA
Member Technical Staff, Oracle (2013 - 2015)
Bangalore, Karnataka, India
Software Engineer Intern, Dell EMC (2013)
Bangalore, Karnataka, India
Teaching Assistant, Birla Institute of Technology and Science (2011 - 2012)
Zuarinagar, Goa, India
Software Quality Engineer, SAP Labs (2011)
Bengaluru, Karnataka, India
Professional Affiliations and Activities
Member, International Society for Computational Biology (2022 - Present)
Publications
- Unravelling psychiatric heterogeneity and predicting suicide attempts in women with trauma-related dissociation using artificial intelligence. European Journal of Psychotraumatology 2022; 13 (2)
- Computational protein modeling and the next viral pandemic. Nature Methods 2021; 18 (5): 439-440
- Structural Genomics and Interactomics of SARS-COV2: Decoding Basic Building Blocks of the Coronavirus. Virus Bioinformatics. Chapman and Hall/CRC. 2021; 1: 121-139
A hybrid deep clustering approach for robust cell type profiling using single-cell RNA-seq data
RNA 2020; 26 (10): 1303-1319
Single-cell RNA sequencing (scRNA-seq) is a recent technology that enables fine-grained discovery of cellular subtypes and specific cell states. Analysis of scRNA-seq data routinely involves machine learning methods, such as feature learning, clustering, and classification, to help uncover novel information. However, current methods are not well suited to deal with the substantial noise created by the experiments or with the variation among cells of the same type. To address this, we developed a new hybrid approach, deep unsupervised single-cell clustering (DUSC), which integrates feature generation based on a deep learning architecture with a model-based clustering algorithm to find a compact and informative representation of the single-cell transcriptomic data, generating robust clusters. We also include a new technique to estimate an efficient number of latent features in the deep learning model. Our method outperforms both classical and state-of-the-art feature learning and clustering methods, approaching the accuracy of supervised learning. We applied DUSC to a single-cell transcriptomics data set obtained from a triple-negative breast cancer tumor to identify potential cancer subclones accentuated by copy-number variation and to investigate the role of clonal heterogeneity. Our method is freely available to the community and will hopefully facilitate our understanding of the cellular atlas of living organisms as well as provide the means to improve patient diagnostics and treatment.
DOI: 10.1261/rna.074427.119
Web of Science ID: 000570798100002
PubMed ID: 32532794
PubMed Central ID: PMC7491323
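The two-stage idea in the abstract above (learn a compact latent representation, then cluster cells in that space) can be illustrated with a minimal sketch. This is not the DUSC implementation: principal component analysis stands in for the deep learning feature generator and plain k-means stands in for the model-based clustering algorithm, purely so the example runs with NumPy alone. The function names, the synthetic expression matrix, and the choice of five latent features are all hypothetical.

```python
import numpy as np

def reduce_features(expression, n_latent):
    """Project cells onto the top principal components.

    Stand-in for a learned latent representation; NOT a deep autoencoder.
    """
    centered = expression - expression.mean(axis=0)
    _, _, components = np.linalg.svd(centered, full_matrices=False)
    return centered @ components[:n_latent].T

def cluster_cells(features, n_clusters, n_iter=50):
    """Plain k-means with farthest-point initialisation.

    Stand-in for a model-based clustering algorithm.
    """
    # Start from the first cell, then repeatedly add the cell farthest
    # from all centers chosen so far.
    centers = [features[0]]
    for _ in range(1, n_clusters):
        nearest = np.min(
            [np.linalg.norm(features - c, axis=1) for c in centers], axis=0
        )
        centers.append(features[np.argmax(nearest)])
    centers = np.array(centers)

    for _ in range(n_iter):
        distances = np.linalg.norm(
            features[:, None, :] - centers[None, :, :], axis=2
        )
        labels = distances.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        updated = np.array([
            features[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels

# Hypothetical data: 80 cells x 100 genes, two cell types that differ
# in the expression of the first 20 genes.
rng = np.random.default_rng(0)
n_per_type, n_genes = 40, 100
expression = rng.normal(0.0, 1.0, size=(2 * n_per_type, n_genes))
expression[n_per_type:, :20] += 4.0

latent = reduce_features(expression, n_latent=5)
labels = cluster_cells(latent, n_clusters=2)
print(labels)
```

In a real analysis the number of latent features and the number of clusters would have to be estimated from the data, which is one of the problems the paper addresses; here both are simply fixed by hand.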
Structural Genomics of SARS-CoV-2 Indicates Evolutionary Conserved Functional Regions of Viral Proteins
Viruses 2020; 12 (4)
During its first two and a half months, the recently emerged 2019 novel coronavirus, SARS-CoV-2, has already infected over one hundred thousand people worldwide and has taken more than four thousand lives. However, the swiftly spreading virus has also prompted an unprecedentedly rapid response from the research community, which faces an unknown health challenge of potentially enormous proportions. Unfortunately, the experimental research needed to understand the molecular mechanisms behind the viral infection and to design a vaccine or antivirals is costly and takes months. To expedite the advancement of our knowledge, we leveraged data on related coronaviruses that are readily available in public databases and integrated them into a single computational pipeline. As a result, we provide comprehensive structural genomics and interactomics roadmaps of SARS-CoV-2 and use this information to infer possible functional differences from, and similarities to, the related SARS coronavirus. All data are made publicly available to the research community.
DOI: 10.3390/v12040360
Web of Science ID: 000539525300002
PubMed ID: 32218151
PubMed Central ID: PMC7232164
Enriching Human Interactome with Functional Mutations to Detect High-Impact Network Modules Underlying Complex Diseases
Genes 2019; 10 (11)
Rapid progress in high-throughput -omics technologies moves us one step closer to the datacalypse in life sciences. In spite of the volumes of data already generated, our knowledge of the molecular mechanisms underlying complex genetic diseases remains limited. Increasing evidence shows that biological networks are essential, albeit not sufficient, for a better understanding of these mechanisms. The identification of disease-specific functional modules in the human interactome can provide a more focused insight into the mechanistic nature of the disease. However, carving a disease network module from the whole interactome is a difficult task. In this paper, we propose a computational framework, Discovering most IMpacted SUbnetworks in interactoMe (DIMSUM), which enables the integration of genome-wide association studies (GWAS) and functional effects of mutations into the protein-protein interaction (PPI) network to improve disease module detection. Specifically, our approach incorporates and propagates the functional impact of non-synonymous single nucleotide polymorphisms (nsSNPs) on PPIs to implicate the genes that are most likely influenced by the disruptive mutations, and to identify the module with the greatest functional impact. Comparison against state-of-the-art seed-based module detection methods shows that our approach could yield modules that are biologically more relevant and have a stronger association with the studied disease. We expect our method to become part of the common toolbox for disease module analysis, facilitating the discovery of new disease markers.
DOI: 10.3390/genes10110933
Web of Science ID: 000502296000099
PubMed ID: 31731769
PubMed Central ID: PMC6895925
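The general idea of propagating a functional impact score from seed genes through a PPI network, as described in the abstract above, can be sketched with a generic random walk with restart. This is only an illustration of network propagation and is not the DIMSUM algorithm, whose scoring and module extraction steps are not specified here. The function name, the four-gene toy network, the seed scores, and the restart probability are all hypothetical.

```python
import numpy as np

def propagate_impact(adjacency, seed_scores, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart over an undirected network.

    adjacency   : symmetric 0/1 matrix of protein-protein interactions
    seed_scores : non-negative starting impact per gene (e.g. from mutations)
    restart     : probability of jumping back to the seed distribution
    Returns a score per gene; the scores sum to 1.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    seed = np.asarray(seed_scores, dtype=float)
    seed = seed / seed.sum()

    # Column-normalise so each gene spreads its score evenly to its neighbours.
    degree = adjacency.sum(axis=0)
    degree[degree == 0] = 1.0  # avoid division by zero for isolated genes
    transition = adjacency / degree

    scores = seed.copy()
    for _ in range(max_iter):
        updated = (1.0 - restart) * (transition @ scores) + restart * seed
        delta = np.abs(updated - scores).sum()
        scores = updated
        if delta < tol:
            break
    return scores

# Hypothetical toy network: a chain G1 - G2 - G3 - G4, with all of the
# initial impact placed on G1.
genes = ["G1", "G2", "G3", "G4"]
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
seed_scores = [1.0, 0.0, 0.0, 0.0]

scores = propagate_impact(adjacency, seed_scores)
for gene, score in sorted(zip(genes, scores), key=lambda pair: -pair[1]):
    print(f"{gene}: {score:.3f}")
```

In this toy chain the propagated score falls off with distance from the seed gene, which is the behaviour a seed-based module detection method relies on when ranking candidate genes around known disease associations.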
Assessment of network module identification across complex diseases
Nature Methods 2019; 16 (9): 843-+
Many bioinformatics methods have been proposed for reducing the complexity of large gene or protein networks into relevant subnetworks or modules. Yet, how such methods compare to each other in terms of their ability to identify disease-relevant modules in different types of network remains poorly understood. We launched the 'Disease Module Identification DREAM Challenge', an open competition to comprehensively assess module identification methods across diverse protein-protein interaction, signaling, gene co-expression, homology and cancer-gene networks. Predicted network modules were tested for association with complex traits and diseases using a unique collection of 180 genome-wide association studies. Our robust assessment of 75 module identification methods reveals top-performing algorithms, which recover complementary trait-associated modules. We find that most of these modules correspond to core disease-relevant pathways, which often comprise therapeutic targets. This community challenge establishes biologically interpretable benchmarks, tools and guidelines for molecular network analysis to study human disease biology.
DOI: 10.1038/s41592-019-0509-5
Web of Science ID: 000484044700022
PubMed ID: 31471613
PubMed Central ID: PMC6719725