Bio
I am a Stanford Data Science Postdoctoral Scholar in the Department of Biology at Stanford University, supervised by Prof. Hunter Fraser. My research focuses on evolutionary dynamics and the development of high-performance computational tools to analyze complex biological systems. I earned my Ph.D. in Bioinformatics from the University of Calgary, Canada, where I investigated within-host evolution in pathogen genomics and cancer. Originally from Sri Lanka, I hold a First Class B.Sc. (Hons) in Biology from the University of Sri Jayewardenepura. I am passionate about advancing computational biology through the design and implementation of scalable software solutions that leverage GPU, CPU, and SSD architectures for large-scale genomic and evolutionary analysis.
Honors & Awards
-
Stanford Data Science Postdoctoral Fellowship, Stanford Data Science (2025-09-01)
-
Stanford Center for Computational, Evolutionary, and Human Genomics Postdoctoral Fellowship, Center for Computational, Evolutionary, and Human Genomics, Stanford University (2025-08-01)
Professional Education
-
Bachelor of Science, University of Sri Jayewardenepura (2017)
-
Doctor of Philosophy, University of Calgary (2025)
-
Ph.D., University of Calgary, Canada, Bioinformatics (2025)
-
B.Sc., University of Sri Jayewardenepura, Sri Lanka, Biology/Biological Sciences, Honors (2018)
Research Interests
-
Data Sciences
All Publications
-
Apollo: a comprehensive GPU-powered within-host simulator for viral evolution and infection dynamics across population, tissue, and cell
NATURE COMMUNICATIONS
2025; 16 (1): 5783
Abstract
Modern sequencing instruments bring unprecedented opportunity to study within-host viral evolution in conjunction with viral transmissions between hosts. However, no computational simulators are available to assist the characterization of within-host dynamics. This limits our ability to interpret epidemiological predictions incorporating within-host evolution and to validate computational inference tools. To fill this need we developed Apollo, a GPU-accelerated, out-of-core tool for within-host simulation of viral evolution and infection dynamics across population, tissue, and cellular levels. Apollo is scalable to hundreds of millions of viral genomes and can handle complex demographic and population genetic models. Apollo can replicate real within-host viral evolution; accurately recapturing observed viral sequences from HIV and SARS-CoV-2 cohorts derived from initial population-genetic configurations. For practical applications, using Apollo-simulated viral genomes and transmission networks, we validated and uncovered the limitations of a widely used viral transmission inference tool.
View details for DOI 10.1038/s41467-025-60988-8
View details for Web of Science ID 001523451900035
View details for PubMedID 40593638
View details for PubMedCentralID PMC12219717
-
cLD: Rare-variant linkage disequilibrium between genomic regions identifies novel genomic interactions
PLOS GENETICS
2023; 19 (12): e1011074
Abstract
Linkage disequilibrium (LD) is a fundamental concept in genetics; critical for studying genetic associations and molecular evolution. However, LD measurements are only reliable for common genetic variants, leaving low-frequency variants unanalyzed. In this work, we introduce cumulative LD (cLD), a stable statistic that captures the rare-variant LD between genetic regions, which reflects more biological interactions between variants, in addition to lack of recombination. We derived the theoretical variance of cLD using delta methods to demonstrate its higher stability than LD for rare variants. This property is also verified by bootstrapped simulations using real data. In application, we find cLD reveals an increased genetic association between genes in 3D chromatin interactions, a phenomenon recently reported negatively by calculating standard LD between common variants. Additionally, we show that cLD is higher between gene pairs reported in interaction databases, identifies unreported protein-protein interactions, and reveals interacting genes distinguishing case/control samples in association studies.
View details for DOI 10.1371/journal.pgen.1011074
View details for Web of Science ID 001153532200002
View details for PubMedID 38109434
View details for PubMedCentralID PMC10758262
-
CATE: A fast and scalable CUDA implementation to conduct highly parallelized evolutionary tests on large scale genomic data
METHODS IN ECOLOGY AND EVOLUTION
2023; 14 (8): 2095-2109
View details for DOI 10.1111/2041-210X.14168
View details for Web of Science ID 001020064600001
-
Interaction-integrated linear mixed model reveals 3D-genetic basis underlying Autism
GENOMICS
2023; 115 (2): 110575
Abstract
Genetic interactions play critical roles in genotype-phenotype associations. We developed a novel interaction-integrated linear mixed model (ILMM) that integrates a priori knowledge into linear mixed models. ILMM enables statistical integration of genetic interactions upfront and overcomes the problems of searching for combinations. To demonstrate its utility, with 3D genomic interactions (assessed by Hi-C experiments) as a priori, we applied ILMM to whole-genome sequencing data for Autism Spectrum Disorders (ASD) and brain transcriptome data, revealing the 3D-genetic basis of ASD and 3D-expression quantitative loci (3D-eQTLs) for brain tissues. Notably, we reported a potential mechanism involving distal regulation between FOXP2 and DNMT3A, conferring the risk of ASD.
View details for DOI 10.1016/j.ygeno.2023.110575
View details for Web of Science ID 000943685900001
View details for PubMedID 36758877
-
In silico study of SARS-CoV-2 spike protein RBD and human ACE-2 affinity dynamics across variants and Omicron subvariants
JOURNAL OF MEDICAL VIROLOGY
2023; 95 (1): e28406
Abstract
The coronavirus disease 2019 virus outbreak continues worldwide, with many variants emerging, some of which are considered variants of concern (VOCs). The WHO designated Omicron as a VOC and assigned it under variant B.1.1.529. Here, we used computational studies to examine the VOCs, including Omicron subvariants, and one variant of interest. Here we found that the binding affinity of human receptor angiotensin-converting enzyme 2 (hACE2) and receptor-binding domain (RBDs) increased in the order of wild type (Wuhan-strain) < Beta < Alpha < Omicron BA.5 < Gamma < Delta < Omicron BA.2.75 < BA.1 < BA.3 < BA.2. Interactions between docked complexes revealed that the RBD residue positions like 452, 478, 493, 498, 501, and 505 are crucial in creating strong interactions with hACE2. Omicron BA.2 shows the highest binding capacity to the hACE2 receptor among all the mutant complexes. The BA.5's L452R, F486V, and T478K mutations significantly impact the interaction network in the BA.5 RBD-hACE2 interface. Here for the first time, we report the His505, an active residue on the RBD forming a salt bridge in the BA.2, leading to increased mutation stability. When the active RBD residues are mutated, binding affinity and intermolecular interactions increase across all mutant complexes. By examining the differences in different variants, this study may provide a solid foundation for structure-based drug design for newly emerging variants.
View details for DOI 10.1002/jmv.28406
View details for Web of Science ID 000911465200324
View details for PubMedID 36519577
View details for PubMedCentralID PMC9877981
-
Reconstructing SARS-CoV-2 infection dynamics through the phylogenetic inference of unsampled sources of infection
PLOS ONE
2021; 16 (12): e0261422
Abstract
The COVID-19 pandemic has illustrated the importance of infection tracking. The role of asymptomatic, undiagnosed individuals in driving infections within this pandemic has become increasingly evident. Modern phylogenetic tools that take into account asymptomatic or undiagnosed individuals can help guide public health responses. We fine-tuned established phylogenetic pipelines using published SARS-CoV-2 genomic data to examine reasonable estimate transmission networks with the inference of unsampled infection sources. The system utilised Bayesian phylogenetics and TransPhylo to capture the evolutionary and infection dynamics of SARS-CoV-2. Our analyses gave insight into the transmissions within a population including unsampled sources of infection and the results aligned with epidemiological observations. We were able to observe the effects of preventive measures in Canada's "Atlantic bubble" and in populations such as New York State. The tools also inferred the cross-species transmission of SARS-CoV-2 from humans to lions and tigers in New York City's Bronx Zoo. These phylogenetic tools offer a powerful approach in response to both the COVID-19 and other emerging infectious disease outbreaks.
View details for DOI 10.1371/journal.pone.0261422
View details for Web of Science ID 000754828900059
View details for PubMedID 34910769
View details for PubMedCentralID PMC8673622
-
A Novel In Silico Benchmarked Pipeline Capable of Complete Protein Analysis: A Possible Tool for Potential Drug Discovery
BIOLOGY-BASEL
2021; 10 (11)
Abstract
Current in silico proteomics require the trifecta analysis, namely, prediction, validation, and functional assessment of a modeled protein. The main drawback of this endeavor is the lack of a single protocol that utilizes a proper set of benchmarked open-source tools to predict a protein's structure and function accurately. The present study rectifies this drawback through the design and development of such a protocol. The protocol begins with the characterization of a novel coding sequence to identify the expressed protein. It then recognizes and isolates evolutionarily conserved sequence motifs through phylogenetics. The next step is to predict the protein's secondary structure, followed by the prediction, refinement, and validation of its three-dimensional tertiary structure. These steps enable the functional analysis of the macromolecule through protein docking, which facilitates the identification of the protein's active site. Each of these steps is crucial for the complete characterization of the protein under study. We have dubbed this process the trifecta analysis. In this study, we have proven the effectiveness of our protocol using the cystatin C and AChE proteins. Beginning with just their sequences, we have characterized both proteins' structures and functions, including identifying the cystatin C protein's seven-residue active site and the AChE protein's active-site gorge via protein-protein and protein-ligand docking, respectively. This process will greatly benefit new and experienced scientists alike in obtaining a strong understanding of the trifecta analysis, resulting in a domino effect that could expand drug development.
View details for DOI 10.3390/biology10111113
View details for Web of Science ID 000726644200001
View details for PubMedID 34827106
View details for PubMedCentralID PMC8615085
-
Reconstruction of Microbial Haplotypes by Integration of Statistical and Physical Linkage in Scaffolding
MOLECULAR BIOLOGY AND EVOLUTION
2021; 38 (6): 2660-2672
Abstract
DNA sequencing technologies provide unprecedented opportunities to analyze within-host evolution of microorganism populations. Often, within-host populations are analyzed via pooled sequencing of the population, which contains multiple individuals or "haplotypes." However, current next-generation sequencing instruments, in conjunction with single-molecule barcoded linked-reads, cannot distinguish long haplotypes directly. Computational reconstruction of haplotypes from pooled sequencing has been attempted in virology, bacterial genomics, metagenomics, and human genetics, using algorithms based on either cross-host genetic sharing or within-host genomic reads. Here, we describe PoolHapX, a flexible computational approach that integrates information from both genetic sharing and genomic sequencing. We demonstrated that PoolHapX outperforms state-of-the-art tools tailored to specific organismal systems, and is robust to within-host evolution. Importantly, together with barcoded linked-reads, PoolHapX can infer whole-chromosome-scale haplotypes from 50 pools each containing 12 different haplotypes. By analyzing real data, we uncovered dynamic variations in the evolutionary processes of within-patient HIV populations previously unobserved in single position-based analysis.
View details for DOI 10.1093/molbev/msab037
View details for Web of Science ID 000664191500033
View details for PubMedID 33547786
View details for PubMedCentralID PMC8136496
-
Antimicrobial activity of Plumbago indica and ligand screening of plumbagin against methicillin-resistant Staphylococcus aureus
JOURNAL OF BIOMOLECULAR STRUCTURE & DYNAMICS
2022; 40 (7): 3273-3284
Abstract
In this study, the antimicrobial properties of Plumbago indica root bark against bacterial strains and a fungal strain were investigated using the disc diffusion and minimum inhibitory concentration assays. Gas chromatography/mass spectrometry, nuclear magnetic resonance spectrometry, and column chromatography analyses were conducted to identify and isolate the active compounds. A docking study was performed to identify possible interactions between the active compound and DNA gyrase using the Schrödinger Glide docking program. Both methanol extract and the ethyl acetate fraction of the root bark showed significant antimicrobial activity against the gram-positive bacteria than against the gram-negative bacteria and the fungal strain. The active compound was identified as plumbagin. A disc diffusion assay of plumbagin revealed potent antimicrobial activity against methicillin-resistant Staphylococcus aureus. Molecular docking of plumbagin revealed high specificity towards the DNA gyrase binding site with a high fitness score and a minimum energy barrier of -7.651 kcal/mol. These findings indicate that P. indica exhibits significant antimicrobial activity, primarily due to the presence of plumbagin. The specificity of plumbagin toward DNA gyrase in S. aureus indicates the feasibility of utilizing P. indica for developing new drug leads against drug resistant microbial strain. Communicated by Ramaswamy H. Sarma.
View details for DOI 10.1080/07391102.2020.1846622
View details for Web of Science ID 000590633300001
View details for PubMedID 33213303
-
Evaluation of A Phylogenetic Pipeline to Examine Transmission Networks in A Canadian HIV Cohort
MICROORGANISMS
2020; 8 (2)
Abstract
Keywords: HIV; Canada; molecular phylogenetics; viral evolution; person-to-person transmission inference; transmission network; summary statistics.
View details for DOI 10.3390/microorganisms8020196
View details for Web of Science ID 000519618200052
View details for PubMedID 32023939
View details for PubMedCentralID PMC7074708
-
1,3-Dinitrobenze-Induced Genotoxicity Through Altering Nuclear Integrity of Diploid and Polyploidy Germ Cells
DOSE-RESPONSE
2019; 17 (3): 1559325819876760
Abstract
1,3-Dinitrobenzene (mDNB) is a widely used intermediate in commercial products and causes testicular injury. However, genotoxic effects upon low-level exposure are poorly understood. The present study evaluated the effects of very low-chronic doses of mDNB on sperm nuclear integrity. Male hamsters were treated with 1.5 mg/kg/d/4 wks (group A), 1.5 mg/kg/mDNB/d/week/4 weeks (group B), 1.0 mg/kg/mDNB/3 d/wk/4 wks (group C), or polyethylene glycol 600 (control). Nuclear integrity of distal cauda epididymal sperm was determined using the sperm chromatin structure assay and acridine orange staining (AOS). The germ cell nuclear integrity was assessed by the comet assay. Testicular histopathology was conducted to evaluate the sensitive stages. The comet assay revealed denatured nuclear DNA in group A (in diploid and polyploid cells from weeks 2-5); respectively at week 4 and weeks 3 to 4 in groups B and C. According to AOS, only group A animals exhibited denatured sperm DNA (weeks 1 and 3). The effective sperm count declined from weeks 1 to 6. Mean sperm DNA denaturation extent, percentage cells outside the main population, and standard deviation indicated altered sperm nuclear integrity in group A. Same animals exhibited progressive disruption of the Sertoli cells, while groups B and C exhibited damages on germ cells. The results suggest that mDNB affects sperm nuclear integrity at very low chronic doses targeting cell-specific testicular damage.
View details for DOI 10.1177/1559325819876760
View details for Web of Science ID 000487502900001
View details for PubMedID 31579111
View details for PubMedCentralID PMC6757507
https://orcid.org/0000-0003-4654-0734