Bio


I am an Internal Medicine resident at Stanford in the Translational Investigator Program (TIP), with a planned fellowship in Rheumatology.

I worked as a software engineer at Uber before completing my PhD at the University of Cambridge as a Gates Scholar and my MD at the University of Colorado. My research sits at the intersection of computational biology and B cell immunology, using protein language models and structural AI to identify novel therapeutic targets in autoantibody-mediated diseases.

Outside the clinic and lab, I enjoy skiing, hiking, biking, and reading science fiction.

Professional Education


  • MD, University of Colorado-Anschutz School of Medicine, Medicine
  • PhD, University of Cambridge, Biotechnology
  • MPhil, University of Cambridge, Computational Biology
  • BS, University of Colorado-Boulder, Applied Mathematics

All Publications


  • Identifying microbial protease allergens through protein language model-guided homology. Cell systems Thurimella, K., Wu, E., Li, C., Graham, D. B., Owens, R. M., Plichta, D. R., Sokol, C. L., Xavier, R. J., Bacallado, S. 2026; 17 (3): 101510

    Abstract

    Emerging research links the gut, skin, and oral microbiomes to allergies, with serine proteases (SPs) identified as potential allergens. This study leverages deep learning and pre-trained protein language models (pLMs) to uncover allergenic SPs in metagenomic data. First, we develop a model to identify the catalytic serine residue in serine hydrolases, demonstrating how pLMs capture structural information. Next, we create a deep learning framework to detect candidate SP allergens across gene catalogs, using the conserved catalytic triad to identify homologs in gut and oral sites despite low sequence identity. Our model predicts a putative SP allergen resembling V8 protease, a known trigger for protease-activated receptor 1. It also identifies a cysteine protease similar to Der f 1 from dust mites. Immunization with these proteases induced allergic responses, validating their allergenic potential experimentally. This approach uncovers candidate allergens beyond traditional methods, offering new targets for allergy research. A record of this paper's transparent peer review process is included in the supplemental information.

    View details for DOI 10.1016/j.cels.2025.101510

    View details for PubMedID 41722567

    View details for PubMedCentralID PMC13015258

  • Protein language models uncover carbohydrate-active enzyme function in metagenomics. BMC bioinformatics Thurimella, K., Mohamed, A. M., Li, C., Vatanen, T., Graham, D. B., Owens, R. M., La Rosa, S. L., Plichta, D. R., Bacallado, S., Xavier, R. J. 2025; 26 (1): 285

    Abstract

    The functional annotation of uncharacterized microbial enzymes from metagenomic data remains a significant challenge, limiting our understanding of microbial metabolic dynamics. Traditional annotation methods often rely on sequence homology, which can fail to identify remote homologs or enzymes with structural rather than sequence conservation. To address this gap, we developed CAZyLingua, the first annotation tool to use protein language models (pLMs) for the accurate classification of carbohydrate-active enzyme (CAZyme) families and subfamilies. CAZyLingua demonstrated high performance, maintaining precision and recall comparable to state-of-the-art hidden Markov model-based methods while outperforming purely sequence-based approaches. When applied to a metagenomic gene catalog from mother/infant pairs, CAZyLingua identified over 27,000 putative CAZymes missed by other tools, including horizontally-transferred enzymes implicated in infant microbiome development. In datasets from patients with Crohn's disease and IgG4-related disease, CAZyLingua uncovered disease-associated CAZymes, highlighting an expansion of carbohydrate esterases (CEs) in IgG4-related disease. A CE17 enzyme predicted to be overabundant in Crohn's disease was functionally validated, confirming its catalytic activity on acetylated manno-oligosaccharides. CAZyLingua is a powerful tool that effectively augments existing functional annotation pipelines for CAZymes. By leveraging the deep contextual information captured by pLMs, our method can uncover novel CAZyme diversity and reveal enzymatic functions relevant to health and disease, contributing to a further understanding of biological processes related to host health and nutrition.

    View details for DOI 10.1186/s12859-025-06286-y

    View details for PubMedID 41299229

    View details for PubMedCentralID PMC12659350

  • Bioelectronic Technology for Nutritional Research-a Novel In Vitro Platform for a Better Understanding of Human Gut Barrier Absorption. Advanced biology Stoeger, V., Strauss, M., Thurimella, K., Elias-Kirma, S., Niewczas, I., Parlar, E., Schaudy, E., Moysidou, C. M., Voong, S., Lietard, J., Clark, J., Gerner, C., Owens, R. M. 2026; 10 (2): e00409

    Abstract

    The epithelial gut barrier and gut microbiota significantly contribute to human health by controlling molecule absorption, a regulated transport that dictates bioavailability. Effective public health strategies, like dietary reference values, require a complete understanding of nutrient absorption. However, the lack of internationally harmonized nutritional recommendations indicates that gut barrier mechanisms are not fully unraveled. The conventional in vitro model Caco-2/HT29-MTX cultured on cell culture inserts, established for drug development, is limited in representing complex human gut physiology. The new bioelectronic e-transmembrane platform leverages technological and biological advances to generate more meaningful in vitro predictions. The soft electroactive Poly (3,4-ethylenedioxythiophene) polystyrene sulfonate (PEDOT:PSS) scaffold enables direct cell-electrode coupling for more sensitive barrier impedance measurements, especially required for testing commonly low physiological nutrient concentrations. Promoted epithelial-fibroblast interactions result in modulated protein signal transduction and expression of genes regulating gut barrier integrity. Overall, the e-transmembrane gut barrier more closely mimicked physiological effects for humans as demonstrated using the dietary compound butyrate.

    View details for DOI 10.1002/adbi.202500409

    View details for PubMedID 41705600

  • Human immunodeficiency virus and antiretroviral therapies exert distinct influences across diverse gut microbiomes. Nature microbiology Jabbar, K. S., Priya, S., Xu, J., Das Adhikari, U., Pishchany, G., Mohamed, A. T., Johansen, J., Thurimella, K., McCabe, C., Vlamakis, H., Okello, S., Delorey, T. M., Lankowski, A., Mosepele, M., Siedner, M. J., Plichta, D. R., Kwon, D. S., Xavier, R. J. 2025; 10 (11): 2720-2735

    Abstract

    Human immunodeficiency virus (HIV) infection alters gut microbiota composition and function, but the impact of geography and antiretroviral therapy remains unclear. Here we determined gut microbiome alterations linked to HIV infection and antiretroviral treatment in 327 individuals with HIV and 260 control participants in cohorts from Uganda, Botswana and the USA via faecal metagenomics. We found that while HIV-associated taxonomic differences were mostly site specific, changes in microbial functional pathways were broadly consistent across the cohorts and exacerbated in individuals with acquired immunodeficiency syndrome. Microbiome perturbations associated with antiretroviral medications were also geography dependent. In Botswana and Uganda, use of the non-nucleoside reverse transcriptase inhibitor efavirenz was linked to depletion of Prevotella, disruption of interspecies metabolic networks, exacerbation of systemic inflammation and atherosclerosis. Efavirenz-associated Prevotella depletion may occur through cross-inhibition of prokaryotic reverse transcriptases involved in antiphage defences, as shown by computational and in vitro experiments. These observations could inform future geography-specific and microbiome-guided therapy.

    View details for DOI 10.1038/s41564-025-02157-7

    View details for PubMedID 41168431

    View details for PubMedCentralID 7265289

  • Gut microbiome and metabolome profiling in Framingham heart study reveals cholesterol-metabolizing bacteria. Cell Li, C., Stražar, M., Mohamed, A. M., Pacheco, J. A., Walker, R. L., Lebar, T., Zhao, S., Lockart, J., Dame, A., Thurimella, K., Jeanfavre, S., Brown, E. M., Ang, Q. Y., Berdy, B., Sergio, D., Invernizzi, R., Tinoco, A., Pishchany, G., Vasan, R. S., Balskus, E., Huttenhower, C., Vlamakis, H., Clish, C., Shaw, S. Y., Plichta, D. R., Xavier, R. J. 2024; 187 (8): 1834-1852.e19

    Abstract

    Accumulating evidence suggests that cardiovascular disease (CVD) is associated with an altered gut microbiome. Our understanding of the underlying mechanisms has been hindered by lack of matched multi-omic data with diagnostic biomarkers. To comprehensively profile gut microbiome contributions to CVD, we generated stool metagenomics and metabolomics from 1,429 Framingham Heart Study participants. We identified blood lipids and cardiovascular health measurements associated with microbiome and metabolome composition. Integrated analysis revealed microbial pathways implicated in CVD, including flavonoid, γ-butyrobetaine, and cholesterol metabolism. Species from the Oscillibacter genus were associated with decreased fecal and plasma cholesterol levels. Using functional prediction and in vitro characterization of multiple representative human gut Oscillibacter isolates, we uncovered conserved cholesterol-metabolizing capabilities, including glycosylation and dehydrogenation. These findings suggest that cholesterol metabolism is a broad property of phylogenetically diverse Oscillibacter spp., with potential benefits for lipid homeostasis and cardiovascular health.

    View details for DOI 10.1016/j.cell.2024.03.014

    View details for PubMedID 38569543

    View details for PubMedCentralID PMC11071153

  • SCNIC: Sparse correlation network investigation for compositional data. Molecular ecology resources Shaffer, M., Thurimella, K., Sterrett, J. D., Lozupone, C. A. 2023; 23 (1): 312-325

    Abstract

    Microbiome studies are often limited by a lack of statistical power due to small sample sizes and a large number of features. This problem is exacerbated in correlative studies of multi-omic datasets. Statistical power can be increased by finding and summarizing modules of correlated observations, which is one dimensionality reduction method. Additionally, modules provide biological insight as correlated groups of microbes can have relationships among themselves. To address these challenges, we developed SCNIC: Sparse Cooccurrence Network Investigation for compositional data. SCNIC is open-source software that can generate correlation networks and detect and summarize modules of highly correlated features. Modules can be formed using either the Louvain Modularity Maximization (LMM) algorithm or a Shared Minimum Distance algorithm (SMD) that we newly describe here and relate to LMM using simulated data. We applied SCNIC to two published datasets and we achieved increased statistical power and identified microbes that not only differed across groups, but also correlated strongly with each other, suggesting shared environmental drivers or cooperative relationships among them. SCNIC provides an easy way to generate correlation networks, identify modules of correlated features and summarize them for downstream statistical analysis. Although SCNIC was designed considering properties of microbiome data, such as compositionality and sparsity, it can be applied to a variety of data types including metabolomics data and used to integrate multiple data types. SCNIC allows for the identification of functional microbial relationships at scale while increasing statistical power through feature reduction.

    View details for DOI 10.1111/1755-0998.13704

    View details for PubMedID 36001047

    View details for PubMedCentralID PMC9744196

  • AMON: annotation of metabolite origins via networks to integrate microbiome and metabolome data. BMC bioinformatics Shaffer, M., Thurimella, K., Quinn, K., Doenges, K., Zhang, X., Bokatzian, S., Reisdorph, N., Lozupone, C. A. 2019; 20 (1): 614

    Abstract

    Untargeted metabolomics of host-associated samples has yielded insights into mechanisms by which microbes modulate health. However, data interpretation is challenged by the complexity of origins of the small molecules measured, which can come from the host, microbes that live within the host, or from other exposures such as diet or the environment. We address this challenge through development of AMON: Annotation of Metabolite Origins via Networks. AMON is an open-source bioinformatics application that can be used to annotate which compounds in the metabolome could have been produced by bacteria present or the host, to evaluate pathway enrichment of host versus microbial metabolites, and to visualize which compounds may have been produced by host versus microbial enzymes in KEGG pathway maps. AMON empowers researchers to predict origins of metabolites via genomic information and to visualize potential host:microbe interplay. Additionally, the evaluation of enrichment of pathway metabolites of host versus microbial origin gives insight into the metabolic functionality that a microbial community adds to a host:microbe system. Through integrated analysis of microbiome and metabolome data, mechanistic relationships between microbial communities and host phenotypes can be better understood.

    View details for DOI 10.1186/s12859-019-3176-8

    View details for PubMedID 31779604

    View details for PubMedCentralID PMC6883642