All Publications


  • The IGVF catalog: from genetic variation to function. Nucleic Acids Research Li, D., Liu, S., Assis, P. R., Li, M., Dong, S., Whaling, I., Jolanki, O., Kagda, M., Zhang, W., Macias-Velasco, J. F., Liu, T., Cody, S., Antonacci-Fulton, L., Huang, Y., Liu, J., Montgomery, M. T., Zeiberg, D., Jain, S., Pejaver, V., Bergquist, T., Chen, Y., Radivojac, P., Gersbach, C. A., Sherpa, R. N., Castro, C. P., Boyle, A. P., Starita, L. M., Fowler, D. M., Ahituv, N., Dey, K. K., Majoros, W. H., Reddy, T. E., Craven, M., Sinha, R., Sverchkov, Y., Cai, X., Nzima, M. Z., Calderwood, M. A., Rozowsky, J., Gerstein, M., Ma, J., Yue, F., Cherry, J. M., Love, M. I., Engreitz, J. M., Hitz, B. C., Wang, T. 2025

    Abstract

    Genomic variation between individuals is essential for understanding how differences in the genome sequence affect molecular and cellular processes. The Impact of Genomic Variation on Function (IGVF) Consortium aims to uncover the relationships among genomic variation, genome function, and phenotypes by combining experimental techniques, such as single-cell mapping and genomic perturbation assays, with computational approaches such as machine learning-based predictive modeling. The IGVF Data and Administrative Coordinating Centers collect, analyze, and disseminate data and results from across the consortium through an open-source platform called the IGVF Catalog. This resource includes, but is not limited to, data on the effects of coding variants on protein abundance and function, noncoding variants on enhancer activity (measured by MPRA or predicted computationally), and associations between variants and quantitative traits. All data are organized within a graph database comprising over 50 types of data collections with nearly 3 billion nodes and over 7.5 billion edges. The Catalog offers public API endpoints (https://api.catalogkg.igvf.org/) and a user-friendly interface for exploring, querying, and visualizing the data at https://catalog.igvf.org. We expect that this open-access platform will support the broader scientific community to advance our understanding of how genomic variation influences biology and disease.

    View details for DOI 10.1093/nar/gkaf1341

    View details for PubMedID 41359121

  • GREGoR: accelerating genomics for rare diseases. Nature Dawood, M., Heavner, B., Wheeler, M. M., Ungar, R. A., LoTempio, J., Wiel, L., Berger, S., Bernstein, J. A., Chong, J. X., Délot, E. C., Eichler, E. E., Lupski, J. R., Shojaie, A., Talkowski, M. E., Wagner, A. H., Wei, C. L., Wellington, C., Wheeler, M. T., Carvalho, C. M., Gibbs, R. A., Gifford, C. A., May, S., Miller, D. E., Rehm, H. L., Samocha, K. E., Sedlazeck, F. J., Vilain, E., O'Donnell-Luria, A., Posey, J. E., Chadwick, L. H., Bamshad, M. J., Montgomery, S. B. 2025; 647 (8089): 331-342

    Abstract

    Rare diseases are collectively common, affecting approximately 1 in 20 individuals worldwide. In recent years, rapid progress has been made in rare disease diagnostics due to advances in next-generation sequencing, development of new computational and functional genomics approaches to prioritize genes and variants and increased global sharing of clinical and genetic data. However, more than half of individuals suspected to have a rare disease lack a genetic diagnosis. The Genomics Research to Elucidate the Genetics of Rare Diseases (GREGoR) Consortium was initiated to study thousands of challenging rare disease cases and families and apply, standardize and evaluate emerging genomics technologies and analytics to accelerate their adoption in clinical practice. Furthermore, all data generated, currently representing over 7,500 individuals from over 3,000 families, are rapidly made available to researchers worldwide through the Analysis, Visualization and Informatics Lab-space (AnVIL) to catalyse global efforts to develop approaches for genetic diagnoses in rare diseases. Most of these families have undergone previous clinical genetic testing but remained unsolved, with most being exome-negative. Here we describe the collaborative research framework, datasets and discoveries comprising GREGoR that will provide foundational resources and substrates for the future of rare disease genomics.

    View details for DOI 10.1038/s41586-025-09613-8

    View details for PubMedID 41224980

    View details for PubMedCentralID 9119004

  • Deciphering the impact of genomic variation on function. Nature 2024; 633 (8028): 47-57

    Abstract

    Our genomes influence nearly every aspect of human biology, from molecular and cellular functions to phenotypes in health and disease. Studying the differences in DNA sequence between individuals (genomic variation) could reveal previously unknown mechanisms of human biology, uncover the basis of genetic predispositions to diseases, and guide the development of new diagnostic tools and therapeutic agents. Yet, understanding how genomic variation alters genome function to influence phenotype has proved challenging. To unlock these insights, we need a systematic and comprehensive catalogue of genome function and the molecular and cellular effects of genomic variants. Towards this goal, the Impact of Genomic Variation on Function (IGVF) Consortium will combine approaches in single-cell mapping, genomic perturbations and predictive modelling to investigate the relationships among genomic variation, genome function and phenotypes. IGVF will create maps across hundreds of cell types and states describing how coding variants alter protein activity, how noncoding variants change the regulation of gene expression, and how such effects connect through gene-regulatory and protein-interaction networks. These experimental data, computational predictions and accompanying standards and pipelines will be integrated into an open resource that will catalyse community efforts to explore how our genomes influence biology and disease across populations.

    View details for DOI 10.1038/s41586-024-07510-0

    View details for PubMedID 39232149

    View details for PubMedCentralID 7405896

  • Harnessing Artificial Intelligence for Risk Stratification in Acute Myeloid Leukemia (AML): Evaluating the Utility of Longitudinal Electronic Health Record (EHR) Data Via Graph Neural Networks Sinha, R., Schwede, M., Viggiano, B., Kuo, D., Henry, S., Wood, D., Mannis, G., Majeti, R., Chen, J., Zhang, T. Y. American Society of Hematology. 2023