All Publications


  • Data integration and inference of gene regulation using single-cell temporal multimodal data with scTIE. Genome research Lin, Y., Wu, T. Y., Chen, X., Wan, S., Chao, B., Xin, J., Yang, J., Wong, W. H., Wang, Y. X. 2023

    Abstract

    Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal datasets, we demonstrate scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome dataset we generated from differentiating mouse embryonic stem cells over time, we demonstrate scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new avenues to understand the regulatory landscape driving developmental processes.

    View details for DOI 10.1101/gr.277960.123

    View details for PubMedID 38190633

  • scTIE: data integration and inference of gene regulation using single-cell temporal multimodal data. bioRxiv : the preprint server for biology Lin, Y., Wu, T. Y., Chen, X., Wan, S., Chao, B., Xin, J., Yang, J. Y., Wong, W. H., Wang, Y. X. 2023

    Abstract

    Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal datasets, we demonstrate scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome dataset we generated from differentiating mouse embryonic stem cells over time, we demonstrate scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.

    View details for DOI 10.1101/2023.05.18.541381

    View details for PubMedID 37292801

    View details for PubMedCentralID PMC10245711

  • Heritability enrichment in context-specific regulatory networks improves phenotype-relevant tissue identification. eLife Feng, Z., Duren, Z., Xin, J., Yuan, Q., He, Y., Su, B., Wong, W. H., Wang, Y. 2022; 11

    Abstract

    Systems genetics holds the promise to decipher complex traits by interpreting their associated SNPs through gene regulatory networks derived from comprehensive multi-omics data of cell types, tissues, and organs. Here, we propose SpecVar to integrate paired chromatin accessibility and gene expression data into context-specific regulatory network atlas and regulatory categories, conduct heritability enrichment analysis with GWAS summary statistics, identify relevant tissues, and depict common genetic factors acting in the shared regulatory networks between traits by relevance correlation. Our method improves power upon existing approaches by associating SNPs with context-specific regulatory elements to assess heritability enrichments and by explicitly prioritizing gene regulations underlying relevant tissues. Ablation studies, independent data validation, and comparison experiments with existing methods on GWAS of six phenotypes show that SpecVar can improve heritability enrichment, accurately detect relevant tissues, and reveal causal regulations. Furthermore, SpecVar correlates the relevance patterns for pairs of phenotypes and better reveals shared SNP associated regulations of phenotypes than existing methods. Studying GWAS of 206 phenotypes in UK-Biobank demonstrates that SpecVar leverages the context-specific regulatory network atlas to prioritize phenotypes' relevant tissues and shared heritability for biological and therapeutic insights. SpecVar provides a powerful way to interpret SNPs via context-specific regulatory networks and is available at https://github.com/AMSSwanglab/SpecVar.

    View details for DOI 10.7554/eLife.82535

    View details for PubMedID 36525361

  • Regulatory analysis of single cell multiome gene expression and chromatin accessibility data with scREG. Genome biology Duren, Z., Chang, F., Naqing, F., Xin, J., Liu, Q., Wong, W. H. 2022; 23 (1): 114

    Abstract

    Technological development has enabled the profiling of gene expression and chromatin accessibility from the same cell. We develop scREG, a dimension reduction methodology, based on the concept of cis-regulatory potential, for single cell multiome data. This concept is further used for the construction of subpopulation-specific cis-regulatory networks. The capability of inferring useful regulatory network is demonstrated by the two-fold increment on network inference accuracy compared to the Pearson correlation-based method and the 27-fold enrichment of GWAS variants for inflammatory bowel disease in the cis-regulatory elements. The R package scREG provides comprehensive functions for single cell multiome data analysis.

    View details for DOI 10.1186/s13059-022-02682-2

    View details for PubMedID 35578363

  • Sc-compReg enables the comparison of gene regulatory networks between conditions using single-cell data. Nature communications Duren, Z., Lu, W. S., Arthur, J. G., Shah, P., Xin, J., Meschi, F., Li, M. L., Nemec, C. M., Yin, Y., Wong, W. H. 2021; 12 (1): 4763

    Abstract

    The comparison of gene regulatory networks between diseased versus healthy individuals or between two different treatments is an important scientific problem. Here, we propose sc-compReg as a method for the comparative analysis of gene expression regulatory networks between two conditions using single cell gene expression (scRNA-seq) and single cell chromatin accessibility data (scATAC-seq). Our software, sc-compReg, can be used as a stand-alone package that provides joint clustering and embedding of the cells from both scRNA-seq and scATAC-seq, and the construction of differential regulatory networks across two conditions. We apply the method to compare the gene regulatory networks of an individual with chronic lymphocytic leukemia (CLL) versus a healthy control. The analysis reveals a tumor-specific B cell subpopulation in the CLL patient and identifies TOX2 as a potential regulator of this subpopulation.

    View details for DOI 10.1038/s41467-021-25089-2

    View details for PubMedID 34362918

  • Reusability report: Compressing regulatory networks to vectors for interpreting gene expression and genetic variants. Nature machine intelligence Zeng, W., Xin, J., Jiang, R., Wang, Y. 2021; 3 (7): 576-580

  • Time course regulatory analysis based on paired expression and chromatin accessibility data. Genome research Duren, Z., Chen, X., Xin, J., Wang, Y., Wong, W. 2020

    Abstract

    Time course experiment is a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose TimeReg (Time Course Regulatory Analysis) as a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility data from the time course. TimeReg can be used to prioritize regulatory elements, to extract core regulatory modules at each time point, to identify key regulators driving changes of the cellular state, and to causally connect the modules across different time points. We applied the method to analyze paired chromatin accessibility and gene expression data from retinoic acid (RA) induced mouse embryonic stem cells (mESC) differentiation experiment. The analysis identified 57,048 novel regulatory elements, regulating cerebellar development, synapse assembly and hindbrain morphogenesis, which substantially extended our knowledge of cis-regulatory elements during the differentiation. Using single cell RNA-seq data, we showed that the core regulatory modules can reflect the properties of different subpopulations of cells. Finally, the driver regulators are shown to be important in clarifying the relations between modules across adjacent time points. As a second example, our method on Ascl1 induced direct reprogramming from fibroblast to neuron time-course data identified Id1/2 as driver regulators of early stage of reprogramming.

    View details for DOI 10.1101/gr.257063.119

    View details for PubMedID 32188700

  • ZokorDB: tissue specific regulatory network annotation for non-coding elements of plateau zokor. Quantitative biology Xin, J., Hao, J., Chen, L., Zhang, T., Li, L., Chen, L., Zhao, W., Lu, X., Shi, P., Wang, Y. 2020; 8 (1): 43–50

  • Chromatin accessibility landscape and regulatory network of high-altitude hypoxia adaptation. Nature communications Xin, J., Zhang, H., He, Y., Duren, Z., Bai, C., Chen, L., Luo, X., Yan, D. S., Zhang, C., Zhu, X., Yuan, Q., Feng, Z., Cui, C., Qi, X., Ouzhuluobu, Wong, W. H., Wang, Y., Su, B. 2020; 11 (1): 4928

    Abstract

    High-altitude adaptation of Tibetans represents a remarkable case of natural selection during recent human evolution. Previous genome-wide scans found many non-coding variants under selection, suggesting a pressing need to understand the functional role of non-coding regulatory elements (REs). Here, we generate time courses of paired ATAC-seq and RNA-seq data on cultured HUVECs under hypoxic and normoxic conditions. We further develop a variant interpretation methodology (vPECA) to identify active selected REs (ASREs) and associated regulatory network. We discover three causal SNPs of EPAS1, the key adaptive gene for Tibetans. These SNPs decrease the accessibility of ASREs with weakened binding strength of relevant TFs, and cooperatively down-regulate EPAS1 expression. We further construct the downstream network of EPAS1, elucidating its roles in hypoxic response and angiogenesis. Collectively, we provide a systematic approach to interpret phenotype-associated noncoding variants in proper cell types and relevant dynamic conditions, to model their impact on gene regulation.

    View details for DOI 10.1038/s41467-020-18638-8

    View details for PubMedID 33004791

  • TFAP2C- and p63-Dependent Networks Sequentially Rearrange Chromatin Landscapes to Drive Human Epidermal Lineage Commitment. Cell stem cell Li, L., Wang, Y., Torkelson, J. L., Shankar, G., Pattison, J. M., Zhen, H. H., Fang, F., Duren, Z., Xin, J., Gaddam, S., Melo, S. P., Piekos, S. N., Li, J., Liaw, E. J., Chen, L., Li, R., Wernig, M., Wong, W. H., Chang, H. Y., Oro, A. E. 2019; 24 (2): 271-+

    Abstract

    Tissue development results from lineage-specific transcription factors (TFs) programming a dynamic chromatin landscape through progressive cell fate transitions. Here, we define epigenomic landscape during epidermal differentiation of human pluripotent stem cells (PSCs) and create inference networks that integrate gene expression, chromatin accessibility, and TF binding to define regulatory mechanisms during keratinocyte specification. We found two critical chromatin networks during surface ectoderm initiation and keratinocyte maturation, which are driven by TFAP2C and p63, respectively. Consistently, TFAP2C, but not p63, is sufficient to initiate surface ectoderm differentiation, and TFAP2C-initiated progenitor cells are capable of maturing into functional keratinocytes. Mechanistically, TFAP2C primes the surface ectoderm chromatin landscape and induces p63 expression and binding sites, thus allowing maturation factor p63 to positively autoregulate its own expression and close a subset of the TFAP2C-initiated surface ectoderm program. Our work provides a general framework to infer TF networks controlling chromatin transitions that will facilitate future regenerative medicine advances.

    View details for PubMedID 30686763