Basic Life Science Research Associate, Statistics
Regulatory analysis of single cell multiome gene expression and chromatin accessibility data with scREG.
2022; 23 (1): 114
Technological development has enabled the profiling of gene expression and chromatin accessibility from the same cell. We develop scREG, a dimension reduction methodology, based on the concept of cis-regulatory potential, for single cell multiome data. This concept is further used for the construction of subpopulation-specific cis-regulatory networks. Its ability to infer useful regulatory networks is demonstrated by a two-fold increase in network inference accuracy over the Pearson correlation-based method and a 27-fold enrichment of GWAS variants for inflammatory bowel disease in the cis-regulatory elements. The R package scREG provides comprehensive functions for single cell multiome data analysis.
View details for DOI 10.1186/s13059-022-02682-2
View details for PubMedID 35578363
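The scREG abstract above benchmarks against a Pearson correlation-based method. As a rough illustration of what such a baseline might look like (this is not scREG itself, and not the authors' benchmark code), the sketch below scores every peak-gene pair by the Pearson correlation of accessibility and expression across cells. The function name `pearson_peak_gene` and the cells-by-features matrix layout are assumptions made for this example.

```python
import numpy as np


def pearson_peak_gene(accessibility, expression):
    """Pearson correlation between every peak and every gene across cells.

    accessibility: array of shape (n_cells, n_peaks)  (hypothetical layout)
    expression:    array of shape (n_cells, n_genes)  (hypothetical layout)
    Returns an (n_peaks, n_genes) array of correlation coefficients.
    Peaks or genes with zero variance get a correlation of 0.
    """
    acc = np.asarray(accessibility, dtype=float)
    expr = np.asarray(expression, dtype=float)
    if acc.shape[0] != expr.shape[0]:
        raise ValueError("both matrices must have one row per cell")

    # Center each peak and each gene across cells.
    acc = acc - acc.mean(axis=0)
    expr = expr - expr.mean(axis=0)

    # Unnormalized covariance and the product of column norms.
    cov = acc.T @ expr
    denom = np.outer(np.linalg.norm(acc, axis=0), np.linalg.norm(expr, axis=0))

    # Divide only where the denominator is non-zero; leave 0 elsewhere.
    corr = np.zeros_like(cov)
    np.divide(cov, denom, out=corr, where=denom > 0)
    return corr
```

In practice such scores would typically be restricted to peaks near each gene before ranking candidate cis-regulatory links; that filtering step is omitted here.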
Sc-compReg enables the comparison of gene regulatory networks between conditions using single-cell data.
2021; 12 (1): 4763
The comparison of gene regulatory networks between diseased versus healthy individuals or between two different treatments is an important scientific problem. Here, we propose sc-compReg as a method for the comparative analysis of gene expression regulatory networks between two conditions using single cell gene expression (scRNA-seq) and single cell chromatin accessibility data (scATAC-seq). Our software, sc-compReg, can be used as a stand-alone package that provides joint clustering and embedding of the cells from both scRNA-seq and scATAC-seq, and the construction of differential regulatory networks across two conditions. We apply the method to compare the gene regulatory networks of an individual with chronic lymphocytic leukemia (CLL) versus a healthy control. The analysis reveals a tumor-specific B cell subpopulation in the CLL patient and identifies TOX2 as a potential regulator of this subpopulation.
View details for DOI 10.1038/s41467-021-25089-2
View details for PubMedID 34362918
- Reusability report: Compressing regulatory networks to vectors for interpreting gene expression and genetic variants NATURE MACHINE INTELLIGENCE 2021; 3 (7): 576-580
Time course regulatory analysis based on paired expression and chromatin accessibility data.
Time course experiments are a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose TimeReg (Time Course Regulatory Analysis) as a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility data from a time course. TimeReg can be used to prioritize regulatory elements, to extract core regulatory modules at each time point, to identify key regulators driving changes of the cellular state, and to causally connect the modules across different time points. We applied the method to paired chromatin accessibility and gene expression data from a retinoic acid (RA)-induced mouse embryonic stem cell (mESC) differentiation experiment. The analysis identified 57,048 novel regulatory elements regulating cerebellar development, synapse assembly, and hindbrain morphogenesis, which substantially extended our knowledge of cis-regulatory elements during the differentiation. Using single cell RNA-seq data, we showed that the core regulatory modules can reflect the properties of different subpopulations of cells. Finally, the driver regulators are shown to be important in clarifying the relations between modules across adjacent time points. As a second example, applying our method to time-course data from Ascl1-induced direct reprogramming of fibroblasts to neurons identified Id1/2 as driver regulators of the early stage of reprogramming.
View details for DOI 10.1101/gr.257063.119
View details for PubMedID 32188700
- ZokorDB: tissue specific regulatory network annotation for non-coding elements of plateau zokor QUANTITATIVE BIOLOGY 2020; 8 (1): 43–50
Chromatin accessibility landscape and regulatory network of high-altitude hypoxia adaptation.
2020; 11 (1): 4928
High-altitude adaptation of Tibetans represents a remarkable case of natural selection during recent human evolution. Previous genome-wide scans found many non-coding variants under selection, suggesting a pressing need to understand the functional role of non-coding regulatory elements (REs). Here, we generate time courses of paired ATAC-seq and RNA-seq data on cultured HUVECs under hypoxic and normoxic conditions. We further develop a variant interpretation methodology (vPECA) to identify active selected REs (ASREs) and the associated regulatory network. We discover three causal SNPs of EPAS1, the key adaptive gene for Tibetans. These SNPs decrease the accessibility of ASREs with weakened binding strength of relevant TFs, and cooperatively down-regulate EPAS1 expression. We further construct the downstream network of EPAS1, elucidating its roles in hypoxic response and angiogenesis. Collectively, we provide a systematic approach to interpret phenotype-associated noncoding variants in proper cell types and relevant dynamic conditions, and to model their impact on gene regulation.
View details for DOI 10.1038/s41467-020-18638-8
View details for PubMedID 33004791
TFAP2C- and p63-Dependent Networks Sequentially Rearrange Chromatin Landscapes to Drive Human Epidermal Lineage Commitment.
Cell Stem Cell 2019; 24 (2): 271-+
Tissue development results from lineage-specific transcription factors (TFs) programming a dynamic chromatin landscape through progressive cell fate transitions. Here, we define the epigenomic landscape during epidermal differentiation of human pluripotent stem cells (PSCs) and create inference networks that integrate gene expression, chromatin accessibility, and TF binding to define regulatory mechanisms during keratinocyte specification. We found two critical chromatin networks during surface ectoderm initiation and keratinocyte maturation, which are driven by TFAP2C and p63, respectively. Consistently, TFAP2C, but not p63, is sufficient to initiate surface ectoderm differentiation, and TFAP2C-initiated progenitor cells are capable of maturing into functional keratinocytes. Mechanistically, TFAP2C primes the surface ectoderm chromatin landscape and induces p63 expression and binding sites, thus allowing maturation factor p63 to positively autoregulate its own expression and close a subset of the TFAP2C-initiated surface ectoderm program. Our work provides a general framework to infer TF networks controlling chromatin transitions that will facilitate future regenerative medicine advances.
View details for PubMedID 30686763