Professional Education

  • Doctor of Philosophy, Chinese Academy of Sciences (2017)

All Publications

  • Time course regulatory analysis based on paired expression and chromatin accessibility data. Genome research Duren, Z., Chen, X., Xin, J., Wang, Y., Wong, W. 2020


    The time course experiment is a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose TimeReg (Time Course Regulatory Analysis) as a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility data from the time course. TimeReg can be used to prioritize regulatory elements, to extract core regulatory modules at each time point, to identify key regulators driving changes of the cellular state, and to causally connect the modules across different time points. We applied the method to analyze paired chromatin accessibility and gene expression data from a retinoic acid (RA)-induced mouse embryonic stem cell (mESC) differentiation experiment. The analysis identified 57,048 novel regulatory elements regulating cerebellar development, synapse assembly, and hindbrain morphogenesis, which substantially extended our knowledge of cis-regulatory elements during the differentiation. Using single cell RNA-seq data, we showed that the core regulatory modules can reflect the properties of different subpopulations of cells. Finally, the driver regulators are shown to be important in clarifying the relations between modules across adjacent time points. As a second example, applying our method to time-course data from Ascl1-induced direct reprogramming of fibroblasts to neurons identified Id1/2 as driver regulators of the early stage of reprogramming.

    View details for DOI 10.1101/gr.257063.119

    View details for PubMedID 32188700

  • Integrated functional genomic analyses of Klinefelter and Turner syndromes reveal global network effects of altered X chromosome dosage. Proceedings of the National Academy of Sciences of the United States of America Zhang, X., Hong, D., Ma, S., Ward, T., Ho, M., Pattni, R., Duren, Z., Stankov, A., Bade Shrestha, S., Hallmayer, J., Wong, W. H., Reiss, A. L., Urban, A. E. 2020


    In both Turner syndrome (TS) and Klinefelter syndrome (KS), copy number aberrations of the X chromosome lead to various developmental symptoms. We report a comparative analysis of TS vs. KS regarding differences at the genomic network level measured in primary samples by analyzing gene expression, DNA methylation, and chromatin conformation. X-chromosome inactivation (XCI) silences transcription from one X chromosome in female mammals, on which most genes are inactive, and some genes escape from XCI. In TS, almost all differentially expressed escape genes are down-regulated but most differentially expressed inactive genes are up-regulated. In KS, differentially expressed escape genes are up-regulated while the majority of inactive genes appear unchanged. Interestingly, 94 differentially expressed genes (DEGs) overlapped between the TS vs. female and KS vs. male comparisons, and these almost uniformly display expression changes in opposite directions. DEGs on the X chromosome and the autosomes are coexpressed in both syndromes, indicating that there are molecular ripple effects of the changes in X chromosome dosage. Six potential candidate genes (RPS4X, SEPT6, NKRF, CXorf57, NAA10, and FLNA) for KS are identified on Xq, as well as candidate central genes on Xp for TS. Only promoters of inactive genes are differentially methylated in both syndromes while escape gene promoters remain unchanged. The intrachromosomal contact map of the X chromosome in TS exhibits the structure of an active X chromosome. The discovery of shared DEGs indicates the existence of common molecular mechanisms for gene regulation in TS and KS that transmit the gene dosage changes to the transcriptome.

    View details for DOI 10.1073/pnas.1910003117

    View details for PubMedID 32071206

  • TFAP2C- and p63-Dependent Networks Sequentially Rearrange Chromatin Landscapes to Drive Human Epidermal Lineage Commitment. Cell stem cell Li, L., Wang, Y., Torkelson, J. L., Shankar, G., Pattison, J. M., Zhen, H. H., Fang, F., Duren, Z., Xin, J., Gaddam, S., Melo, S. P., Piekos, S. N., Li, J., Liaw, E. J., Chen, L., Li, R., Wernig, M., Wong, W. H., Chang, H. Y., Oro, A. E. 2019


    Tissue development results from lineage-specific transcription factors (TFs) programming a dynamic chromatin landscape through progressive cell fate transitions. Here, we define the epigenomic landscape during epidermal differentiation of human pluripotent stem cells (PSCs) and create inference networks that integrate gene expression, chromatin accessibility, and TF binding to define regulatory mechanisms during keratinocyte specification. We found two critical chromatin networks during surface ectoderm initiation and keratinocyte maturation, which are driven by TFAP2C and p63, respectively. Consistently, TFAP2C, but not p63, is sufficient to initiate surface ectoderm differentiation, and TFAP2C-initiated progenitor cells are capable of maturing into functional keratinocytes. Mechanistically, TFAP2C primes the surface ectoderm chromatin landscape and induces p63 expression and binding sites, thus allowing maturation factor p63 to positively autoregulate its own expression and close a subset of the TFAP2C-initiated surface ectoderm program. Our work provides a general framework to infer TF networks controlling chromatin transitions that will facilitate future regenerative medicine advances.

    View details for PubMedID 30686763

  • DC3 is a method for deconvolution and coupled clustering from bulk and single-cell genomics data. Nature communications Zeng, W., Chen, X., Duren, Z., Wang, Y., Jiang, R., Wong, W. H. 2019; 10 (1): 4613


    Characterizing and interpreting heterogeneous mixtures at the cellular level is a critical problem in genomics. Single-cell assays offer an opportunity to resolve cellular level heterogeneity, e.g., scRNA-seq enables single-cell expression profiling, and scATAC-seq identifies active regulatory elements. Furthermore, while scHi-C can measure the chromatin contacts (i.e., loops) between active regulatory elements and target genes in single cells, bulk HiChIP can measure such contacts at a higher resolution. In this work, we introduce DC3 (De-Convolution and Coupled-Clustering) as a method for the joint analysis of various bulk and single-cell data such as HiChIP, RNA-seq and ATAC-seq from the same heterogeneous cell population. DC3 can simultaneously identify distinct subpopulations, assign single cells to the subpopulations (i.e., clustering) and de-convolve the bulk data into subpopulation-specific data. The subpopulation-specific profiles of gene expression, chromatin accessibility and enhancer-promoter contact obtained by DC3 provide a comprehensive characterization of the gene regulatory system in each subpopulation.

    View details for DOI 10.1038/s41467-019-12547-1

    View details for PubMedID 31601804

  • Hierarchical graphical model reveals HFR1 bridging circadian rhythm and flower development in Arabidopsis thaliana. NPJ systems biology and applications Duren, Z., Wang, Y., Wang, J., Zhao, X., Lv, L., Li, X., Liu, J., Zhu, X., Chen, L., Wang, Y. 2019; 5: 28


    To study systems-level properties of the cell, it is necessary to go beyond individual regulators and target genes to study the regulatory network among transcription factors (TFs). However, it is difficult to directly dissect the TF-mediated genome-wide gene regulatory network (GRN) by experiment. Here, we proposed a hierarchical graphical model to estimate TF activity from mRNA expression by building TF complexes with protein cofactors and inferring each TF's downstream regulatory network simultaneously. Then we applied our model to flower development and circadian rhythm processes in Arabidopsis thaliana. The computational results show that the sequence-specific bHLH family TF HFR1 recruits the chromatin regulator HAC1 to the flower development master regulator TF AG and further activates AG's expression by histone acetylation. Both independent data and experimental results supported this discovery. We also found a flower tissue-specific H3K27ac ChIP-seq peak at the AG gene body and an HFR1 motif in the center of this H3K27ac peak. Furthermore, we verified that HFR1 physically interacts with HAC1 by a yeast two-hybrid experiment. This HFR1-HAC1-AG triplet relationship may imply that flower development and circadian rhythm are bridged by epigenetic regulation and may enrich the classical ABC model in flower development. In addition, our TF activity network can serve as a general method to elucidate molecular mechanisms of other complex biological regulatory processes.

    View details for DOI 10.1038/s41540-019-0106-3

    View details for PubMedID 31428455

  • Integrative analysis of single-cell genomics data by coupled nonnegative matrix factorizations. Proceedings of the National Academy of Sciences of the United States of America Duren, Z., Chen, X., Zamanighomi, M., Zeng, W., Satpathy, A. T., Chang, H. Y., Wang, Y., Wong, W. H. 2018


    When different types of functional genomics data are generated on single cells from different samples of cells from the same heterogeneous population, the clustering of cells in the different samples should be coupled. We formulate this "coupled clustering" problem as an optimization problem and propose the method of coupled nonnegative matrix factorizations (coupled NMF) for its solution. The method is illustrated by the integrative analysis of single-cell RNA-sequencing (RNA-seq) and single-cell ATAC-sequencing (ATAC-seq) data.

    View details for PubMedID 29987051

  • Unsupervised clustering and epigenetic classification of single cells. Nature communications Zamanighomi, M., Lin, Z., Daley, T., Chen, X., Duren, Z., Schep, A., Greenleaf, W. J., Wong, W. 2018; 9: 2410


    Characterizing epigenetic heterogeneity at the cellular level is a critical problem in the modern genomics era. Assays such as single cell ATAC-seq (scATAC-seq) offer an opportunity to interrogate cellular level epigenetic heterogeneity through patterns of variability in open chromatin. However, these assays exhibit technical variability that complicates clear classification and cell type identification in heterogeneous populations. We present scABC, an R package for the unsupervised clustering of single-cell epigenetic data, to classify scATAC-seq data and discover regions of open chromatin specific to cell identity.

    View details for PubMedID 29925875

  • Modeling gene regulation from paired expression and chromatin accessibility data. Proceedings of the National Academy of Sciences of the United States of America Duren, Z., Chen, X., Jiang, R., Wang, Y., Wong, W. H. 2017


    The rapid increase of genome-wide datasets on gene expression, chromatin states, and transcription factor (TF) binding locations offers an exciting opportunity to interpret the information encoded in genomes and epigenomes. This task can be challenging as it requires joint modeling of context-specific activation of cis-regulatory elements (REs) and the effects on transcription of associated regulatory factors. To meet this challenge, we propose a statistical approach based on paired expression and chromatin accessibility (PECA) data across diverse cellular contexts. In our approach, we model (i) the localization to REs of chromatin regulators (CRs) based on their interaction with sequence-specific TFs, (ii) the activation of REs due to CRs that are localized to them, and (iii) the effect of TFs bound to activated REs on the transcription of target genes (TGs). The transcriptional regulatory network inferred by PECA provides a detailed view of how trans- and cis-regulatory elements work together to affect gene expression in a context-specific manner. We illustrate the feasibility of this approach by analyzing paired expression and accessibility data from the mouse Encyclopedia of DNA Elements (ENCODE) and explore various applications of the resulting model.

    View details for DOI 10.1073/pnas.1704553114

    View details for PubMedID 28576882