Professional Education

  • Doctor of Philosophy, Chinese Academy Of Sciences (2017)

All Publications

  • TFAP2C- and p63-Dependent Networks Sequentially Rearrange Chromatin Landscapes to Drive Human Epidermal Lineage Commitment. Cell stem cell Li, L., Wang, Y., Torkelson, J. L., Shankar, G., Pattison, J. M., Zhen, H. H., Fang, F., Duren, Z., Xin, J., Gaddam, S., Melo, S. P., Piekos, S. N., Li, J., Liaw, E. J., Chen, L., Li, R., Wernig, M., Wong, W. H., Chang, H. Y., Oro, A. E. 2019


    Tissue development results from lineage-specific transcription factors (TFs) programming a dynamic chromatin landscape through progressive cell fate transitions. Here, we define epigenomic landscape during epidermal differentiation of human pluripotent stem cells (PSCs) and create inference networks that integrate gene expression, chromatin accessibility, and TF binding to define regulatory mechanisms during keratinocyte specification. We found two critical chromatin networks during surface ectoderm initiation and keratinocyte maturation, which are driven by TFAP2C and p63, respectively. Consistently, TFAP2C, but not p63, is sufficient to initiate surface ectoderm differentiation, and TFAP2C-initiated progenitor cells are capable of maturing into functional keratinocytes. Mechanistically, TFAP2C primes the surface ectoderm chromatin landscape and induces p63 expression and binding sites, thus allowing maturation factor p63 to positively autoregulate its own expression and close a subset of the TFAP2C-initiated surface ectoderm program. Our work provides a general framework to infer TF networks controlling chromatin transitions that will facilitate future regenerative medicine advances.

    View details for PubMedID 30686763

  • DC3 is a method for deconvolution and coupled clustering from bulk and single-cell genomics data. Nature communications Zeng, W., Chen, X., Duren, Z., Wang, Y., Jiang, R., Wong, W. H. 2019; 10 (1): 4613


    Characterizing and interpreting heterogeneous mixtures at the cellular level is a critical problem in genomics. Single-cell assays offer an opportunity to resolve cellular level heterogeneity, e.g., scRNA-seq enables single-cell expression profiling, and scATAC-seq identifies active regulatory elements. Furthermore, while scHi-C can measure the chromatin contacts (i.e., loops) between active regulatory elements to target genes in single cells, bulk HiChIP can measure such contacts in a higher resolution. In this work, we introduce DC3 (De-Convolution and Coupled-Clustering) as a method for the joint analysis of various bulk and single-cell data such as HiChIP, RNA-seq and ATAC-seq from the same heterogeneous cell population. DC3 can simultaneously identify distinct subpopulations, assign single cells to the subpopulations (i.e., clustering) and de-convolve the bulk data into subpopulation-specific data. The subpopulation-specific profiles of gene expression, chromatin accessibility and enhancer-promoter contact obtained by DC3 provide a comprehensive characterization of the gene regulatory system in each subpopulation.

    View details for DOI 10.1038/s41467-019-12547-1

    View details for PubMedID 31601804

  • Hierarchical graphical model reveals HFR1 bridging circadian rhythm and flower development in Arabidopsis thaliana. NPJ systems biology and applications Duren, Z., Wang, Y., Wang, J., Zhao, X., Lv, L., Li, X., Liu, J., Zhu, X., Chen, L., Wang, Y. 2019; 5: 28


    To study systems-level properties of the cell, it is necessary to go beyond individual regulators and target genes to study the regulatory network among transcription factors (TFs). However, it is difficult to directly dissect the TFs mediated genome-wide gene regulatory network (GRN) by experiment. Here, we proposed a hierarchical graphical model to estimate TF activity from mRNA expression by building TF complexes with protein cofactors and inferring TF's downstream regulatory network simultaneously. Then we applied our model on flower development and circadian rhythm processes in Arabidopsis thaliana. The computational results show that the sequence specific bHLH family TF HFR1 recruits the chromatin regulator HAC1 to flower development master regulator TF AG and further activates AG's expression by histone acetylation. Both independent data and experimental results supported this discovery. We also found a flower tissue specific H3K27ac ChIP-seq peak at AG gene body and a HFR1 motif in the center of this H3K27ac peak. Furthermore, we verified that HFR1 physically interacts with HAC1 by yeast two-hybrid experiment. This HFR1-HAC1-AG triplet relationship may imply that flower development and circadian rhythm are bridged by epigenetic regulation and enrich the classical ABC model in flower development. In addition, our TF activity network can serve as a general method to elucidate molecular mechanisms on other complex biological regulatory processes.

    View details for DOI 10.1038/s41540-019-0106-3

    View details for PubMedID 31428455

  • Integrative analysis of single-cell genomics data by coupled nonnegative matrix factorizations. Proceedings of the National Academy of Sciences of the United States of America Duren, Z., Chen, X., Zamanighomi, M., Zeng, W., Satpathy, A. T., Chang, H. Y., Wang, Y., Wong, W. H. 2018


    When different types of functional genomics data are generated on single cells from different samples of cells from the same heterogeneous population, the clustering of cells in the different samples should be coupled. We formulate this "coupled clustering" problem as an optimization problem and propose the method of coupled nonnegative matrix factorizations (coupled NMF) for its solution. The method is illustrated by the integrative analysis of single-cell RNA-sequencing (RNA-seq) and single-cell ATAC-sequencing (ATAC-seq) data.

    View details for PubMedID 29987051

  • Unsupervised clustering and epigenetic classification of single cells NATURE COMMUNICATIONS Zamanighomi, M., Lin, Z., Daley, T., Chen, X., Duren, Z., Schep, A., Greenleaf, W. J., Wong, W. 2018; 9: 2410


    Characterizing epigenetic heterogeneity at the cellular level is a critical problem in the modern genomics era. Assays such as single cell ATAC-seq (scATAC-seq) offer an opportunity to interrogate cellular level epigenetic heterogeneity through patterns of variability in open chromatin. However, these assays exhibit technical variability that complicates clear classification and cell type identification in heterogeneous populations. We present scABC, an R package for the unsupervised clustering of single-cell epigenetic data, to classify scATAC-seq data and discover regions of open chromatin specific to cell identity.

    View details for PubMedID 29925875

  • Modeling gene regulation from paired expression and chromatin accessibility data. Proceedings of the National Academy of Sciences of the United States of America Duren, Z., Chen, X., Jiang, R., Wang, Y., Wong, W. H. 2017


    The rapid increase of genome-wide datasets on gene expression, chromatin states, and transcription factor (TF) binding locations offers an exciting opportunity to interpret the information encoded in genomes and epigenomes. This task can be challenging as it requires joint modeling of context-specific activation of cis-regulatory elements (REs) and the effects on transcription of associated regulatory factors. To meet this challenge, we propose a statistical approach based on paired expression and chromatin accessibility (PECA) data across diverse cellular contexts. In our approach, we model (i) the localization to REs of chromatin regulators (CRs) based on their interaction with sequence-specific TFs, (ii) the activation of REs due to CRs that are localized to them, and (iii) the effect of TFs bound to activated REs on the transcription of target genes (TGs). The transcriptional regulatory network inferred by PECA provides a detailed view of how trans- and cis-regulatory elements work together to affect gene expression in a context-specific manner. We illustrate the feasibility of this approach by analyzing paired expression and accessibility data from the mouse Encyclopedia of DNA Elements (ENCODE) and explore various applications of the resulting model.

    View details for DOI 10.1073/pnas.1704553114

    View details for PubMedID 28576882