All Publications

  • Integrated functional genomic analyses of Klinefelter and Turner syndromes reveal global network effects of altered X chromosome dosage. Proceedings of the National Academy of Sciences of the United States of America Zhang, X., Hong, D., Ma, S., Ward, T., Ho, M., Pattni, R., Duren, Z., Stankov, A., Bade Shrestha, S., Hallmayer, J., Wong, W. H., Reiss, A. L., Urban, A. E. 2020


    In both Turner syndrome (TS) and Klinefelter syndrome (KS), copy number aberrations of the X chromosome lead to various developmental symptoms. We report a comparative analysis of TS vs. KS regarding differences at the genomic network level measured in primary samples by analyzing gene expression, DNA methylation, and chromatin conformation. X-chromosome inactivation (XCI) silences transcription from one X chromosome in female mammals; most genes on that chromosome are inactive, but some genes escape from XCI. In TS, almost all differentially expressed escape genes are down-regulated but most differentially expressed inactive genes are up-regulated. In KS, differentially expressed escape genes are up-regulated while the majority of inactive genes appear unchanged. Interestingly, 94 differentially expressed genes (DEGs) overlapped between the TS vs. female and KS vs. male comparisons, and these almost uniformly display expression changes in opposite directions. DEGs on the X chromosome and the autosomes are coexpressed in both syndromes, indicating that there are molecular ripple effects of the changes in X chromosome dosage. Six potential candidate genes (RPS4X, SEPT6, NKRF, CXorf57, NAA10, and FLNA) for KS are identified on Xq, as well as candidate central genes on Xp for TS. Only promoters of inactive genes are differentially methylated in both syndromes while escape gene promoters remain unchanged. The intrachromosomal contact map of the X chromosome in TS exhibits the structure of an active X chromosome. The discovery of shared DEGs indicates the existence of common molecular mechanisms for gene regulation in TS and KS that transmit the gene dosage changes to the transcriptome.

    View details for DOI 10.1073/pnas.1910003117

    View details for PubMedID 32071206

  • Model-Based Approach to the Joint Analysis of Single-Cell Data on Chromatin Accessibility and Gene Expression STATISTICAL SCIENCE Lin, Z., Zamanighomi, M., Daley, T., Ma, S., Wong, W. 2020; 35 (1): 2–13

    View details for DOI 10.1214/19-STS714

    View details for Web of Science ID 000518465700002

  • Network Effects of the 15q13.3 Microdeletion on the Transcriptome and Epigenome in Human-Induced Neurons. Biological psychiatry Zhang, S., Zhang, X., Purmann, C., Ma, S., Shrestha, A., Davis, K. N., Ho, M., Huang, Y., Pattni, R., Wong, W. H., Bernstein, J. A., Hallmayer, J., Urban, A. E. 2020


    The 15q13.3 microdeletion is associated with several neuropsychiatric disorders, including autism and schizophrenia. Previous association and functional studies have investigated the potential role of several genes within the deletion in neuronal dysfunction, but the molecular effects of the deletion as a whole remain largely unknown. Induced pluripotent stem cells, from 3 patients with the 15q13.3 microdeletion and 3 control subjects, were generated and converted into induced neurons. We analyzed the effects of the 15q13.3 microdeletion on genome-wide gene expression, DNA methylation, chromatin accessibility, and sensitivity to cisplatin-induced DNA damage. Furthermore, we measured gene expression changes in induced neurons with CRISPR (clustered regularly interspaced short palindromic repeats) knockouts of individual 15q13.3 microdeletion genes. In both induced pluripotent stem cells and induced neurons, gene copy number change within the 15q13.3 microdeletion was accompanied by significantly decreased gene expression and no compensatory changes in DNA methylation or chromatin accessibility, supporting the model that haploinsufficiency of genes within the deleted region drives the disorder. Furthermore, we observed global effects of the microdeletion on the transcriptome and epigenome, with disruptions in several neuropsychiatric disorder-associated pathways and gene families, including Wnt signaling, ribosome function, DNA binding, and clustered protocadherins. Individual gene knockouts mirrored many of the observed changes in an overlapping fashion between knockouts. Our multiomics analysis of the 15q13.3 microdeletion revealed downstream effects in pathways previously associated with neuropsychiatric disorders and indications of interactions between genes within the deletion. This molecular systems analysis can be applied to other chromosomal aberrations to further our etiological understanding of neuropsychiatric disorders.

    View details for DOI 10.1016/j.biopsych.2020.06.021

    View details for PubMedID 32919612

  • Constructing tissue-specific transcriptional regulatory networks via a Markov random field. BMC genomics Ma, S., Jiang, T., Jiang, R. 2018; 19 (Suppl 10): 884


    BACKGROUND: Recent advances in sequencing technologies have enabled parallel assays of chromatin accessibility and gene expression for major human cell lines. Such innovation provides a great opportunity to decode phenotypic consequences of genetic variation via the construction of predictive gene regulatory network models. However, a computational method that systematically integrates chromatin accessibility information with gene expression data to recover complicated regulatory relationships between genes in a tissue-specific manner is still lacking. RESULTS: We propose a Markov random field (MRF) model for constructing tissue-specific transcriptional regulatory networks via integrative analysis of DNase-seq and RNA-seq data. Our method, named CSNets (cell-line specific regulatory networks), first infers regulatory networks for individual cell lines using chromatin accessibility information, and then fine-tunes these networks using the MRF based on pairwise similarity between cell lines derived from gene expression data. Using this method, we constructed regulatory networks specific to 110 human cell lines and 13 major tissues with the use of ENCODE data. We demonstrated the high quality of these networks via comprehensive statistical analysis based on ChIP-seq profiles, functional annotations, taxonomic analysis, and literature surveys. We further applied these networks to analyze GWAS data of Crohn's disease and prostate cancer. Results were either consistent with the literature or provided biological insights into regulatory mechanisms of these two complex diseases. The website of CSNets is freely available at . CONCLUSIONS: CSNets demonstrated the power of joint analysis of epigenomic and transcriptomic data towards the accurate construction of gene regulatory networks. Our work provides not only a useful resource of regulatory networks to the community, but also valuable experience in methodology development for multi-omics data integration.

    View details for PubMedID 30598101

  • FreePSI: an alignment-free approach to estimating exon-inclusion ratios without a reference transcriptome NUCLEIC ACIDS RESEARCH Zhou, J., Ma, S., Wang, D., Zeng, J., Jiang, T. 2018; 46 (2): e11


    Alternative splicing plays an important role in many cellular processes of eukaryotic organisms. The exon-inclusion ratio, also known as percent spliced in, is often regarded as one of the most effective measures of alternative splicing events. The existing methods for estimating exon-inclusion ratios at the genome scale all require the existence of a reference transcriptome. In this paper, we propose an alignment-free method, FreePSI, to perform genome-wide estimation of exon-inclusion ratios from RNA-Seq data without relying on the guidance of a reference transcriptome. It uses a novel probabilistic generative model based on k-mer profiles to quantify the exon-inclusion ratios at the genome scale and an efficient expectation-maximization algorithm based on a divide-and-conquer strategy and ultrafast conjugate gradient projection descent method to solve the model. We compare FreePSI with the existing methods on simulated and real RNA-seq data in terms of both accuracy and efficiency and show that it is able to achieve very good performance even though a reference transcriptome is not provided. Our results suggest that FreePSI may have important applications in performing alternative splicing analysis for organisms that do not have quality reference transcriptomes. FreePSI is implemented in C++ and freely available to the public on GitHub.

    View details for PubMedID 29136203

    View details for PubMedCentralID PMC5778508

  • Simultaneous inference of phenotype-associated genes and relevant tissues from GWAS data via Bayesian integration of multiple tissue-specific gene networks JOURNAL OF MOLECULAR CELL BIOLOGY Wu, M., Lin, Z., Ma, S., Chen, T., Jiang, R., Wong, W. 2017; 9 (6): 436–52


    Although genome-wide association studies (GWAS) have successfully identified thousands of genomic loci associated with hundreds of complex traits in the past decade, the debate about such problems as missing heritability and weak interpretability has called for effective computational methods to facilitate the advanced analysis of the vast volume of existing and anticipated genetic data. Towards this goal, gene-level integrative GWAS analysis, which assumes that genes associated with a phenotype tend to be enriched in biological gene sets or gene networks, has recently attracted much attention, owing to such advantages as straightforward interpretation, a lower multiple-testing burden, and robustness across studies. However, existing methods in this category usually exploit non-tissue-specific gene networks and thus lack the ability to utilize informative tissue-specific characteristics. To overcome this limitation, we proposed a Bayesian approach called SIGNET (Simultaneously Inference of GeNEs and Tissues) to integrate GWAS data and multiple tissue-specific gene networks for the simultaneous inference of phenotype-associated genes and relevant tissues. Through extensive simulation studies, we showed the effectiveness of our method in finding both associated genes and relevant tissues for a phenotype. In applications to real GWAS data of 14 complex phenotypes, we demonstrated the power of our method in both deciphering the genetic basis of a phenotype and discovering biological insights into it. With this understanding, we expect SIGNET to be a valuable tool for integrative GWAS analysis, thereby boosting the prevention, diagnosis, and treatment of human inherited diseases and eventually facilitating precision medicine.

    View details for PubMedID 29300920

  • Differential regulation enrichment analysis via the integration of transcriptional regulatory network and gene expression data. Bioinformatics Ma, S., Jiang, T., Jiang, R. 2015; 31 (4): 563–571


    Although many gene set analysis methods have been proposed to explore associations between a phenotype and a group of genes sharing common biological functions or involved in the same biological process, the underlying biological mechanisms of identified gene sets are typically unexplained. We propose a method called Differential Regulation-based enrichment Analysis for GENe sets (DRAGEN) to identify gene sets in which a significant proportion of genes have their transcriptional regulatory patterns changed in a perturbed phenotype. We conduct comprehensive simulation studies to demonstrate the capability of our method in identifying differentially regulated gene sets. We further apply our method to three human microarray expression datasets, two with hormone treated and control samples and one concerning different cell cycle phases. Results indicate that the capability of DRAGEN in identifying phenotype-associated gene sets is significantly superior to those of four existing methods for analyzing differentially expressed gene sets. We conclude that the proposed differential regulation enrichment analysis method, though exploratory in nature, complements the existing gene set analysis methods and provides a promising new direction for the interpretation of gene expression data. The program of DRAGEN is freely available at or jiang@cs.ucr.edu. Supplementary data are available at Bioinformatics online.

    View details for DOI 10.1093/bioinformatics/btu672

    View details for PubMedID 25322838