Ph.D., Yale University, Genome-wide mapping and functional analysis of copy number variation in the human genome (2012)
M.Sc., Yale University, Molecular, Cellular and Developmental Biology (2008)
S.B., Massachusetts Institute of Technology, Mathematics (2005)
S.B., Massachusetts Institute of Technology, Biology (2005)
Community and International Work
Rare Genomics Institute
Scientific Affairs, Strategic Alliances
Opportunities for Student Involvement
Current Research and Scholarly Interests
Using high-resolution methods to map copy number variants (CNVs) in human genomes and to establish CNV associations with certain phenotypes.
Mapping genetic variants in the human genome using next-generation sequencing technologies, and integrating those variants to deduce phenotypically relevant biological pathways.
Impacts of variation in the human genome on gene regulation.
Journal of Molecular Biology
2013; 425 (21): 3970-3977
Recent advances in fast and inexpensive DNA sequencing have enabled the extensive study of genomic and transcriptomic variation in humans. Human genomic variation is composed of sequence and structural changes including single-nucleotide and multinucleotide variants, short insertions or deletions (indels), larger copy number variants, and similarly sized copy neutral inversions and translocations. It is now well established that any two genomes differ extensively and that structural changes constitute the most prominent source of this variation. There have also been major technological advances in RNA sequencing to globally quantify and describe diversity in transcripts. Large consortia such as the 1000 Genomes Project and the Encyclopedia of DNA Elements Project are producing increasingly comprehensive maps outlining the regions of the human genome containing variants and functional elements, respectively. Integration of genetic variation data and extensive annotation of functional genomic elements, along with the ability to measure global transcription, allow the impacts of genetic variants on gene expression to be resolved. There are several well-established models by which genetic variants affect gene regulation depending on the type, nature, and position of the variant with respect to the affected genes. These effects can be manifested in two ways: changes to transcript sequences and isoforms by coding variants, and changes to transcript abundance by dosage or regulatory variants. Here, we review the current state of how genetic variations impact gene regulation locally and globally in the human genome.
View details for DOI 10.1016/j.jmb.2013.07.015
View details for PubMedID 23871684
Computational and Bioinformatics Frameworks for Next-Generation Whole Exome and Genome Sequencing
SCIENTIFIC WORLD JOURNAL
It has become increasingly apparent that one of the major hurdles in the genomic age will be the bioinformatics challenges of next-generation sequencing. We provide an overview of a general framework of bioinformatics analysis. For each of the three stages of (1) alignment, (2) variant calling, and (3) filtering and annotation, we describe the analysis required and survey the different software packages that are used. Furthermore, we discuss possible future developments as data sources grow and highlight opportunities for new bioinformatics tools to be developed.
View details for DOI 10.1155/2013/730210
View details for Web of Science ID 000314128300001
View details for PubMedID 23365548
Child Development and Structural Variation in the Human Genome
2013; 84 (1): 34-48
Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects of structural variation on normal child development, but such effects could be of considerable significance. This review provides an overview of the phenomenon of structural variation in the human genome sequence, describing the novel genomics technologies that are revolutionizing the way structural variation is studied and giving examples of genomic structural variations that affect child development.
View details for DOI 10.1111/cdev.12051
View details for Web of Science ID 000314112000003
View details for PubMedID 23311762
- Personalizing rare disease research: how genomics is revolutionizing the diagnosis and treatment of rare disease PERSONALIZED MEDICINE 2012; 9 (8): 805-819
Personal Omics Profiling Reveals Dynamic Molecular and Medical Phenotypes
2012; 148 (6): 1293-1307
Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14-month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes. It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions. Extremely high-coverage genomic and transcriptomic data, which provide the basis of our iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism. This study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.
View details for DOI 10.1016/j.cell.2012.02.009
View details for Web of Science ID 000301889500023
View details for PubMedID 22424236
- Detecting and annotating genetic variations using the HugeSeq pipeline NATURE BIOTECHNOLOGY 2012; 30 (3): 226-229
Genome-Wide Mapping of Copy Number Variation in Humans: Comparative Analysis of High Resolution Array Platforms
2011; 6 (11)
Accurate and efficient genome-wide detection of copy number variants (CNVs) is essential for understanding human genomic variation, genome-wide CNV association type studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing-based technologies. Numerous array-based platforms for CNV detection exist, utilizing array Comparative Genome Hybridization (aCGH), Single Nucleotide Polymorphism (SNP) genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV focused arrays. Our results are important for cost-effective CNV detection and validation for both basic and clinical applications.
View details for DOI 10.1371/journal.pone.0027859
View details for Web of Science ID 000298168100021
View details for PubMedID 22140474