Whole-genome risk prediction of common diseases in human preimplantation embryos
2022; 28 (3): 513-+
Preimplantation genetic testing (PGT) of in-vitro-fertilized embryos has been proposed as a method to reduce transmission of common disease; however, more comprehensive embryo genetic assessment, combining the effects of common variants and rare variants, remains unavailable. Here, we used a combination of molecular and statistical techniques to reliably infer inherited genome sequence in 110 embryos and model susceptibility across 12 common conditions. We observed a genotype accuracy of 99.0-99.4% at sites relevant to polygenic risk scoring in cases from day-5 embryo biopsies and 97.2-99.1% in cases from day-3 embryo biopsies. Combining rare variants with polygenic risk score (PRS) magnifies predicted differences across sibling embryos. For example, in a couple with a pathogenic BRCA1 variant, we predicted a 15-fold difference in odds ratio (OR) across siblings when combining versus a 4.5-fold or 3-fold difference with BRCA1 or PRS alone. Our findings may inform the discussion of utility and implementation of genome-based PGT in clinical practice.
View details for DOI 10.1038/s41591-022-01735-0
View details for Web of Science ID 000771673100027
View details for PubMedID 35314819
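As a rough illustration of how a rare-variant effect and a polygenic risk score might be combined, the sketch below adds the two effects on the log-odds scale, which is the same as multiplying the odds ratios. This is an assumed, simplified model, not the method used in the study; the function names and inputs are hypothetical. Applied to the 4.5-fold and 3-fold figures quoted above, this simple product gives 13.5 rather than the reported 15-fold, so it should be read only as an illustration of why combining the two magnifies differences between siblings.

```python
import math

def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2); weights are per-variant log odds ratios."""
    return sum(d * w for d, w in zip(dosages, weights))

def combined_odds_ratio(rare_variant_or, prs_or):
    """Combine two odds ratios by adding their logs (assumes independent, multiplicative effects)."""
    return math.exp(math.log(rare_variant_or) + math.log(prs_or))

# Hypothetical use with the fold differences quoted in the abstract.
fold = combined_odds_ratio(4.5, 3.0)
```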
Extracutaneous manifestations in phacomatosis cesioflammea and cesiomarmorata: Case series and literature review
AMERICAN JOURNAL OF MEDICAL GENETICS PART A
2019; 179 (6): 966-77
Phacomatosis pigmentovascularis (PPV) comprises a family of rare conditions that feature vascular abnormalities and melanocytic lesions that can be solely cutaneous or multisystem in nature. Recently published work has demonstrated that both vascular and melanocytic abnormalities in PPV of the cesioflammea and cesiomarmorata subtypes can result from identical somatic mosaic activating mutations in the genes GNAQ and GNA11. Here, we present three new cases of PPV with features of the cesioflammea and/or cesiomarmorata subtypes and mosaic mutations in GNAQ or GNA11. To better understand the risk of potentially occult complications faced by such patients we additionally reviewed 176 cases published in the literature. We report the frequency of clinical findings, their patterns of co-occurrence as well as published recommendations for surveillance after diagnosis. Features assessed include: capillary malformation; dermal and ocular melanocytosis; glaucoma; limb asymmetry; venous malformations; and central nervous system (CNS) anomalies, such as ventriculomegaly and calcifications. We found that ocular findings are common in patients with phacomatosis cesioflammea and cesiomarmorata. Facial vascular involvement correlates with a higher risk of seizures (p=.0066). Our genetic results confirm the role of mosaic somatic mutations in GNAQ and GNA11 in phacomatosis cesioflammea and cesiomarmorata. Their clinical and molecular findings place these conditions on a clinical spectrum encompassing other GNAQ and GNA11 related disorders and inform recommendations for their management.
View details for PubMedID 30920161
The Rhododendron genome and chromosomal organization provide insight into shared whole genome duplications across the heath family (Ericaceae).
Genome biology and evolution
The genus Rhododendron (Ericaceae), which includes horticulturally important plants such as azaleas, is a highly diverse and widely distributed genus of more than 1,000 species. Here, we report the chromosome-scale de novo assembly and genome annotation of Rhododendron williamsianum as a basis for continued study of this large genus. We created multiple short fragment genomic libraries, which were assembled using ALLPATHS-LG. This was followed by contiguity preserving transposase sequencing (CPT-seq) and fragScaff scaffolding of a large fragment library, which improved the assembly by decreasing the number of scaffolds and increasing scaffold length. Chromosome-scale scaffolding was performed by proximity-guided assembly (LACHESIS) using chromatin conformation capture (Hi-C) data. Chromosome-scale scaffolding was further refined and linkage groups defined by restriction-site associated DNA (RAD) sequencing of the parents and progeny of a genetic cross. The resulting linkage map confirmed the LACHESIS clustering and ordering of scaffolds onto chromosomes and rectified large-scale inversions. Assessments of the R. williamsianum genome assembly and gene annotation estimate them to be 89 and 79% complete, respectively. Predicted coding sequences from genome annotation were used in syntenic analyses and for generating age distributions of synonymous substitutions/site between paralogous gene pairs, which identified whole genome duplications (WGDs) in R. williamsianum. We then analyzed other publicly available Ericaceae genomes for shared WGDs. Based on our spatial and temporal analyses of paralogous gene pairs, we find evidence for two shared, ancient WGDs in Rhododendron and Vaccinium (cranberry/blueberry) members that predate the Ericaceae family and, in one case, the Ericales order.
View details for DOI 10.1093/gbe/evz245
View details for PubMedID 31702783
Substantial interindividual and limited intraindividual genomic diversity among tumors from men with metastatic prostate cancer
2016; 22 (4): 369-?
Tumor heterogeneity may reduce the efficacy of molecularly guided systemic therapy for cancers that have metastasized. To determine whether the genomic alterations in a single metastasis provide a reasonable assessment of the major oncogenic drivers of other dispersed metastases in an individual, we analyzed multiple tumors from men with disseminated prostate cancer through whole-exome sequencing, array comparative genomic hybridization (CGH) and RNA transcript profiling, and we compared the genomic diversity within and between individuals. In contrast to the substantial heterogeneity between men, there was limited diversity among metastases within an individual. The number of somatic mutations, the burden of genomic copy number alterations and aberrations in known oncogenic drivers were all highly concordant, as were metrics of androgen receptor (AR) activity and cell cycle activity. AR activity was inversely associated with cell proliferation, whereas the expression of Fanconi anemia (FA)-complex genes was correlated with elevated cell cycle progression, expression of the E2F transcription factor 1 (E2F1) and loss of retinoblastoma 1 (RB1). Men with somatic aberrations in FA-complex genes or in ATM serine/threonine kinase (ATM) exhibited significantly longer treatment-response durations to carboplatin than did men without defects in genes encoding DNA-repair proteins. Collectively, these data indicate that although exceptions exist, evaluating a single metastasis provides a reasonable assessment of the major oncogenic driver alterations that are present in disseminated tumors within an individual, and thus may be useful for selecting treatments on the basis of predicted molecular vulnerabilities.
View details for DOI 10.1038/nm.4053
View details for Web of Science ID 000373457700012
View details for PubMedID 26928463
View details for PubMedCentralID PMC5045679
Recurrent Somatic Loss of TNFRSF14 in Classical Hodgkin Lymphoma
GENES CHROMOSOMES & CANCER
2016; 55 (3): 278-287
Investigation of the genetic lesions underlying classical Hodgkin lymphoma (CHL) has been challenging due to the rarity of Hodgkin and Reed-Sternberg (HRS) cells, the pathognomonic neoplastic cells of CHL. In an effort to catalog more comprehensively recurrent copy number alterations occurring during oncogenesis, we investigated somatic alterations involved in CHL using whole-genome sequencing-mediated copy number analysis of purified HRS cells. We performed low-coverage sequencing of small numbers of intact HRS cells and paired non-neoplastic B lymphocytes isolated by flow cytometric cell sorting from 19 primary cases, as well as two commonly used HRS-derived cell lines (KM-H2 and L1236). We found that HRS cells contain strikingly fewer copy number abnormalities than CHL cell lines. A subset of cases displayed nonintegral chromosomal copy number states, suggesting internal heterogeneity within the HRS cell population. Recurrent somatic copy number alterations involving known factors in CHL pathogenesis were identified (REL, the PD-1 pathway, and TNFAIP3). In eight cases (42%) we observed recurrent copy number loss of chr1:2,352,236-4,574,271, a region containing the candidate tumor suppressor TNFRSF14. Using flow cytometry, we demonstrated reduced TNFRSF14 expression in HRS cells from 5 of 22 additional cases (23%) and in two of three CHL cell lines. These studies suggest that TNFRSF14 dysregulation may contribute to the pathobiology of CHL in a subset of cases.
View details for DOI 10.1002/gcc.22331
View details for Web of Science ID 000368260100008
View details for PubMedID 26650888
View details for PubMedCentralID PMC4713316
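The idea of reading copy number from low-coverage sequencing of tumor cells and paired normal cells can be sketched as a per-bin read-depth ratio. The example below is a minimal sketch under stated assumptions: reads have already been counted in fixed genomic bins, the median bin is taken to be diploid, and the function name and counts are hypothetical. It is not the pipeline used in the study. Values that fall between integers would correspond to the nonintegral copy number states mentioned above.

```python
from statistics import median

def copy_number_estimates(tumor_counts, normal_counts, ploidy=2):
    """Estimate per-bin copy number from tumor/normal read-count ratios.

    Ratios are scaled so that the median bin equals the assumed ploidy.
    Bins with no reads in the normal sample are reported as None.
    """
    ratios = [t / n if n > 0 else None for t, n in zip(tumor_counts, normal_counts)]
    center = median(r for r in ratios if r is not None)
    return [None if r is None else ploidy * r / center for r in ratios]

# Hypothetical bin counts: the third bin has half the expected tumor depth.
estimates = copy_number_estimates([100, 100, 50, 100], [100, 100, 100, 100])
```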
Whole genome prediction for preimplantation genetic diagnosis
Preimplantation genetic diagnosis (PGD) enables profiling of embryos for genetic disorders prior to implantation. The majority of PGD testing is restricted in the scope of variants assayed or by the availability of extended family members. While recent advances in single cell sequencing show promise, they remain limited by bias in DNA amplification and the rapid turnaround time (<36 h) required for fresh embryo transfer. Here, we describe and validate a method for inferring the inherited whole genome sequence of an embryo for preimplantation genetic diagnosis (PGD). We combine haplotype-resolved, parental genome sequencing with rapid embryo genotyping to predict the whole genome sequence of a day-5 human embryo in a couple at risk of transmitting alpha-thalassemia. Inheritance was predicted at approximately 3 million paternally and/or maternally heterozygous sites with greater than 99% accuracy. Furthermore, we successfully phase and predict the transmission of an HBA1/HBA2 deletion from each parent. Our results suggest that preimplantation whole genome prediction may facilitate the comprehensive diagnosis of diseases with a known genetic basis in embryos.
View details for DOI 10.1186/s13073-015-0160-4
View details for Web of Science ID 000355074400001
View details for PubMedID 26019723
View details for PubMedCentralID PMC4445980
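The principle of combining phased parental haplotypes with sparse embryo genotypes can be sketched as a vote over informative sites. The example below is a simplified sketch and not the published method: it assumes a single haplotype block with no recombination, biallelic sites coded as alternate-allele counts, and error-free genotypes. All names are hypothetical. A site is informative when the parent of interest is heterozygous and the other parent is homozygous, so that the other parent's contribution is known.

```python
def infer_transmitted_haplotype(hap_a, hap_b, other_hom, embryo_gt):
    """Vote on which phased haplotype (A or B) of one parent the embryo inherited.

    hap_a, hap_b: alleles (0 or 1) on the parent's two phased haplotypes.
    other_hom: allele the other parent must transmit, or None if that parent is heterozygous.
    embryo_gt: embryo alternate-allele count (0, 1 or 2), or None if not genotyped.
    """
    votes = {"A": 0, "B": 0}
    for a, b, other, gt in zip(hap_a, hap_b, other_hom, embryo_gt):
        if gt is None or other is None or a == b:
            continue  # site is uninformative for this parent
        transmitted = gt - other  # allele that must have come from this parent
        if transmitted == a:
            votes["A"] += 1
        elif transmitted == b:
            votes["B"] += 1
    choice = "A" if votes["A"] >= votes["B"] else "B"
    return choice, votes

# Hypothetical five-site block; only the first two sites are informative.
choice, votes = infer_transmitted_haplotype(
    [0, 1, 1, 0, 1], [1, 0, 1, 1, 0], [0, 0, 1, None, 1], [0, 1, 2, 1, None]
)
```

Once a haplotype is chosen for the block, the embryo's alleles at ungenotyped sites in that block would be predicted from it; a realistic implementation would also need to model recombination and genotyping error.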
In vitro, long-range sequence information for de novo genome assembly via transposase contiguity.
2014; 24 (12): 2041-9
We describe a method that exploits contiguity preserving transposase sequencing (CPT-seq) to facilitate the scaffolding of de novo genome assemblies. CPT-seq is an entirely in vitro means of generating libraries comprised of 9216 indexed pools, each of which contains thousands of sparsely sequenced long fragments ranging from 5 kilobases to > 1 megabase. These pools are "subhaploid," in that the lengths of fragments contained in each pool sums to ∼5% to 10% of the full genome. The scaffolding approach described here, termed fragScaff, leverages coincidences between the content of different pools as a source of contiguity information. Specifically, CPT-seq data is mapped to a de novo genome assembly, followed by the identification of pairs of contigs or scaffolds whose ends disproportionately co-occur in the same indexed pools, consistent with true adjacency in the genome. Such candidate "joins" are used to construct a graph, which is then resolved by a minimum spanning tree. As a proof-of-concept, we apply CPT-seq and fragScaff to substantially boost the contiguity of de novo assemblies of the human, mouse, and fly genomes, increasing the scaffold N50 of de novo assemblies by eight- to 57-fold with high accuracy. We also demonstrate that fragScaff is complementary to Hi-C-based contact probability maps, providing midrange contiguity to support robust, accurate chromosome-scale de novo genome assemblies without the need for laborious in vivo cloning steps. Finally, we demonstrate CPT-seq as a means of anchoring unplaced novel human contigs to the reference genome as well as for detecting misassembled sequences.
View details for DOI 10.1101/gr.178319.114
View details for PubMedID 25327137
View details for PubMedCentralID PMC4248320
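The scaffolding idea described above (score pairs of contig ends by how often they co-occur in the same indexed pools, then resolve the graph of candidate joins with a minimum spanning tree) can be sketched as follows. This is an illustrative toy, not fragScaff itself: the Jaccard score, the cost of 1 minus that score, and all names and pool assignments are assumptions made for the example.

```python
from itertools import combinations

def candidate_joins(pools_by_end):
    """Score each pair of contig ends by the Jaccard overlap of the pools they appear in."""
    joins = {}
    for a, b in combinations(sorted(pools_by_end), 2):
        shared = len(pools_by_end[a] & pools_by_end[b])
        if shared:
            joins[(a, b)] = shared / len(pools_by_end[a] | pools_by_end[b])
    return joins

def minimum_spanning_joins(ends, joins):
    """Kruskal's algorithm with union-find, using cost = 1 - score for each candidate join."""
    parent = {e: e for e in ends}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    kept = []
    for (a, b), score in sorted(joins.items(), key=lambda kv: 1 - kv[1]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # skip joins that would close a cycle
            parent[root_a] = root_b
            kept.append((a, b))
    return kept

# Hypothetical pool memberships for the ends of three contigs.
pools = {
    "c1_R": {1, 2, 3, 4},
    "c2_L": {2, 3, 4, 5},
    "c2_R": {7, 8, 9},
    "c3_L": {8, 9, 10},
    "c3_R": {20},
}
kept = minimum_spanning_joins(pools, candidate_joins(pools))
```

In this toy input the retained joins link contig 1 to contig 2 and contig 2 to contig 3, which is the adjacency the shared pools suggest.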
MIPgen: optimized modeling and design of molecular inversion probes for targeted resequencing
2014; 30 (18): 2670-2672
Molecular inversion probes (MIPs) enable cost-effective multiplex targeted gene resequencing in large cohorts. However, the design of individual MIPs is a critical parameter governing the performance of this technology with respect to capture uniformity and specificity. MIPgen is a user-friendly package that simplifies the process of designing custom MIP assays to arbitrary targets. New logistic and SVM-derived models enable in silico predictions of assay success, and assay redesign exhibits improved coverage uniformity relative to previous methods, which in turn improves the utility of MIPs for cost-effective targeted sequencing for candidate gene validation and for diagnostic sequencing in a clinical setting. MIPgen is implemented in C++. Source code and accompanying Python scripts are available at http://shendurelab.github.io/MIPGEN/.
View details for DOI 10.1093/bioinformatics/btu353
View details for Web of Science ID 000342913000022
View details for PubMedID 24867941
View details for PubMedCentralID PMC4155255
Genome Sequencing of Idiopathic Pulmonary Fibrosis in Conjunction with a Medical School Human Anatomy Course
2014; 9 (9)
Even in cases where there is no obvious family history of disease, genome sequencing may contribute to clinical diagnosis and management. Clinical application of the genome has not yet become routine, however, in part because physicians are still learning how best to utilize such information. As an educational research exercise performed in conjunction with our medical school human anatomy course, we explored the potential utility of determining the whole genome sequence of a patient who had died following a clinical diagnosis of idiopathic pulmonary fibrosis (IPF). Medical students performed dissection and whole genome sequencing of the cadaver. Gross and microscopic findings were more consistent with the fibrosing variant of nonspecific interstitial pneumonia (NSIP), as opposed to IPF per se. Variants in genes causing Mendelian disorders predisposing to IPF were not detected. However, whole genome sequencing identified several common variants associated with IPF, including a single nucleotide polymorphism (SNP), rs35705950, located in the promoter region of the gene encoding mucin glycoprotein MUC5B. The MUC5B promoter polymorphism was recently found to markedly elevate risk for IPF, though a particular association with NSIP has not been previously reported, nor has its contribution to disease risk previously been evaluated in the genome-wide context of all genetic variants. We did not identify additional predicted functional variants in a region of linkage disequilibrium (LD) adjacent to MUC5B, nor did we discover other likely risk-contributing variants elsewhere in the genome. Whole genome sequencing thus corroborates the association of rs35705950 with MUC5B dysregulation and interstitial lung disease. This novel exercise additionally served a unique mission in bridging clinical and basic science education.
View details for DOI 10.1371/journal.pone.0106744
View details for Web of Science ID 000347993600058
View details for PubMedID 25192356
View details for PubMedCentralID PMC4156421
Complex MSH2 and MSH6 mutations in hypermutated microsatellite unstable advanced prostate cancer
A hypermutated subtype of advanced prostate cancer was recently described, but prevalence and mechanisms have not been well-characterized. Here we find that 12% (7 of 60) of advanced prostate cancers are hypermutated, and that all hypermutated cancers have mismatch repair gene mutations and microsatellite instability (MSI). Mutations are frequently complex MSH2 or MSH6 structural rearrangements rather than MLH1 epigenetic silencing. Our findings identify parallels and differences in the mechanisms of hypermutation in prostate cancer compared with other MSI-associated cancers.
View details for DOI 10.1038/ncomms5988
View details for Web of Science ID 000342985400002
View details for PubMedID 25255306
View details for PubMedCentralID PMC4176888
Deep sequencing of multiple regions of glial tumors reveals spatial heterogeneity for mutations in clinically relevant genes
2014; 15 (12)
The extent of intratumoral mutational heterogeneity remains unclear in gliomas, the most common primary brain tumors, especially with respect to point mutation. To address this, we applied single molecule molecular inversion probes targeting 33 cancer genes to assay both point mutations and gene amplifications within spatially distinct regions of 14 glial tumors. We find evidence of regional mutational heterogeneity in multiple tumors, including mutations in TP53 and RB1 in an anaplastic oligodendroglioma and amplifications in PDGFRA and KIT in two glioblastomas (GBMs). Immunohistochemistry confirms heterogeneity of TP53 mutation and PDGFRA amplification. In all, 3 out of 14 glial tumors surveyed have evidence for heterogeneity for clinically relevant mutations. Our results underscore the need to sample multiple regions in GBM and other glial tumors when devising personalized treatments based on genomic information, and furthermore demonstrate the importance of measuring both point mutation and copy number alteration while investigating genetic heterogeneity within cancer samples.
View details for DOI 10.1186/s13059-014-0530-z
View details for Web of Science ID 000346609500010
View details for PubMedCentralID PMC4272528
Germline Missense Variants in the BTNL2 Gene Are Associated with Prostate Cancer Susceptibility.
Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology
2013; 22 (9): 1520-1528
Rare, inherited mutations account for 5% to 10% of all prostate cancer cases. However, to date, few causative mutations have been identified. To identify rare mutations for prostate cancer, we conducted whole-exome sequencing (WES) in multiple kindreds (n = 91) from 19 hereditary prostate cancer (HPC) families characterized by aggressive or early-onset phenotypes. Candidate variants (n = 130) identified through family- and bioinformatics-based filtering of WES data were then genotyped in an independent set of 270 HPC families (n = 819 prostate cancer cases; n = 496 unaffected relatives) for replication. Two variants with supportive evidence were subsequently genotyped in a population-based case-control study (n = 1,155 incident prostate cancer cases; n = 1,060 age-matched controls) for further confirmation. All participants were men of European ancestry. The strongest evidence was for two germline missense variants in the butyrophilin-like 2 (BTNL2) gene (rs41441651, p.Asp336Asn and rs28362675, p.Gly454Cys) that segregated with affection status in two of the WES families. In the independent set of 270 HPC families, 1.5% (rs41441651; P = 0.0032) and 1.2% (rs28362675; P = 0.0070) of affected men, but no unaffected men, carried a variant. Both variants were associated with elevated prostate cancer risk in the population-based study (rs41441651: OR, 2.7; 95% CI, 1.27-5.87; P = 0.010; rs28362675: OR, 2.5; 95% CI, 1.16-5.46; P = 0.019). Results indicate that rare BTNL2 variants play a role in susceptibility to both familial and sporadic prostate cancer. Results implicate BTNL2 as a novel prostate cancer susceptibility gene.
View details for DOI 10.1158/1055-9965.EPI-13-0345
View details for PubMedID 23833122
View details for PubMedCentralID PMC3769499
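For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from case-control carrier counts, the sketch below uses the standard unadjusted calculation from a 2x2 table with a Wald interval on the log scale. The carrier counts are hypothetical, since the abstract does not report them, and the study's own estimates may come from an adjusted model rather than this simple formula.

```python
import math

def odds_ratio_with_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table (all cells must be > 0)."""
    odds_ratio = (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)
    # Standard error of log(OR): square root of the sum of reciprocal cell counts.
    se = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                   + 1 / control_carriers + 1 / control_noncarriers)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to exercise the formula.
odds_ratio, lower, upper = odds_ratio_with_ci(20, 80, 10, 90)
```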
Multiplex Targeted Sequencing Identifies Recurrently Mutated Genes in Autism Spectrum Disorders
2012; 338 (6114): 1619-1622
Exome sequencing studies of autism spectrum disorders (ASDs) have identified many de novo mutations but few recurrently disrupted genes. We therefore developed a modified molecular inversion probe method enabling ultra-low-cost candidate gene resequencing in very large cohorts. To demonstrate the power of this approach, we captured and sequenced 44 candidate genes in 2446 ASD probands. We discovered 27 de novo events in 16 genes, 59% of which are predicted to truncate proteins or disrupt splicing. We estimate that recurrent disruptive mutations in six genes-CHD8, DYRK1A, GRIN2B, TBR1, PTEN, and TBL1XR1-may contribute to 1% of sporadic ASDs. Our data support associations between specific genes and reciprocal subphenotypes (CHD8-macrocephaly and DYRK1A-microcephaly) and replicate the importance of a β-catenin-chromatin-remodeling network to ASD etiology.
View details for DOI 10.1126/science.1227764
View details for Web of Science ID 000312533100053
View details for PubMedID 23160955
View details for PubMedCentralID PMC3528801
Exome sequencing identifies a spectrum of mutation frequencies in advanced and lethal prostate cancers
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2011; 108 (41): 17087-17092
To systematically catalog protein-altering mutations that may drive the development of prostate cancers and their progression to metastatic disease, we performed whole-exome sequencing of 23 prostate cancers derived from 16 different lethal metastatic tumors and three high-grade primary carcinomas. All tumors were propagated in mice as xenografts, designated the LuCaP series, to model phenotypic variation, such as responses to cancer-directed therapeutics. Although corresponding normal tissue was not available for most tumors, we were able to take advantage of increasingly deep catalogs of human genetic variation to remove most germline variants. On average, each tumor genome contained ~200 novel nonsynonymous variants, of which the vast majority was specific to individual carcinomas. A subset of genes was recurrently altered across tumors derived from different individuals, including TP53, DLK2, GPC6, and SDF4. Unexpectedly, three prostate cancer genomes exhibited substantially higher mutation frequencies, with 2,000-4,000 novel coding variants per exome. A comparison of castration-resistant and castration-sensitive pairs of tumor lines derived from the same prostate cancer highlights mutations in the Wnt pathway as potentially contributing to the development of castration resistance. Collectively, our results indicate that point mutations arising in coding regions of advanced prostate cancers are common but, with notable exceptions, very few genes are mutated in a substantial fraction of tumors. We also report a previously undescribed subtype of prostate cancers exhibiting "hypermutated" genomes, with potential implications for resistance to cancer therapeutics. Our results also suggest that increasingly deep catalogs of human germline variation may challenge the necessity of sequencing matched tumor-normal pairs.
View details for DOI 10.1073/pnas.1108745108
View details for Web of Science ID 000295973800045
View details for PubMedID 21949389
View details for PubMedCentralID PMC3193229
Target-enrichment strategies for next-generation sequencing
2010; 7 (2): 111-118
We have not yet reached a point at which routine sequencing of large numbers of whole eukaryotic genomes is feasible, and so it is often necessary to select genomic regions of interest and to enrich these regions before sequencing. There are several enrichment approaches, each with unique advantages and disadvantages. Here we describe our experiences with the leading target-enrichment technologies, the optimizations that we have performed and typical results that can be obtained using each. We also provide detailed protocols for each technology so that end users can find the best compromise between sensitivity, specificity and uniformity for their particular project.
View details for DOI 10.1038/nmeth.1419
View details for Web of Science ID 000274086200015
View details for PubMedID 20111037