- Cancer > GI Oncology
- Medical Oncology
- Oncology (Cancer)
- Gastrointestinal Neoplasms
- Inherited Cancer Disorders
- Immunotherapy in gastrointestinal cancers
Senior Associate Director, Stanford Genome Technology Center (2008 - Present)
Honors & Awards
Physician-Scientist Fellowship Award, Howard Hughes Medical Institute (1998)
American Association for Cancer Research, Scholar-in-Training Award for Research Achievement (2005)
Merit Award for Research Achievement, American Society of Clinical Oncology Foundation (2006)
Physician Scientist Early Career Award, Howard Hughes Medical Institute (2008)
Clinical Scientist Development Award, Doris Duke Charitable Foundation (2009)
Research Scholar Award, American Cancer Society (2013)
Residency: University of Iowa Hospitals and Clinics GME Training Verifications (1996) IA
Medical Education: Johns Hopkins University School of Medicine (1994) MD
Fellowship: Stanford University Hospital - Clinical Excellence Research Center (2005) CA
Board Certification: Medical Oncology, American Board of Internal Medicine (2004)
Residency: University of Washington (2001) WA
B.A., Reed College, Biology
M.D., Johns Hopkins University, Medicine
Current Research and Scholarly Interests
To improve the lives of individuals with cancer, our research group has embarked on a research initiative to use cutting-edge genetics and technology to interrogate the fundamental genetic "digital" code responsible for cancer development and overall clinical behavior.
We are pursuing projects focused on personalized medicine. Specifically, we are interested in using genetic and genomic approaches in oncology to improve targeted cancer therapy development, make accurate prognoses, predict cancer therapy efficacy and identify clinically relevant cancer mutations. These projects aim to establish the paradigm for individualized medicine, facilitate the introduction of these approaches into clinical validation studies and thus develop the next generation of cancer diagnostics and treatment.
Our research program is specifically focused on:
1) Discovery and validation of genetic signatures portending prognosis and therapeutic drug targets for individuals with cancer
2) Development of novel approaches for analyzing cancer genomes and identifying personalized therapeutic targets
3) Determining inherited pathogenic mutations that increase the risk of developing gastrointestinal malignancies
4) The genetic analysis of complete cancer genome sequences derived from inherited cancer
5) Technology development on novel genetic diagnostic methods to help individuals with cancer
Clinical & Pathological Studies of Upper Gastrointestinal Carcinoma
Our research on the biology of upper gastrointestinal cancers involves the study of tissue samples and cells from biopsies of persons with gastric or esophageal cancer, or blood samples from upper gastrointestinal cancer patients and persons at high inherited risk for these cancers. We hope to learn the role genes and proteins play in the development of gastric and esophageal cancer.
The Gastric Cancer Foundation: A Gastric Cancer Registry
The Gastric Cancer Registry will combine data acquired directly from patients with gastric cancer via an online questionnaire with genomic data obtained from blood and tissue samples. The purpose of this registry is to gain a better understanding of the causes of gastric cancer, both environmental and genetic; to determine whether certain genomic data can predict outcomes of treatment and survival; and to explore the issues that affect the quality of life of these patients after diagnosis and treatment.
Independent Studies (8)
- Biomedical Informatics Teaching Methods
BIOMEDIN 290 (Aut, Win, Spr, Sum)
- Directed Reading and Research
BIOMEDIN 299 (Aut, Win, Spr, Sum)
- Directed Reading in Medicine
MED 299 (Aut, Win, Spr, Sum)
- Early Clinical Experience in Medicine
MED 280 (Aut, Win, Spr, Sum)
- Graduate Research
MED 399 (Aut, Win, Spr, Sum)
- Medical Scholars Research
BIOMEDIN 370 (Aut, Win, Spr, Sum)
- Medical Scholars Research
MED 370 (Aut, Win, Spr, Sum)
- Undergraduate Research
MED 199 (Aut, Win, Spr, Sum)
Genomic Instability in Cancer: Teetering on the Limit of Tolerance
2017; 77 (9): 2179-2185
Cancer genomic instability contributes to the phenomenon of intratumoral genetic heterogeneity, provides the genetic diversity required for natural selection, and enables the extensive phenotypic diversity that is frequently observed among patients. Genomic instability has previously been associated with poor prognosis. However, we have evidence that for solid tumors of epithelial origin, extreme levels of genomic instability, where more than 75% of the genome is subject to somatic copy number alterations, are associated with a potentially better prognosis compared with intermediate levels under this threshold. This has been observed in clonal subpopulations of larger size, especially when genomic instability is shared among a limited number of clones. We hypothesize that cancers with extreme levels of genomic instability may be teetering on the brink of a threshold where so much of their genome is adversely altered that cells rarely replicate successfully. Another possibility is that tumors with high levels of genomic instability are more immunogenic than other cancers with a less extensive burden of genetic aberrations. Regardless of the exact mechanism, but hinging on our ability to quantify how a tumor's burden of genetic aberrations is distributed among coexisting clones, genomic instability has important therapeutic implications. Herein, we explore the possibility that a high genomic instability could be the basis for a tumor's sensitivity to DNA-damaging therapies. We primarily focus on studies of epithelial-derived solid tumors. Cancer Res; 77(9); 2179-85. ©2017 AACR.
View details for DOI 10.1158/0008-5472.CAN-16-1553
View details for Web of Science ID 000400270100001
View details for PubMedID 28432052
View details for PubMedCentralID PMC5413432
CRISPR-Cas9-targeted fragmentation and selective sequencing enable massively parallel microsatellite analysis
Microsatellites are multi-allelic and composed of short tandem repeats (STRs) with individual motifs composed of mononucleotides, dinucleotides or higher including hexamers. Next-generation sequencing approaches and other STR assays rely on a limited number of PCR amplicons, typically in the tens. Here, we demonstrate STR-Seq, a next-generation sequencing technology that analyses over 2,000 STRs in parallel, and provides the accurate genotyping of microsatellites. STR-Seq employs in vitro CRISPR-Cas9-targeted fragmentation to produce specific DNA molecules covering the complete microsatellite sequence. Amplification-free library preparation provides single molecule sequences without unique molecular barcodes. STR-selective primers enable massively parallel, targeted sequencing of large STR sets. Overall, STR-Seq has higher throughput, improved accuracy and provides a greater number of informative haplotypes compared with other microsatellite analysis approaches. With these new features, STR-Seq can identify a 0.1% minor genome fraction in a DNA mixture composed of different, unrelated samples.
View details for DOI 10.1038/ncomms14291
View details for Web of Science ID 000393379700001
View details for PubMedID 28169275
View details for PubMedCentralID PMC5309709
Haplotyping germline and cancer genomes with high-throughput linked-read sequencing.
2016; 34 (3): 303-311
Haplotyping of human chromosomes is a prerequisite for cataloguing the full repertoire of genetic variation. We present a microfluidics-based, linked-read sequencing technology that can phase and haplotype germline and cancer genomes using nanograms of input DNA. This high-throughput platform prepares barcoded libraries for short-read sequencing and computationally reconstructs long-range haplotype and structural variant information. We generate haplotype blocks in a nuclear trio that are concordant with expected inheritance patterns and phase a set of structural variants. We also resolve the structure of the EML4-ALK gene fusion in the NCI-H2228 cancer cell line using phased exome sequencing. Finally, we assign genetic aberrations to specific megabase-scale haplotypes generated from whole-genome sequencing of a primary colorectal adenocarcinoma. This approach resolves haplotype information using up to 100 times less genomic DNA than some methods and enables the accurate detection of structural variants.
View details for DOI 10.1038/nbt.3432
View details for PubMedID 26829319
View details for PubMedCentralID PMC4786454
Pan-cancer analysis of the extent and consequences of intratumor heterogeneity.
2016; 22 (1): 105-113
Intratumor heterogeneity (ITH) drives neoplastic progression and therapeutic resistance. We used the bioinformatics tools 'expanding ploidy and allele frequency on nested subpopulations' (EXPANDS) and PyClone to detect clones that are present at a ≥10% frequency in 1,165 exome sequences from tumors in The Cancer Genome Atlas. 86% of tumors across 12 cancer types had at least two clones. ITH in the morphology of nuclei was associated with genetic ITH (Spearman's correlation coefficient, ρ = 0.24-0.41; P < 0.001). Mutation of a driver gene that typically appears in smaller clones was a survival risk factor (hazard ratio (HR) = 2.15, 95% confidence interval (CI): 1.71-2.69). The risk of mortality also increased when >2 clones coexisted in the same tumor sample (HR = 1.49, 95% CI: 1.20-1.87). In two independent data sets, copy-number alterations affecting either <25% or >75% of a tumor's genome predicted reduced risk (HR = 0.15, 95% CI: 0.08-0.29). Mortality risk also declined when >4 clones coexisted in the sample, suggesting a trade-off between the costs and benefits of genomic instability. ITH and genomic instability thus have the potential to be useful measures that can universally be applied to all cancers.
View details for DOI 10.1038/nm.3984
View details for PubMedID 26618723
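The copy-number thresholds quoted in the abstract above (alterations affecting less than 25% or more than 75% of a tumor's genome) can be illustrated with a small calculation. The sketch below is not the authors' pipeline: the segment format, the diploid baseline of 2 copies, and the group labels are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical segment format: (start, end, total_copy_number), half-open coordinates.
Segment = Tuple[int, int, int]


def fraction_genome_altered(segments: List[Segment], neutral_cn: int = 2) -> float:
    """Fraction of covered bases whose copy number differs from neutral_cn."""
    total = sum(end - start for start, end, _ in segments)
    if total == 0:
        raise ValueError("segments cover no bases")
    altered = sum(end - start for start, end, cn in segments if cn != neutral_cn)
    return altered / total


def instability_group(fraction: float, low: float = 0.25, high: float = 0.75) -> str:
    """Bin a tumor by the 25% / 75% thresholds; the labels are illustrative only."""
    if fraction < low:
        return "low"
    if fraction > high:
        return "extreme"
    return "intermediate"


# Toy example: 100 neutral bases, 300 gained, 600 lost.
segments = [(0, 100, 2), (100, 400, 3), (400, 1000, 1)]
frac = fraction_genome_altered(segments)
group = instability_group(frac)
```

In this toy input 900 of 1,000 covered bases are altered, so the tumor falls in the "extreme" bin. A real analysis would also need to account for tumor purity and ploidy, which this sketch ignores.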
The Cancer Genome Atlas Clinical Explorer: a web and mobile interface for identifying clinical-genomic driver associations.
2015; 7 (1): 112-?
The Cancer Genome Atlas (TCGA) project has generated genomic data sets covering over 20 malignancies. These data provide valuable insights into the underlying genetic and genomic basis of cancer. However, exploring the relationship among TCGA genomic results and clinical phenotype remains a challenge, particularly for individuals lacking formal bioinformatics training. Overcoming this hurdle is an important step toward the wider clinical translation of cancer genomic/proteomic data and implementation of precision cancer medicine. Several websites such as the cBio portal or University of California Santa Cruz genome browser make TCGA data accessible but lack interactive features for querying clinically relevant phenotypic associations with cancer drivers. To enable exploration of the clinical-genomic driver associations from TCGA data, we developed the Cancer Genome Atlas Clinical Explorer. The Cancer Genome Atlas Clinical Explorer interface provides a straightforward platform to query TCGA data using one of the following methods: (1) searching for clinically relevant genes, micro RNAs, and proteins by name, cancer types, or clinical parameters; (2) searching for genomic/proteomic profile changes by clinical parameters in a cancer type; or (3) testing two-hit hypotheses. SQL queries run in the background and results are displayed on our portal in an easy-to-navigate interface according to the user's input. To derive these associations, we relied on elastic-net estimates of optimal multiple linear regularized regression and clinical parameters in the space of multiple genomic/proteomic features provided by TCGA data. Moreover, we identified and ranked gene/micro RNA/protein predictors of each clinical parameter for each cancer. The robustness of the results was estimated by bootstrapping.
Overall, we identify associations of potential clinical relevance among genes/micro RNAs/proteins using our statistical analysis from 25 cancer types and 18 clinical parameters that include clinical stage or smoking history. The Cancer Genome Atlas Clinical Explorer enables the cancer research community and others to explore clinically relevant associations inferred from TCGA data. With its accessible web and mobile interface, users can examine queries and test hypotheses regarding genomic/proteomic alterations across a broad spectrum of malignancies.
View details for DOI 10.1186/s13073-015-0226-3
View details for PubMedID 26507825
Emergence of Hemagglutinin Mutations During the Course of Influenza Infection.
2015; 5: 16178-?
Influenza remains a significant cause of disease mortality. The ongoing threat of influenza infection is partly attributable to the emergence of new mutations in the influenza genome. Among the influenza viral gene products, the hemagglutinin (HA) glycoprotein plays a critical role in influenza pathogenesis, is the target for vaccines and accumulates new mutations that may alter the efficacy of immunization. To study the emergence of HA mutations during the course of infection, we employed a deep-targeted sequencing method. We used samples from 17 patients with active H1N1 or H3N2 influenza infections. These patients were not treated with antivirals. In addition, we had samples from five patients who were analyzed longitudinally. Thus, we determined the quantitative changes in the fractional representation of HA mutations during the course of infection. Across individuals in the study, we identified a series of novel HA mutations that directly altered the HA coding sequence. Serial viral sampling revealed HA mutations that either were stable, expanded or were reduced in representation during the course of the infection. Overall, we demonstrated the emergence of unique mutations specific to an infected individual and temporal genetic variation during infection.
View details for DOI 10.1038/srep16178
View details for PubMedID 26538451
View details for PubMedCentralID PMC4633648
A programmable method for massively parallel targeted sequencing.
Nucleic acids research
2014; 42 (10)
We have developed a targeted resequencing approach referred to as Oligonucleotide-Selective Sequencing. In this study, we report a series of significant improvements and novel applications of this method whereby the surface of a sequencing flow cell is modified in situ to capture specific genomic regions of interest from a sample and then sequenced. These improvements include a fully automated targeted sequencing platform through the use of a standard Illumina cBot fluidics station. Targeting optimization increased the yield of total on-target sequencing data 2-fold compared to the previous iteration, while simultaneously increasing the percentage of reads that could be mapped to the human genome. The described assays cover up to 1421 genes with a total coverage of 5.5 Megabases (Mb). We demonstrate a 10-fold abundance uniformity of greater than 90% in 1 log distance from the median and a targeting rate of up to 95%. We also sequenced continuous genomic loci up to 1.5 Mb while simultaneously genotyping SNPs and genes. Variants with low minor allele fraction were sensitively detected at levels of 5%. Finally, we determined the exact breakpoint sequence of cancer rearrangements. Overall, this approach has high performance for selective sequencing of genome targets, configuration flexibility and variant calling accuracy.
View details for DOI 10.1093/nar/gku282
View details for PubMedID 24782526
High sensitivity detection and quantitation of DNA copy number and single nucleotide variants with single color droplet digital PCR.
2014; 86 (5): 2618-2624
In this study, we present a highly customizable method for quantifying copy number and point mutations utilizing a single-color, droplet digital PCR platform. Droplet digital polymerase chain reaction (ddPCR) is rapidly replacing real-time quantitative PCR (qRT-PCR) as an efficient method of independent DNA quantification. Compared to quantitative PCR, ddPCR eliminates the need for traditional standards; instead, it measures target and reference DNA within the same well. The applications for ddPCR are widespread including targeted quantitation of genetic aberrations, which is commonly achieved with a two-color fluorescent oligonucleotide probe (TaqMan) design. However, the overall cost and need for optimization can be greatly reduced with an alternative method of distinguishing between target and reference products using the nonspecific DNA binding properties of EvaGreen (EG) dye. By manipulating the length of the target and reference amplicons, we can distinguish between their fluorescent signals and quantify each independently. We demonstrate the effectiveness of this method by examining copy number in the proto-oncogene FLT3 and the common V600E point mutation in BRAF. Using a series of well-characterized control samples and cancer cell lines, we confirmed the accuracy of our method in quantifying mutation percentage and integer value copy number changes. As another novel feature, our assay was able to detect a mutation comprising less than 1% of an otherwise wild-type sample, as well as copy number changes from cancers even in the context of significant dilution with normal DNA. This flexible and cost-effective method of independent DNA quantification proves to be a robust alternative to the commercialized TaqMan assay.
View details for DOI 10.1021/ac403843j
View details for PubMedID 24483992
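The abstract above describes measuring target and reference DNA in the same well and reporting integer copy number changes. A minimal sketch of the arithmetic commonly used in droplet digital PCR follows. It assumes the standard Poisson correction for droplets that hold more than one template molecule, which the abstract itself does not spell out; the function names, droplet counts, and the two-copy reference locus are hypothetical.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean template copies per droplet: -ln(1 - positive/total)."""
    if total <= 0 or positive < 0 or positive >= total:
        raise ValueError("need total > 0 and 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def relative_copy_number(target_positive: int, reference_positive: int,
                         total: int, reference_copies: int = 2) -> float:
    """Target copy number relative to a reference locus assumed to have reference_copies."""
    target = copies_per_droplet(target_positive, total)
    reference = copies_per_droplet(reference_positive, total)
    if reference == 0.0:
        raise ValueError("reference locus has no positive droplets")
    return reference_copies * target / reference


# Hypothetical droplet counts from a single well of 15,000 droplets.
cn_balanced = relative_copy_number(3000, 3000, 15000)
cn_gain = relative_copy_number(5000, 3000, 15000)
```

With equal positive counts the estimate is two copies; the second example gives a value between three and four, which would then be interpreted alongside tumor purity before rounding to an integer.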
- Metastatic tumor evolution and organoid modeling implicate TGFBR2 as a cancer driver in diffuse gastric cancer GENOME BIOLOGY 2014; 15 (8)
Systematic genomic identification of colorectal cancer genes delineating advanced from early clinical stage and metastasis.
BMC medical genomics
2013; 6: 54-?
Colorectal cancer is the third leading cause of cancer deaths in the United States. The initial assessment of colorectal cancer involves clinical staging that takes into account the extent of primary tumor invasion, determining the number of lymph nodes with metastatic cancer and the identification of metastatic sites in other organs. Advanced clinical stage indicates metastatic cancer, either in regional lymph nodes or in distant organs. While the genomic and genetic basis of colorectal cancer has been elucidated to some degree, less is known about the identity of specific cancer genes that are associated with advanced clinical stage and metastasis. We compiled multiple genomic data types (mutations, copy number alterations, gene expression and methylation status) as well as clinical meta-data from The Cancer Genome Atlas (TCGA). We used an elastic-net regularized regression method on the combined genomic data to identify genetic aberrations and their associated cancer genes that are indicators of clinical stage. We ranked candidate genes by their regression coefficient and level of support from multiple assay modalities. A fit of the elastic-net regularized regression to 197 samples and integrated analysis of four genomic platforms identified the set of top gene predictors of advanced clinical stage, including: WRN, SYK, DDX5 and ADRA2C. These genetic features were identified robustly in bootstrap resampling analysis. We conducted an analysis integrating multiple genomic features including mutations, copy number alterations, gene expression and methylation. This integrated approach in which one considers all of these genomic features performs better than any individual genomic assay. We identified multiple genes that robustly delineate advanced clinical stage, suggesting their possible role in colorectal cancer metastatic progression.
View details for DOI 10.1186/1755-8794-6-54
View details for PubMedID 24308539
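The abstract above names elastic-net regularized regression as the method for ranking gene predictors. As a rough illustration of that technique only, the sketch below fits an elastic net by cyclic coordinate descent on toy data. It is not the authors' code: the penalty weights `alpha` and `l1_ratio`, the iteration count, and the two-feature data set are assumptions, and the objective follows the common convention (1/2n)||y - Xw||² + alpha·l1_ratio·||w||₁ + 0.5·alpha·(1 - l1_ratio)·||w||².

```python
from typing import List


def soft_threshold(z: float, t: float) -> float:
    """Shrink z toward zero by t (the proximal step for the L1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X: List[List[float]], y: List[float],
                alpha: float = 0.1, l1_ratio: float = 0.5,
                n_iter: int = 200) -> List[float]:
    """Cyclic coordinate descent for the elastic net.
    Assumes centered columns and centered y; no intercept is fitted."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            # Scaled inner product of feature j with the residual that excludes feature j.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, residual)) / n
            denom = sum(c * c for c in col) / n + alpha * (1.0 - l1_ratio)
            if denom == 0.0:
                continue  # all-zero feature under a pure L1 penalty
            new_w = soft_threshold(rho, alpha * l1_ratio) / denom
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                w[j] = new_w
    return w


# Toy data (hypothetical): feature 0 drives y, feature 1 is uninformative.
X = [[-1.5, 1.0], [-0.5, -1.0], [0.5, -1.0], [1.5, 1.0]]
y = [-3.0, -1.0, 1.0, 3.0]
weights = elastic_net(X, y)
```

The L1 part of the penalty sets the uninformative feature's weight to exactly zero, while the informative weight is shrunk slightly below its unpenalized value of 2. Ranking features by the magnitude of the surviving coefficients, repeated over bootstrap resamples, is the general idea the abstract refers to.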
Improving bioinformatic pipelines for exome variant calling
Exome sequencing analysis is a cost-effective approach for identifying variants in coding regions. However, recognizing the relevant single nucleotide variants, small insertions and deletions remains a challenge for many researchers, and diagnostic laboratories typically do not have access to the bioinformatic analysis pipelines necessary for clinical application. The Atlas2 suite, recently released by Baylor Genome Center, is designed to be widely accessible, runs on desktop computers but is scalable to computational clusters, and performs comparably with other popular variant callers. Atlas2 may be an accessible alternative for data processing when a rapid solution for variant calling is required. See research article http://www.biomedcentral.com/1471-2105/13/8.
View details for DOI 10.1186/gm306
View details for Web of Science ID 000314564600001
View details for PubMedID 22289516
Ultrasensitive detection of rare mutations using next-generation targeted resequencing
NUCLEIC ACIDS RESEARCH
2012; 40 (1)
With next-generation DNA sequencing technologies, one can interrogate a specific genomic region of interest at very high depth of coverage and identify less prevalent, rare mutations in heterogeneous clinical samples. However, the mutation detection levels are limited by the error rate of the sequencing technology as well as by the availability of variant-calling algorithms with high statistical power and low false positive rates. We demonstrate that we can robustly detect mutations at 0.1% fractional representation. This represents accurate detection of one mutant per every 1000 wild-type alleles. To achieve this sensitive level of mutation detection, we integrate a high accuracy indexing strategy and reference replication for estimating sequencing error variance. We employ a statistical model to estimate the error rate at each position of the reference and to quantify the fraction of variant base in the sample. Our method is highly specific (99%) and sensitive (100%) when applied to a known 0.1% sample fraction admixture of two synthetic DNA samples to validate our method. As a clinical application of this method, we analyzed nine clinical samples of H1N1 influenza A and detected an oseltamivir (antiviral therapy) resistance mutation in the H1N1 neuraminidase gene at a sample fraction of 0.18%.
View details for DOI 10.1093/nar/gkr861
View details for Web of Science ID 000298733500002
View details for PubMedID 22013163
View details for PubMedCentralID PMC3245950
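The abstract above notes that rare-mutation detection is limited by the sequencing error rate at each position. One simple way to see why is a one-sided binomial test: given a read depth and a per-base error rate, how likely is it that errors alone produce at least the observed number of variant reads? The sketch below is a simplified stand-in, not the statistical model used in the paper, and the depth, error rate, and read counts are hypothetical.

```python
import math


def log_binom_pmf(i: int, n: int, p: float) -> float:
    """Log of the Binomial(n, p) probability of exactly i successes."""
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p))


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if k <= 0:
        return 1.0
    total = sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))
    return min(1.0, total)


# Hypothetical position: 10,000 reads, assumed error rate of 0.02% per read.
depth = 10_000
error_rate = 0.0002
p_signal = binomial_tail(10, depth, error_rate)  # 10 variant reads = 0.1% fraction
p_noise = binomial_tail(2, depth, error_rate)    # 2 variant reads = expected error count
```

Under these assumed numbers, ten variant reads are very unlikely to arise from error alone, whereas two variant reads are entirely consistent with noise. This is why a well-estimated, position-specific error rate matters at a 0.1% detection level.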
- The Human OligoGenome Resource: a database of oligonucleotide capture probes for resequencing target regions across the human genome NUCLEIC ACIDS RESEARCH 2012; 40 (D1): D1137-D1143
Quantitative and Sensitive Detection of Cancer Genome Amplifications from Formalin Fixed Paraffin Embedded Tumors with Droplet Digital PCR.
Translational medicine (Sunnyvale, Calif.)
2012; 2 (2)
For the analysis of cancer, there is great interest in rapid and accurate detection of cancer genome amplifications containing oncogenes that are potential therapeutic targets. The vast majority of cancer tissue samples are formalin fixed and paraffin embedded (FFPE), which enables histopathological examination and long term archiving. However, FFPE cancer genomic DNA is oftentimes degraded and generally a poor substrate for many molecular biology assays. To overcome the issues of poor DNA quality from FFPE samples and detect oncogenic copy number amplifications with high accuracy and sensitivity, we developed a novel approach. Our assay requires nanogram amounts of genomic DNA, thus facilitating study of small amounts of clinical samples. Using droplet digital PCR (ddPCR), we can determine the relative copy number of specific genomic loci even in the presence of intermingled normal tissue. We used a control dilution series to determine the limits of detection for the ddPCR assay and report its improved sensitivity on minimal amounts of DNA compared to standard real-time PCR. To develop this approach, we designed an assay for the fibroblast growth factor receptor 2 gene (FGFR2) that is amplified in gastric and breast cancers as well as others. We successfully utilized ddPCR to ascertain FGFR2 amplifications from FFPE-preserved gastrointestinal adenocarcinomas.
View details for PubMedID 23682346
Efficient targeted resequencing of human germline and cancer genomes by oligonucleotide-selective sequencing
2011; 29 (11): 1024-U95
We describe an approach for targeted genome resequencing, called oligonucleotide-selective sequencing (OS-Seq), in which we modify the immobilized lawn of oligonucleotide primers of a next-generation DNA sequencer to function as both a capture and sequencing substrate. We apply OS-Seq to resequence the exons of either 10 or 344 cancer genes from human DNA samples. In our assessment of capture performance, >87% of the captured sequence originated from the intended target region with sequencing coverage falling within a tenfold range for a majority of all targets. Single nucleotide variants (SNVs) called from OS-Seq data agreed with >95% of variants obtained from whole-genome sequencing of the same individual. We also demonstrate mutation discovery from a colorectal cancer tumor sample matched with normal tissue. Overall, we show the robust performance and utility of OS-Seq for the resequencing analysis of human germline and cancer genomes.
View details for DOI 10.1038/nbt.1996
View details for Web of Science ID 000296801300024
View details for PubMedID 22020387
A Flexible Approach for Highly Multiplexed Candidate Gene Targeted Resequencing
2011; 6 (6)
We have developed an integrated strategy for targeted resequencing and analysis of gene subsets from the human exome for variants. Our capture technology is geared towards resequencing gene subsets substantially larger than can be done efficiently with simplex or multiplex PCR but smaller in scale than exome sequencing. We describe all the steps from the initial capture assay to single nucleotide variant (SNV) discovery. The capture methodology uses in-solution 80-mer oligonucleotides. To provide optimal flexibility in choosing human gene targets, we designed an in silico set of oligonucleotides, the Human OligoExome, that covers the gene exons annotated by the Consensus Coding Sequencing Project (CCDS). This resource is openly available as an Internet accessible database where one can download capture oligonucleotide sequences for any CCDS gene and design custom capture assays. Using this resource, we demonstrated the flexibility of this assay by custom designing capture assays ranging from 10 to over 100 gene targets with total capture sizes from over 100 Kilobases to nearly one Megabase. We established a method to reduce capture variability and incorporated indexing schemes to increase sample throughput. Our approach has multiple applications that include but are not limited to population targeted resequencing studies of specific gene subsets, validation of variants discovered in whole genome sequencing surveys and possible diagnostic analysis of disease gene subsets. We also present a cost analysis demonstrating its cost-effectiveness for large population studies.
View details for DOI 10.1371/journal.pone.0021088
View details for Web of Science ID 000292291800008
View details for PubMedID 21738606
View details for PubMedCentralID PMC3127857
Targeted deep resequencing of the human cancer genome using next-generation technologies
BIOTECHNOLOGY AND GENETIC ENGINEERING REVIEWS, VOL 27
2010; 27: 135-158
Next-generation sequencing technologies have revolutionized our ability to identify genetic variants, either germline or somatic point mutations, that occur in cancer. Parallelization and miniaturization of DNA sequencing enables massive data throughput and for the first time, large-scale, nucleotide resolution views of cancer genomes can be achieved. Systematic, large-scale sequencing surveys have revealed that the genetic spectrum of mutations in cancers appears to be highly complex with numerous low frequency bystander somatic variations, and a limited number of common, frequently mutated genes. Large sample sizes and deeper resequencing are much needed in resolving clinical and biological relevance of the mutations as well as in detecting somatic variants in heterogeneous samples and cancer cell sub-populations. However, even with the next-generation sequencing technologies, the overwhelming size of the human genome and need for very high fold coverage represents a major challenge for up-scaling cancer genome sequencing projects. Assays to target, capture, enrich or partition disease-specific regions of the genome offer immediate solutions for reducing the complexity of the sequencing libraries. Integration of targeted DNA capture assays and next-generation deep resequencing improves the ability to identify clinically and biologically relevant mutations.
View details for Web of Science ID 000286179900006
View details for PubMedID 21415896
HOTSPOTS FOR UNSELECTED TY1 TRANSPOSITION EVENTS ON YEAST CHROMOSOME-III ARE NEAR TRANSFER-RNA GENES AND LTR SEQUENCES
1993; 73 (5): 1007-1018
A collection of yeast strains bearing single marked Ty1 insertions on chromosome III was generated. Over 100 such insertions were physically mapped by pulsed-field gel electrophoresis. These insertions are very nonrandomly distributed. Thirty-two such insertions were cloned by the inverted PCR technique, and the flanking DNA sequences were determined. The sequenced insertions all fell within a few very limited regions of chromosome III. Most of these regions contained tRNA coding regions and/or LTRs of preexisting transposable elements. Open reading frames were disrupted at a far lower frequency than expected for random transposition. The results suggest that the Ty1 integration machinery can detect regions of the genome that may represent "safe havens" for insertion. These regions of the genome do not contain any special DNA sequences, nor do they behave as particularly good targets for Ty1 integration in vitro, suggesting that the targeted regions have special properties allowing specific recognition in vivo.
View details for Web of Science ID A1993LF06100016
View details for PubMedID 8388781
Tandem Oligonucleotide Probe Annealing and Elongation To Discriminate Viral Sequence
2017; 89 (8): 4363-4366
New approaches for genomic DNA/RNA detection are in high demand in order to provide controls for existing enzymatic technologies and to create alternatives for emerging applications. In particular, there is an unmet need in rapid, reliable detection of short RNA regions which could open up new opportunities in transcriptome analysis, virology, and other fields. Herein, we report for the first time a "click" chemistry approach to oligonucleotide probe elongation as a novel approach to specifically detect a viral sequence. We hybridized a library of short, terminally labeled probes to Ebola virus RNA followed by click assembly and analysis of the read sequence by various techniques. As we demonstrate in this paper, using our new approach, a viral RNA sequence can be detected in less than 2 h without the need for cDNA synthesis or any other enzymatic reactions and with a sensitivity of <10 pM target RNA.
View details for DOI 10.1021/acs.analchem.7b00646
View details for Web of Science ID 000399858800008
View details for PubMedID 28382823
A genome-wide approach for detecting novel insertion-deletion variants of mid-range size.
Nucleic acids research
2016; 44 (15)
We present SWAN, a statistical framework for robust detection of genomic structural variants in next-generation sequencing data and an analysis of mid-range size insertions and deletions (<10 Kb) for whole genome analysis and DNA mixtures. To identify these mid-range size events, SWAN collectively uses information from read-pair, read-depth and one end mapped reads through statistical likelihoods based on Poisson field models. SWAN also uses soft-clip/split read remapping to supplement the likelihood analysis and determine variant boundaries. The accuracy of SWAN is demonstrated by in silico spike-ins and by identification of known variants in the NA12878 genome. We used SWAN to identify a novel set of mid-range insertions/deletions that were confirmed by targeted deep re-sequencing. An R package implementation of SWAN is open source and freely available.
View details for DOI 10.1093/nar/gkw481
View details for PubMedID 27325742
View details for PubMedCentralID PMC5009736
- Emergence of Hemagglutinin Mutations During the Course of Influenza Infection SCIENTIFIC REPORTS 2015; 5
- The Cancer Genome Atlas Clinical Explorer: a web and mobile interface for identifying clinical-genomic driver associations GENOME MEDICINE 2015; 7
- Enzyme-Free Detection of Mutations in Cancer DNA Using Synthetic Oligonucleotide Probes and Fluorescence Microscopy PLOS ONE 2015; 10 (8)
Allele-specific copy number profiling by next-generation DNA sequencing.
Nucleic acids research
2015; 43 (4)
The progression and clonal development of tumors often involve amplifications and deletions of genomic DNA. Estimation of allele-specific copy number, which quantifies the number of copies of each allele at each variant locus rather than the total number of chromosome copies, is an important step in the characterization of tumor genomes and the inference of their clonal history. We describe a new method, falcon, for finding somatic allele-specific copy number changes by next generation sequencing of tumors with matched normals. falcon is based on a change-point model on a bivariate mixed Binomial process, which explicitly models the copy numbers of the two chromosome haplotypes and corrects for local allele-specific coverage biases. By using the Binomial distribution rather than a normal approximation, falcon more effectively pools evidence from sites with low coverage. A modified Bayesian information criterion is used to guide model selection for determining the number of copy number events. falcon is evaluated on in silico spike-in data and applied to the analysis of a pre-malignant colon tumor sample and late-stage colorectal adenocarcinoma from the same individual. The allele-specific copy number estimates obtained by falcon allow us to draw detailed conclusions regarding the clonal history of the individual's colon cancer.
View details for DOI 10.1093/nar/gku1252
View details for PubMedID 25477383
View details for PubMedCentralID PMC4344483
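The change-point idea in the abstract above can be illustrated with a much simpler model than the one falcon uses. The sketch below is not falcon's bivariate mixed Binomial process and applies no modified BIC; it assumes a single sample, a single change-point, and hypothetical per-site counts (`alt_counts`, `depths`). It only shows how a Binomial likelihood pools evidence across sites without a normal approximation.

```python
from math import log


def binom_loglik(successes, trials):
    """Maximized Binomial log-likelihood for pooled counts.

    The binomial coefficients are omitted because they are identical
    under the null and the alternative and cancel in the ratio.
    """
    if trials == 0 or successes == 0 or successes == trials:
        return 0.0
    p = successes / trials
    return successes * log(p) + (trials - successes) * log(1 - p)


def best_change_point(alt_counts, depths):
    """Find the split k that maximizes the likelihood-ratio statistic.

    Sites [0, k) share one allele fraction and sites [k, end) another.
    Returns (k, statistic); k is None if no split improves on the null.
    """
    total_a, total_n = sum(alt_counts), sum(depths)
    null = binom_loglik(total_a, total_n)
    best_stat, best_k = 0.0, None
    left_a = left_n = 0
    for k in range(1, len(depths)):
        left_a += alt_counts[k - 1]
        left_n += depths[k - 1]
        alt = (binom_loglik(left_a, left_n)
               + binom_loglik(total_a - left_a, total_n - left_n))
        stat = 2 * (alt - null)
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_k, best_stat


# Hypothetical low-coverage data: allele fraction shifts after the 4th site.
k, stat = best_change_point([5, 4, 6, 5, 9, 8, 9, 9], [10] * 8)
```

Because the counts are summed before the likelihood is evaluated, sites with only a few reads still contribute in proportion to their depth.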
Enzyme-Free Detection of Mutations in Cancer DNA Using Synthetic Oligonucleotide Probes and Fluorescence Microscopy.
2015; 10 (8)
Rapid, reliable diagnostics of DNA mutations are highly desirable in research and clinical assays. Current development in this field goes simultaneously in two directions: 1) high-throughput methods, and 2) portable assays. Non-enzymatic approaches are attractive for both types of methods since they would allow rapid and relatively inexpensive detection of nucleic acids. Modern fluorescence microscopy is having a huge impact on detection of biomolecules at previously unachievable resolution. However, no straightforward methods to detect DNA in a non-enzymatic way using fluorescence microscopy and nucleic acid analogues have been proposed so far. Here we report a novel enzyme-free approach to efficiently detect cancer mutations. This assay includes gene-specific target enrichment followed by annealing to oligonucleotides containing locked nucleic acids (LNAs) and finally, detection by fluorescence microscopy. The LNA containing probes display high binding affinity and specificity to DNA containing mutations, which allows for the detection of mutation abundance with an intercalating EvaGreen dye. We used a second probe, which increases the overall number of base pairs in order to produce a higher fluorescence signal by incorporating more dye molecules. Indeed we show here that using EvaGreen dye and LNA probes, genomic DNA containing BRAF V600E mutation could be detected by fluorescence microscopy at low femtomolar concentrations. Notably, this was at least 1000-fold above the potential detection limit. Overall, the novel assay we describe could become a new approach to rapid, reliable and enzyme-free diagnostics of cancer or other associated DNA targets. Importantly, stoichiometry of wild type and mutant targets is conserved in our assay, which allows for an accurate estimation of mutant abundance when the detection limit requirement is met. Using fluorescence microscopy, this approach presents the opportunity to detect DNA at single-molecule resolution and directly in the biological sample of choice.
View details for DOI 10.1371/journal.pone.0136720
View details for PubMedID 26312489
Oncogenic transformation of diverse gastrointestinal tissues in primary organoid culture
2014; 20 (7): 769-777
The application of primary organoid cultures containing epithelial and mesenchymal elements to cancer modeling holds promise for combining the accurate multilineage differentiation and physiology of in vivo systems with the facile in vitro manipulation of transformed cell lines. Here we used a single air-liquid interface culture method without modification to engineer oncogenic mutations into primary epithelial and mesenchymal organoids from mouse colon, stomach and pancreas. Pancreatic and gastric organoids exhibited dysplasia as a result of expression of Kras carrying the G12D mutation (Kras(G12D)), p53 loss or both and readily generated adenocarcinoma after in vivo transplantation. In contrast, primary colon organoids required combinatorial Apc, p53, Kras(G12D) and Smad4 mutations for progressive transformation to invasive adenocarcinoma-like histology in vitro and tumorigenicity in vivo, recapitulating multi-hit models of colorectal cancer (CRC), as compared to the more promiscuous transformation of small intestinal organoids. Colon organoid culture functionally validated the microRNA miR-483 as a dominant driver oncogene at the IGF2 (insulin-like growth factor-2) 11p15.5 CRC amplicon, inducing dysplasia in vitro and tumorigenicity in vivo. These studies demonstrate the general utility of a highly tractable primary organoid system for cancer modeling and driver oncogene validation in diverse gastrointestinal tissues.
View details for DOI 10.1038/nm.3585
View details for Web of Science ID 000338689500021
- A programmable method for massively parallel targeted sequencing. Nucleic acids research 2014; 42 (10)
MendeLIMS: a web-based laboratory information management system for clinical genome sequencing.
2014; 15: 290-?
Large clinical genomics studies using next generation DNA sequencing require the ability to select and track samples from a large population of patients through many experimental steps. With the number of clinical genome sequencing studies increasing, it is critical to maintain adequate laboratory information management systems to manage the thousands of patient samples that are subject to this type of genetic analysis. To meet the needs of clinical population studies using genome sequencing, we developed a web-based laboratory information management system (LIMS) with a flexible configuration that is adaptable to continuously evolving experimental protocols of next generation DNA sequencing technologies. Our system, referred to as MendeLIMS, is easily implemented with open source tools and is also highly configurable and extensible. MendeLIMS has been invaluable in the management of our clinical genome sequencing studies. We maintain a publicly available demonstration version of the application for evaluation purposes at http://mendelims.stanford.edu. MendeLIMS is programmed in Ruby on Rails (RoR) and accesses data stored in SQL-compliant relational databases. Software is freely available for non-commercial use at http://dna-discovery.stanford.edu/software/mendelims/.
View details for DOI 10.1186/1471-2105-15-290
View details for PubMedID 25159034
- MendeLIMS: a web-based laboratory information management system for clinical genome sequencing. BMC bioinformatics 2014; 15 (1): 290-?
Identification of Insertion Deletion Mutations from Deep Targeted Resequencing.
Journal of data mining in genomics & proteomics
2013; 4 (3)
Taking advantage of the deep targeted sequencing capabilities of next generation sequencers, we have developed a novel two step insertion deletion (indel) detection algorithm (IDA) that can determine indels from single read sequences with high computational efficiency and sensitivity when indels are present at a low fraction relative to the wild type reference sequence. First, it identifies candidate indel positions utilizing specific sequence alignment artifacts produced by rapid alignment programs. Second, it confirms the location of the candidate indel by using the Smith-Waterman (SW) algorithm on a restricted subset of sequence reads. We demonstrate that IDA is applicable to indels of varying sizes from deep targeted sequencing data at low fractions where the indel is diluted by wild type sequence. Our algorithm is useful in detecting indel variants present at variable allelic frequencies such as may occur in heterozygotes and mixed normal-tumor tissue.
View details for PubMedID 24511426
View details for PubMedCentralID PMC3917607
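The second step of the algorithm described above relies on Smith-Waterman local alignment. The following is a minimal textbook sketch of that alignment with a linear gap penalty, not the IDA implementation itself; the scoring values (`match`, `mismatch`, `gap`) and the example sequences are assumptions chosen for illustration.

```python
def smith_waterman(read, ref, match=2, mismatch=-1, gap=-2):
    """Best local alignment of read against ref (linear gap penalty).

    Returns (score, aligned_read, aligned_ref); '-' marks a gap.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,  # match/mismatch
                              score[i - 1][j] + gap,      # extra base in read
                              score[i][j - 1] + gap)      # base missing in read
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the local score drops to zero.
    i, j = best_pos
    a_read, a_ref = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if read[i - 1] == ref[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            a_read.append(read[i - 1])
            a_ref.append(ref[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            a_read.append(read[i - 1])
            a_ref.append("-")
            i -= 1
        else:
            a_read.append("-")
            a_ref.append(ref[j - 1])
            j -= 1
    return best, "".join(reversed(a_read)), "".join(reversed(a_ref))


# Hypothetical example: the read lacks two reference bases ("AT").
reference = "ACGTACGGATCCTTGACA"
read = "ACGTACGGCCTTGACA"
score, aligned_read, aligned_ref = smith_waterman(read, reference)
```

Runs of `-` in `aligned_read` mark a candidate deletion and runs in `aligned_ref` mark a candidate insertion. Restricting this quadratic-time step to reads already flagged as candidates is what keeps the two-step design efficient.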
RVD: a command-line program for ultrasensitive rare single nucleotide variant detection using targeted next-generation DNA resequencing.
BMC research notes
2013; 6: 206-?
Rare single nucleotide variants play an important role in genetic diversity and heterogeneity of specific human disease. For example, an individual clinical sample can harbor rare mutations at minor frequencies. Genetic diversity within an individual clinical sample is oftentimes reflected in rare mutations. Therefore, detecting rare variants prior to treatment may prove to be a useful predictor for therapeutic response. Current rare variant detection algorithms using next generation DNA sequencing are limited by inherent sequencing error rate and platform availability. Here we describe an optimized implementation of a rare variant detection algorithm called RVD for use in targeted gene resequencing. RVD is available both as a command-line program and for use in MATLAB and estimates context-specific error using a beta-binomial model to call variants with minor allele frequency (MAF) as low as 0.1%. We show that RVD accepts standard BAM formatted sequence files. We tested RVD analysis on multiple Illumina sequencing platforms, among the most widely used DNA sequencing platforms. RVD meets a growing need for highly sensitive and specific tools for variant detection. To demonstrate the usefulness of RVD, we carried out a thorough analysis of the software's performance on synthetic and clinical virus samples sequenced on both an Illumina GAIIx and a MiSeq. We expect RVD can improve understanding of the genetics and treatment of common viral diseases including influenza. RVD is available at the following URL: http://dna-discovery.stanford.edu/software/rvd/.
View details for DOI 10.1186/1756-0500-6-206
View details for PubMedID 23701658
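To make the beta-binomial idea in the abstract above concrete, here is a minimal sketch of how an error model of that family can score a candidate rare variant. It is not RVD's code or its estimation procedure: the shape parameters `a` and `b` are assumed to have been fitted elsewhere (for example from a control sample), and all numbers are hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k, n, a, b):
    """P(X = k) for X ~ BetaBinomial(n, a, b)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def upper_tail(k, n, a, b):
    """P(X >= k): chance of seeing k or more error reads at depth n."""
    return sum(betabinom_pmf(x, n, a, b) for x in range(k, n + 1))


# Hypothetical position: error model with mean a / (a + b) of about 0.05%,
# read depth 10,000, and 30 non-reference reads (0.3%) in the test sample.
p_value = upper_tail(30, 10000, 1.0, 2000.0)
```

A small tail probability suggests the non-reference reads are unlikely to be explained by sequencing error alone. The beta-binomial allows more position-to-position variation in error rate than a plain binomial with a fixed rate.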
- DETECTING MUTATIONS IN MIXED SAMPLE SEQUENCING DATA USING EMPIRICAL BAYES ANNALS OF APPLIED STATISTICS 2012; 6 (3): 1047-1067
Identification of a novel deletion mutant strain in Saccharomyces cerevisiae that results in a microsatellite instability phenotype.
The DNA mismatch repair (MMR) pathway corrects specific types of DNA replication errors that affect microsatellites and thus is critical for maintaining genomic integrity. The genes of the MMR pathway are highly conserved across different organisms. Likewise, defective MMR function universally results in microsatellite instability (MSI) which is a hallmark of certain types of cancer associated with the Mendelian disorder hereditary nonpolyposis colorectal cancer (Lynch syndrome). To identify previously unrecognized deleted genes or loci that can lead to MSI, we developed a functional genomics screen utilizing a plasmid containing a microsatellite sequence that is a hot spot for MSI mutations and the comprehensive homozygous diploid deletion mutant resource for Saccharomyces cerevisiae. This pool represents a collection of non-essential homozygous yeast diploid (2N) mutants in which there are deletions for over four thousand yeast open reading frames (ORFs). From our screen, we identified a deletion mutant strain of the PAU24 gene that leads to MSI. In a series of validation experiments, we determined that this PAU24 mutant strain had an increased MSI-specific mutation rate in comparison to the original background wildtype strain and other deletion mutants, and one comparable to that of an MMR mutant involving the MLH1 gene. Likewise, in yeast strains with a deletion of PAU24, we identified specific de novo indel mutations that occurred within the targeted microsatellite used for this screen.
View details for PubMedID 23667739
The Human OligoGenome Resource: a database of oligonucleotide capture probes for resequencing target regions across the human genome.
Nucleic acids research
2012; 40 (Database issue): D1137-43
Recent exponential growth in the throughput of next-generation DNA sequencing platforms has dramatically spurred the use of accessible and scalable targeted resequencing approaches. This includes candidate region diagnostic resequencing and novel variant validation from whole genome or exome sequencing analysis. We have previously demonstrated that selective genomic circularization is a robust in-solution approach for capturing and resequencing thousands of target human genome loci such as exons and regulatory sequences. To facilitate the design and production of customized capture assays for any given region in the human genome, we developed the Human OligoGenome Resource (http://oligogenome.stanford.edu/). This online database contains over 21 million capture oligonucleotide sequences. It enables one to create customized and highly multiplexed resequencing assays of target regions across the human genome and is not restricted to coding regions. In total, this resource provides 92.1% in silico coverage of the human genome. The online server allows researchers to download a complete repository of oligonucleotide probes and design customized capture assays to target multiple regions throughout the human genome. The website has query tools for selecting and evaluating capture oligonucleotides from specified genomic regions.
View details for DOI 10.1093/nar/gkr973
View details for PubMedID 22102592
- Performance comparison of whole-genome sequencing platforms NATURE BIOTECHNOLOGY 2012; 30 (1): 78-U118
A cross-sample statistical model for SNP detection in short-read sequencing data
NUCLEIC ACIDS RESEARCH
2012; 40 (1)
Highly multiplex DNA sequencers have greatly expanded our ability to survey human genomes for previously unknown single nucleotide polymorphisms (SNPs). However, sequencing and mapping errors, though rare, contribute substantially to the number of false discoveries in current SNP callers. We demonstrate that we can significantly reduce the number of false positive SNP calls by pooling information across samples. Although many studies prepare and sequence multiple samples with the same protocol, most existing SNP callers ignore cross-sample information. In contrast, we propose an empirical Bayes method that uses cross-sample information to learn the error properties of the data. This error information lets us call SNPs with a lower false discovery rate than existing methods.
View details for DOI 10.1093/nar/gkr851
View details for Web of Science ID 000298733500005
View details for PubMedID 22064853
View details for PubMedCentralID PMC3245949
Targeted sequencing library preparation by genomic DNA circularization
For next generation DNA sequencing, we have developed a rapid and simple approach for preparing DNA libraries of targeted DNA content. Current protocols for preparing DNA for next-generation targeted sequencing are labor-intensive, require large amounts of starting material, and are prone to artifacts that result from necessary PCR amplification of sequencing libraries. Typically, sample preparation for targeted NGS is a two-step process where (1) the desired regions are selectively captured and (2) the ends of the DNA molecules are modified to render them compatible with any given NGS sequencing platform. In this proof-of-concept study, we present an integrated approach that combines these two separate steps into one. Our method involves circularization of a specific genomic DNA molecule that directly incorporates the necessary components for conducting sequencing in a single assay and requires only one PCR amplification step. We also show that specific regions of the genome can be targeted and sequenced without any PCR amplification. We anticipate that these rapid targeted libraries will be useful for validation of variants and may have diagnostic application.
View details for DOI 10.1186/1472-6750-11-122
View details for Web of Science ID 000300427900001
View details for PubMedID 22168766
- Genetic-based biomarkers and next-generation sequencing: the future of personalized care in colorectal cancer PERSONALIZED MEDICINE 2011; 8 (3): 331-345
Identification of Novel LNK Mutations In Patients with Chronic Myeloproliferative Neoplasms and Related Disorders
52nd Annual Meeting and Exposition of the American-Society-of-Hematology (ASH)
AMER SOC HEMATOLOGY. 2010: 143–44
View details for Web of Science ID 000289662200316
Detecting simultaneous changepoints in multiple sequences
2010; 97 (3): 631-645
We discuss the detection of local signals that occur at the same location in multiple one-dimensional noisy sequences, with particular attention to relatively weak signals that may occur in only a fraction of the sequences. We propose simple scan and segmentation algorithms based on the sum of the chi-squared statistics for each individual sample, which is equivalent to the generalized likelihood ratio for a model where the errors in each sample are independent. The simple geometry of the statistic allows us to derive accurate analytic approximations to the significance level of such scans. The formulation of the model is motivated by the biological problem of detecting recurrent DNA copy number variants in multiple samples. We show using replicates and parent-child comparisons that pooling data across samples results in more accurate detection of copy number variants. We also apply the multisample segmentation algorithm to the analysis of a cohort of tumour samples containing complex nested and overlapping copy number aberrations, for which our method gives a sparse and intuitive cross-sample summary.
View details for DOI 10.1093/biomet/asq025
View details for Web of Science ID 000280904000008
View details for PubMedCentralID PMC3372242
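The scan statistic described in the abstract above, a sum over samples of per-sample chi-squared statistics, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes every sequence has zero baseline mean and unit noise variance, and it omits the analytic significance approximation. The function name `scan_sum_chisq` and the `max_width` parameter are hypothetical.

```python
def scan_sum_chisq(samples, max_width):
    """Scan all windows up to max_width and return the strongest one.

    For a window [start, end), each sample contributes
    (sum of its values in the window) ** 2 / width, a chi-squared
    statistic with one degree of freedom under the null.
    Returns (statistic, start, end).
    """
    length = len(samples[0])

    # Prefix sums give each window sum in constant time.
    prefixes = []
    for values in samples:
        prefix = [0.0]
        for value in values:
            prefix.append(prefix[-1] + value)
        prefixes.append(prefix)

    best = (0.0, 0, 0)
    for start in range(length):
        for end in range(start + 1, min(length, start + max_width) + 1):
            width = end - start
            stat = sum((p[end] - p[start]) ** 2 / width for p in prefixes)
            if stat > best[0]:
                best = (stat, start, end)
    return best


# Hypothetical noise-free data: two of three samples carry a shared signal.
carrier = [0.0] * 8 + [2.0] * 4 + [0.0] * 8
flat = [0.0] * 20
stat, start, end = scan_sum_chisq([carrier, carrier, flat], max_width=6)
```

Summing the squared terms across samples lets a weak signal present in only some of the sequences stand out, while samples without the signal add little to the total.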
Oncogenic BRAF Mutation with CDKN2A Inactivation Is Characteristic of a Subset of Pediatric Malignant Astrocytomas
2010; 70 (2): 512-519
Malignant astrocytomas are a deadly solid tumor in children. Limited understanding of their underlying genetic basis has contributed to modest progress in developing more effective therapies. In an effort to identify such alterations, we performed a genome-wide search for DNA copy number aberrations (CNA) in a panel of 33 tumors encompassing grade 1 through grade 4 tumors. Genomic amplifications of 10-fold or greater were restricted to grade 3 and 4 astrocytomas and included the MDM4 (1q32), PDGFRA (4q12), MET (7q21), CMYC (8q24), PVT1 (8q24), WNT5B (12p13), and IGF1R (15q26) genes. Homozygous deletions of CDKN2A (9p21), PTEN (10q26), and TP53 (17p13.1) were evident among grade 2 to 4 tumors. BRAF gene rearrangements that were indicated in three tumors prompted the discovery of KIAA1549-BRAF fusion transcripts expressed in 10 of 10 grade 1 astrocytomas and in none of the grade 2 to 4 tumors. In contrast, an oncogenic missense BRAF mutation (BRAF(V600E)) was detected in 7 of 31 grade 2 to 4 tumors but in none of the grade 1 tumors. BRAF(V600E) mutation seems to define a subset of malignant astrocytomas in children, in which there is frequent concomitant homozygous deletion of CDKN2A (five of seven cases). Taken together, these findings highlight BRAF as a frequent mutation target in pediatric astrocytomas, with distinct types of BRAF alteration occurring in grade 1 versus grade 2 to 4 tumors.
View details for DOI 10.1158/0008-5472.CAN-09-1851
View details for Web of Science ID 000278485500011
View details for PubMedID 20068183
Identification of a biomarker panel using a multiplex proximity ligation assay improves accuracy of pancreatic cancer diagnosis
JOURNAL OF TRANSLATIONAL MEDICINE
Pancreatic cancer continues to prove difficult to clinically diagnose. Multiple simultaneous measurements of plasma biomarkers can increase sensitivity and selectivity of diagnosis. Proximity ligation assay (PLA) is a highly sensitive technique for multiplex detection of biomarkers in plasma with little or no interfering background signal. We examined the plasma levels of 21 biomarkers in a clinically defined cohort of 52 locally advanced (Stage II/III) pancreatic ductal adenocarcinoma cases and 43 age-matched controls using a multiplex proximity ligation assay. The optimal biomarker panel for diagnosis was computed using a combination of the PAM algorithm and logistic regression modeling. Biomarkers that were significantly prognostic for survival in combination were determined using univariate and multivariate Cox survival models. Three markers, CA19-9, OPN and CHI3L1, measured in multiplex were found to have superior sensitivity for pancreatic cancer vs. CA19-9 alone (93% vs. 80%). In addition, we identified two markers, CEA and CA125, that when measured simultaneously have prognostic significance for survival for this clinical stage of pancreatic cancer (p < 0.003). A multiplex panel assaying CA19-9, OPN and CHI3L1 in plasma improves accuracy of pancreatic cancer diagnosis. A panel assaying CEA and CA125 in plasma can predict survival for this clinical cohort of pancreatic cancer patients.
View details for DOI 10.1186/1479-5876-7-105
View details for Web of Science ID 000272889900001
View details for PubMedID 20003342
View details for PubMedCentralID PMC2796647
Molecular inversion probes reveal patterns of 9p21 deletion and copy number aberrations in childhood leukemia
CANCER GENETICS AND CYTOGENETICS
2009; 193 (1): 9-18
Childhood leukemia, which accounts for >30% of newly diagnosed childhood malignancies, is one of the leading causes of death for children with cancer. Genome-wide studies using microarray chips to identify copy number changes in human cancer are becoming more common. In this pilot study, 45 pediatric leukemia samples were analyzed for gene copy aberrations using novel molecular inversion probe (MIP) technology. Acute leukemia subtypes included precursor B-cell acute lymphoblastic leukemia (ALL) (n=23), precursor T-cell ALL (n=6), and acute myeloid leukemia (n=14). The MIP analysis identified 69 regions of recurring copy number changes, of which 41 have not been identified with other DNA microarray platforms. Copy number gains and losses were validated in 98% of clinical karyotypes and 100% of fluorescence in situ hybridization studies available. We report unique patterns of copy number loss in samples with 9p21.3 (CDKN2A) deletion in the precursor B-cell ALL patients, compared with the precursor T-cell ALL patients. MIPs represent an attractive technology for identifying novel copy number aberrations, validating previously reported copy number changes, and translating molecular findings into clinically relevant targets for further investigation.
View details for DOI 10.1016/j.cancergencyto.2009.03.005
View details for Web of Science ID 000268922900002
View details for PubMedID 19602459
View details for PubMedCentralID PMC2776674
Disperse-a software system for design of selector probes for exon resequencing applications
2009; 25 (5): 666-667
Selector probes enable the amplification of many selected regions of the genome in multiplex. Disperse is a software pipeline that automates the procedure of designing selector probes for exon resequencing applications. Software and documentation are available at http://bioinformatics.org/disperse
View details for DOI 10.1093/bioinformatics/btp001
View details for Web of Science ID 000263834600018
View details for PubMedID 19158162
Molecular inversion probe assay for allelic quantitation.
Methods in molecular biology (Clifton, N.J.)
2009; 556: 67-87
Molecular inversion probe (MIP) technology has been demonstrated to be a robust platform for large-scale dual genotyping and copy number analysis. Applications in human genomic and genetic studies include the possibility of running dual germline genotyping and combined copy number variation ascertainment. MIPs analyze large numbers of specific genetic target sequences in parallel, relying on interrogation of a barcode tag, rather than direct hybridization of genomic DNA to an array. The MIP approach does not replace, but is complementary to many of the copy number technologies being performed today. Some specific advantages of MIP technology include: less DNA required (37 ng vs. 250 ng), DNA quality less important, more dynamic range (amplifications detected up to copy number 60), allele-specific information "cleaner" (less SNP cross-talk/contamination), and quality of markers better (fewer individual MIPs versus SNPs needed to identify copy number changes). MIPs can be considered a candidate gene (targeted whole genome) approach and can find specific areas of interest that otherwise may be missed with other methods.
View details for DOI 10.1007/978-1-60327-192-9_6
View details for PubMedID 19488872
Next-generation DNA sequencing
2008; 26 (10): 1135-1145
DNA sequence represents a single format onto which a broad range of biological phenomena can be projected for high-throughput data collection. Over the past three years, massively parallel DNA sequencing platforms have become widely available, reducing the cost of DNA sequencing by over two orders of magnitude, and democratizing the field by putting the sequencing capacity of a major genome center in the hands of individual investigators. These new technologies are rapidly evolving, and near-term challenges include the development of robust protocols for generating sequencing libraries, building effective new approaches to data-analysis, and often a rethinking of experimental design. Next-generation DNA sequencing has the potential to dramatically accelerate biological and biomedical research, by enabling the comprehensive analysis of genomes, transcriptomes and interactomes to become inexpensive, routine and widespread, rather than requiring significant production-scale efforts.
View details for DOI 10.1038/nbt1486
View details for Web of Science ID 000259926000028
View details for PubMedID 18846087
Gene-specific delineation of copy number aberrations in follicular lymphoma with molecular inversion probes
49th Annual Meeting of the American-Society-of-Hematology
AMER SOC HEMATOLOGY. 2007: 766A–767A
View details for Web of Science ID 000251100803385
Multigene amplification and massively parallel sequencing for cancer mutation discovery
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2007; 104 (22): 9387-9392
We have developed a procedure for massively parallel resequencing of multiple human genes by combining a highly multiplexed and target-specific amplification process with a high-throughput parallel sequencing technology. The amplification process is based on oligonucleotide constructs, called selectors, that guide the circularization of specific DNA target regions. Subsequently, the circularized target sequences are amplified in multiplex and analyzed by using a highly parallel sequencing-by-synthesis technology. As a proof-of-concept study, we demonstrate parallel resequencing of 10 cancer genes covering 177 exons with average sequence coverage per sample of 93%. Seven cancer cell lines and one normal genomic DNA sample were studied with multiple mutations and polymorphisms identified among the 10 genes. Mutations and polymorphisms in the TP53 gene were confirmed by traditional sequencing.
View details for DOI 10.1073/pnas.0702165104
View details for Web of Science ID 000246935700055
View details for PubMedID 17517648
Multiplex amplification of all coding sequences within 10 cancer genes by Gene-Collector
NUCLEIC ACIDS RESEARCH
2007; 35 (7)
Herein we present Gene-Collector, a method for multiplex amplification of nucleic acids. The procedure has been employed to successfully amplify the coding sequence of 10 human cancer genes in one assay with uniform abundance of the final products. Amplification is initiated by a multiplex PCR, in this case with 170 primer pairs. Each PCR product is then specifically circularized by ligation on a Collector probe capable of juxtapositioning only the perfectly matched cognate primer pairs. Any amplification artifacts typically associated with multiplex PCR derived from the use of many primer pairs, such as false amplicons and primer-dimers, are not circularized and are degraded by exonuclease treatment. Circular DNA molecules are then further enriched by randomly primed rolling circle replication. Amplification was successful for 90% of the targeted amplicons as seen by hybridization to a custom resequencing DNA micro-array. Real-time quantitative PCR revealed that 96% of the amplification products were within 4-fold of the average abundance. Gene-Collector has utility for numerous applications such as high throughput resequencing, SNP analyses, and pathogen detection.
View details for DOI 10.1093/nar/gkm078
View details for Web of Science ID 000246294700001
View details for PubMedID 17317684
Multiplexed protein detection by proximity ligation for cancer biomarker validation
2007; 4 (4): 327-329
We present a proximity ligation-based multiplexed protein detection procedure in which several selected proteins can be detected via unique nucleic-acid identifiers and subsequently quantified by real-time PCR. The assay requires a 1-µl sample, has low-femtomolar sensitivity as well as five-log linear range and allows for modular multiplexing without cross-reactivity. The procedure can use a single polyclonal antibody batch for each target protein, simplifying affinity-reagent creation for new biomarker candidates.
View details for DOI 10.1038/NMETH1020
View details for Web of Science ID 000245584900013
View details for PubMedID 17369836
Under-expression of Kalirin-7 increases iNOS activity in cultured cells and correlates to elevated iNOS activity in Alzheimer's disease hippocampus
JOURNAL OF ALZHEIMERS DISEASE
2007; 12 (3): 271-281
Recently, it has been reported that Kalirin gene transcripts are under-expressed in AD hippocampal specimens compared to the controls. The Kalirin gene generates a dozen Kalirin isoforms. Kalirin-7 is the predominant protein expressed in the adult brain and plays crucial roles in growth and maintenance of neurons. Yet its role in human diseases is unknown. We report that Kalirin-7 is significantly diminished both at the mRNA and protein levels in the hippocampus specimens from 19 AD patients compared to the specimens from 15 controls. Kalirin-7 associates with iNOS in the hippocampus, and therefore, Kalirin-7 is complexed with iNOS less in AD hippocampus extracts than in control hippocampus extracts. In cultured cells, Kalirin-7 associates with iNOS and down-regulates the enzyme activity. The down-regulation is attributed to the highly conserved 33 amino acid sequence, K(617)-H(649), of the 1,663-amino-acid-long Kalirin-7. Remarkably, the iNOS activity is considerably higher in the hippocampus specimens from AD patients than in the specimens from the 15 controls. These observations suggest that the under-expression of Kalirin-7 in AD hippocampus correlates to the elevated iNOS activity.
View details for Web of Science ID 000252300000009
View details for PubMedID 18057561
- Reproducibility Probability Score - incorporating measurement variability across laboratories for gene selection NATURE BIOTECHNOLOGY 2006; 24 (12): 1476-1477
- Data quality in genomics and microarrays NATURE BIOTECHNOLOGY 2006; 24 (9): 1112-1113
The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements
NATURE BIOTECHNOLOGY
2006; 24 (9): 1151-1161
Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues. Expression data on four titration pools from two distinct reference RNA samples were generated at multiple test sites using a variety of microarray-based and alternative technology platforms. Here we describe the experimental design and probe mapping efforts behind the MAQC project. We show intraplatform consistency across test sites as well as a high level of interplatform concordance in terms of genes identified as differentially expressed. This study provides a resource that represents an important first step toward establishing a framework for the use of microarrays in clinical and regulatory settings.
View details for DOI 10.1038/nbt1239
View details for Web of Science ID 000240495200036
View details for PubMedID 16964229
View details for PubMedCentralID PMC3272078
Molecular inversion probe analysis of gene copy alterations reveals distinct categories of colorectal carcinoma
CANCER RESEARCH
2006; 66 (16): 7910-7919
Genomic instability is a major feature of neoplastic development in colorectal carcinoma and other cancers. Specific genomic instability events, such as deletions in chromosomes and other alterations in gene copy number, have potential utility as biologically relevant prognostic biomarkers. For example, genomic deletions on chromosome arm 18q are an indicator of colorectal carcinoma behavior and potentially useful as a prognostic indicator. Adapting a novel genomic technology called molecular inversion probes which can determine gene copy alterations, such as genomic deletions, we designed a set of probes to interrogate several hundred individual exons of >200 cancer genes with an overall distribution covering all chromosome arms. In addition, >100 probes were designed in close proximity of microsatellite markers on chromosome arm 18q. We analyzed a set of colorectal carcinoma cell lines and primary colorectal tumor samples for gene copy alterations and deletion mutations in exons. Based on clustering analysis, we distinguished the different categories of genomic instability among the colorectal cancer cell lines. Our analysis of primary tumors uncovered several distinct categories of colorectal carcinoma, each with specific patterns of 18q deletions and deletion mutations in specific genes. This finding has potential clinical ramifications given the application of 18q loss of heterozygosity events as a potential indicator for adjuvant treatment in stage II colorectal carcinoma.
View details for DOI 10.1158/0008-5472.CAN-06-0595
View details for Web of Science ID 000239828200013
View details for PubMedID 16912164
A functional assay for mutations in tumor suppressor genes caused by mismatch repair deficiency
HUMAN MOLECULAR GENETICS
2001; 10 (24): 2737-2743
The coding sequences of multiple human tumor suppressor genes include microsatellite sequences that are prone to mutations. Saccharomyces cerevisiae strains deficient in DNA mismatch repair (MMR) can be used to determine de novo mutation rates of these human tumor suppressor genes as well as any other gene sequence. Microsatellites in human TGFBR2, PTEN and APC genes were placed in yeast vectors and analyzed in isogenic yeast strains that were wild-type or deletion mutants for MSH2 or MLH1. In MMR-deficient strains, the vector containing the (A)₁₀ microsatellite sequence of TGFBR2 had a mutation rate (mutations/cell division) of 1.4 × 10⁻⁴, compared to a mutation rate of 1.7 × 10⁻⁶ in the wild-type strain. In MMR-deficient strains, mutation rates in PTEN and APC were also elevated above background levels. PTEN mutation rates were higher in both msh2 (4.4 × 10⁻⁵) and mlh1 strains (2.3 × 10⁻⁵). APC mutation rates in the msh2 strain (2.4 × 10⁻⁶) and the mlh1 strain (1.7 × 10⁻⁶) were also significantly, but less dramatically, elevated over background. Mutations selected for in the yeast screen were identical to those previously observed in human tumor samples with microsatellite instability (MSI). This functional assay has applicability in providing quantitative data about microsatellite mutation rates caused by MMR deficiency in any human tumor suppressor gene sequence. It can also be applied as a genetic screen to identify new genes that are vulnerable to such microsatellite mutations and thus may be involved in the neoplastic development of tumors with MSI.
View details for Web of Science ID 000172867500001
View details for PubMedID 11734538
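The size of the MMR effect reported in the abstract above can be made concrete with a short calculation. The sketch below uses only the two TGFBR2 rates stated there (MMR-deficient and wild-type); the function and constant names are hypothetical. The wild-type background rates for PTEN and APC are not given in the abstract, so no fold change is computed for them.

```python
def fold_increase(rate, background):
    """Return how many times higher `rate` is than `background`."""
    if background <= 0:
        raise ValueError("background rate must be positive")
    return rate / background


# Mutation rates (mutations per cell division) as stated in the abstract.
TGFBR2_MMR_DEFICIENT = 1.4e-4
TGFBR2_WILD_TYPE = 1.7e-6

tgfbr2_fold = fold_increase(TGFBR2_MMR_DEFICIENT, TGFBR2_WILD_TYPE)
print(f"TGFBR2 (A)10 tract: about {tgfbr2_fold:.0f}-fold higher in MMR-deficient yeast")
```

On these figures, loss of mismatch repair raises the mutation rate of the TGFBR2 microsatellite by roughly 82-fold.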
Spondyloepimetaphyseal dysplasia with joint laxity (SEMDJL): Presentation in two unrelated patients in the United States
AMERICAN JOURNAL OF MEDICAL GENETICS
1999; 86 (3): 245-252
This is a report of two North American patients with spondyloepimetaphyseal dysplasia with joint laxity, an uncommon autosomal recessive skeletal dysplasia rarely reported outside of South Africa. Patients with SEMDJL have vertebral abnormalities and ligamentous laxity that results in spinal misalignment and progressive severe kyphoscoliosis, thoracic asymmetry, and respiratory compromise resulting in early death. Nonaxial skeletal involvement includes elbow deformities with radial head dislocation, dislocated hips, clubbed feet, and tapered fingers with spatulate distal phalanges. Many affected children have an oval face, flat midface, prominent eyes with blue sclerae, and a long philtrum. Palatal abnormalities and congenital heart disease are also observed. Diagnosis in infancy may be difficult because many of the typical findings are not apparent early and only evolve over time. We review the physical and radiographic findings in two unrelated patients with this disorder in order to increase the awareness of this disorder, particularly for clinicians outside of South Africa.
View details for Web of Science ID 000082714300010
View details for PubMedID 10482874
- Molecular classification of the inherited hamartoma polyposis syndromes: Clearing the muddied waters AMERICAN JOURNAL OF HUMAN GENETICS 1998; 62 (5): 1020-1022
Inherited mutations in PTEN that are associated with breast cancer, Cowden disease, and juvenile polyposis
AMERICAN JOURNAL OF HUMAN GENETICS
1997; 61 (6): 1254-1260
PTEN, a protein tyrosine phosphatase with homology to tensin, is a tumor-suppressor gene on chromosome 10q23. Somatic mutations in PTEN occur in multiple tumors, most markedly glioblastomas. Germ-line mutations in PTEN are responsible for Cowden disease (CD), a rare autosomal dominant multiple-hamartoma syndrome. PTEN was sequenced from constitutional DNA from 25 families. Germ-line PTEN mutations were detected in all of five families with both breast cancer and CD, in one family with juvenile polyposis syndrome, and in one of four families with breast and thyroid tumors. In this last case, signs of CD were subtle and were diagnosed only in the context of mutation analysis. PTEN mutations were not detected in 13 families at high risk of breast and/or ovarian cancer. No PTEN-coding-sequence polymorphisms were detected in 70 independent chromosomes. Seven PTEN germ-line mutations occurred, five nonsense and two missense mutations, in six of nine PTEN exons. The wild-type PTEN allele was lost from renal, uterine, breast, and thyroid tumors from a single patient. Loss of PTEN expression was an early event, reflected in loss of the wild-type allele in DNA from normal tissue adjacent to the breast and thyroid tumors. In RNA from normal tissues from three families, mutant transcripts appeared unstable. Germ-line PTEN mutations predispose to breast cancer in association with CD, although the signs of CD may be subtle.
View details for Web of Science ID 000071555900007
View details for PubMedID 9399897