Instructor, Medicine - Oncology
Honors & Awards
K99/R00 Pathway to Independence Award, NIH/NCI (2015 - Present)
Visionary Postdoctoral Fellowship, Dept. of Defense (2012 - 2015)
NIH T32 Cancer Biology Training Grant, Stanford University (2012)
PhD, University of California, Santa Barbara, Biomolecular Science and Engineering Program (2010)
Integrated digital error suppression for improved detection of circulating tumor DNA
View details for DOI 10.1038/nbt.3520
Robust enumeration of cell subsets from tissue expression profiles
View details for DOI 10.1038/nmeth.3337
The prognostic landscape of genes and infiltrating immune cells across human cancers
View details for DOI 10.1038/nm.3909
An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage
NATURE MEDICINE
2014; 20 (5): 552-558
FACTERA: a practical method for the discovery of genomic rearrangements at breakpoint resolution
View details for DOI 10.1093/bioinformatics/btu549
Identification of a colonial chordate histocompatibility gene
2013; 341 (6144): 384-387
View details for DOI 10.1126/science.1238036
Lab-Specific Gene Expression Signatures in Pluripotent Stem Cells
CELL STEM CELL
2010; 7 (2): 258-262
Pluripotent stem cells derived from both embryonic and reprogrammed somatic cells have significant potential for human regenerative medicine. Despite similarities in developmental potential, however, several groups have found fundamental differences between embryonic stem cell (ESC) and induced-pluripotent stem cell (iPSC) lines that may have important implications for iPSC-based medical therapies. Using an unsupervised clustering algorithm, we further studied the genetic homogeneity of iPSC and ESC lines by reanalyzing microarray gene expression data from seven different laboratories. Unexpectedly, this analysis revealed a strong correlation between gene expression signatures and specific laboratories in both ESC and iPSC lines. Nearly one-third of the genes with lab-specific expression signatures are also differentially expressed between ESCs and iPSCs. These data are consistent with the hypothesis that in vitro microenvironmental context differentially impacts the gene expression signatures of both iPSCs and ESCs.
View details for DOI 10.1016/j.stem.2010.06.016
View details for Web of Science ID 000281107400017
View details for PubMedID 20682451
Identification of tumorigenic cells and therapeutic targets in pancreatic neuroendocrine tumors.
Proceedings of the National Academy of Sciences of the United States of America
Pancreatic neuroendocrine tumors (PanNETs) are a type of pancreatic cancer with limited therapeutic options. Consequently, most patients with advanced disease die from tumor progression. Current evidence indicates that a subset of cancer cells is responsible for tumor development, metastasis, and recurrence, and targeting these tumor-initiating cells is necessary to eradicate tumors. However, tumor-initiating cells and the biological processes that promote pathogenesis remain largely uncharacterized in PanNETs. Here we profile primary and metastatic tumors from an index patient and demonstrate that MET proto-oncogene activation is important for tumor growth in PanNET xenograft models. We identify a highly tumorigenic cell population within several independent surgically acquired PanNETs characterized by increased cell-surface protein CD90 expression and aldehyde dehydrogenase A1 (ALDHA1) activity, and provide in vitro and in vivo evidence for their stem-like properties. We performed proteomic profiling of 332 antigens in two cell lines and four primary tumors, and showed that CD47, a cell-surface protein that acts as a "don't eat me" signal co-opted by cancers to evade innate immune surveillance, is ubiquitously expressed. Moreover, CD47 coexpresses with MET and is enriched in CD90(hi) cells. Furthermore, blocking CD47 signaling promotes engulfment of tumor cells by macrophages in vitro and inhibits xenograft tumor growth, prevents metastases, and prolongs survival in vivo.
View details for DOI 10.1073/pnas.1600007113
View details for PubMedID 27035983
Circulating tumour DNA profiling reveals heterogeneity of EGFR inhibitor resistance mechanisms in lung cancer patients.
2016; 7: 11815
Circulating tumour DNA (ctDNA) analysis facilitates studies of tumour heterogeneity. Here we employ CAPP-Seq ctDNA analysis to study resistance mechanisms in 43 non-small cell lung cancer (NSCLC) patients treated with the third-generation epidermal growth factor receptor (EGFR) inhibitor rociletinib. We observe multiple resistance mechanisms in 46% of patients after treatment with first-line inhibitors, indicating frequent intra-patient heterogeneity. Rociletinib resistance recurrently involves MET, EGFR, PIK3CA, ERBB2, KRAS and RB1. We describe a novel EGFR L798I mutation and find that EGFR C797S, which arises in ∼33% of patients after osimertinib treatment, occurs in <3% after rociletinib. Increased MET copy number is the most frequent rociletinib resistance mechanism in this cohort and patients with multiple pre-existing mechanisms (T790M and MET) experience inferior responses. Similarly, rociletinib-resistant xenografts develop MET amplification that can be overcome with the MET inhibitor crizotinib. These results underscore the importance of tumour heterogeneity in NSCLC and the utility of ctDNA-based resistance mechanism assessment.
View details for DOI 10.1038/ncomms11815
View details for PubMedID 27283993
High-throughput genomic profiling of tumor-infiltrating leukocytes.
Current opinion in immunology
2016; 41: 77-84
Tumors are complex ecosystems comprised of diverse cell types including malignant cells, mesenchymal cells, and tumor-infiltrating leukocytes (TILs). While TILs are well known to play important roles in many aspects of cancer biology, recent developments in immuno-oncology have spurred considerable interest in TILs, particularly in relation to their optimal engagement by emerging immunotherapies. Traditionally, the enumeration of TIL phenotypic diversity and composition in solid tumors has relied on resolving single cells by flow cytometry and immunohistochemical methods. However, advances in genome-wide technologies and computational methods are now allowing TILs to be profiled with increasingly high resolution and accuracy directly from RNA mixtures of bulk tumor samples. In this review, we highlight recent progress in the development of in silico tumor dissection methods, and illustrate examples of how these strategies can be applied to characterize TILs in human tumors to facilitate personalized cancer therapy.
View details for DOI 10.1016/j.coi.2016.06.006
View details for PubMedID 27372732
Skin fibrosis. Identification and isolation of a dermal lineage with intrinsic fibrogenic potential.
2015; 348 (6232)
Dermal fibroblasts represent a heterogeneous population of cells with diverse features that remain largely undefined. We reveal the presence of at least two fibroblast lineages in murine dorsal skin. Lineage tracing and transplantation assays demonstrate that a single fibroblast lineage is responsible for the bulk of connective tissue deposition during embryonic development, cutaneous wound healing, radiation fibrosis, and cancer stroma formation. Lineage-specific cell ablation leads to diminished connective tissue deposition in wounds and reduces melanoma growth. Using flow cytometry, we identify CD26/DPP4 as a surface marker that allows isolation of this lineage. Small molecule-based inhibition of CD26/DPP4 enzymatic activity during wound healing results in diminished cutaneous scarring. Identification and isolation of these lineages hold promise for translational medicine aimed at in vivo modulation of fibrogenic behavior.
View details for DOI 10.1126/science.aaa2151
View details for PubMedID 25883361
Large-Scale and Comprehensive Immune Profiling and Functional Analysis of Normal Human Aging.
2015; 10 (7)
While many age-associated immune changes have been reported, a comprehensive set of metrics of immune aging is lacking. Here we report data from 243 healthy adults aged 40-97, for whom we measured clinical and functional parameters, serum cytokines, cytokines and gene expression in stimulated and unstimulated PBMC, PBMC phenotypes, and cytokine-stimulated pSTAT signaling in whole blood. Although highly heterogeneous across individuals, many of these assays revealed trends by age, sex, and CMV status, to greater or lesser degrees. Age, then sex and CMV status, showed the greatest impact on the immune system, as measured by the percentage of assay readouts with significant differences. An elastic net regression model could optimally predict age with 14 analytes from different assays. This reinforces the importance of multivariate analysis for defining a healthy immune system. These data provide a reference for others measuring immune parameters in older people.
View details for DOI 10.1371/journal.pone.0133627
View details for PubMedID 26197454
Potential clinical utility of ultrasensitive circulating tumor DNA detection with CAPP-Seq.
Expert review of molecular diagnostics
Tumors continually shed DNA into the circulation, where it can be noninvasively accessed. The ability to accurately detect circulating tumor DNA (ctDNA) could significantly impact the management of patients with nearly every cancer type. Quantitation of ctDNA could allow objective response assessment, detection of minimal residual disease and noninvasive tumor genotyping. The latter application overcomes the barriers currently limiting repeated tumor tissue sampling during therapy. Recent technical advancements have improved upon the sensitivity, specificity and feasibility of ctDNA detection and promise to enable innovative clinical applications. Here, we focus on the potential clinical utility of ctDNA analysis using CAncer Personalized Profiling by deep Sequencing (CAPP-Seq), a novel next-generation sequencing-based approach for ultrasensitive ctDNA detection. Applications of CAPP-Seq for the personalization of cancer detection and therapy are discussed.
View details for DOI 10.1586/14737159.2015.1019476
View details for PubMedID 25773944
In Vivo Clonal Analysis Reveals Lineage-Restricted Progenitor Characteristics in Mammalian Kidney Development, Maintenance, and Regeneration
2014; 7 (4): 1270-1283
The mechanism and magnitude by which the mammalian kidney generates and maintains its proximal tubules, distal tubules, and collecting ducts remain controversial. Here, we use long-term in vivo genetic lineage tracing and clonal analysis of individual cells from kidneys undergoing development, maintenance, and regeneration. We show that the adult mammalian kidney undergoes continuous tubulogenesis via expansions of fate-restricted clones. Kidneys recovering from damage undergo tubulogenesis through expansions of clones with segment-specific borders, and renal spheres developing in vitro from individual cells maintain distinct, segment-specific fates. Analysis of mice derived by transfer of color-marked embryonic stem cells (ESCs) into uncolored blastocysts demonstrates that nephrons are polyclonal, developing from expansions of singly fated clones. Finally, we show that adult renal clones are derived from Wnt-responsive precursors, and their tracing in vivo generates tubules that are segment specific. Collectively, these analyses demonstrate that fate-restricted precursors functioning as unipotent progenitors continuously maintain and self-preserve the mouse kidney throughout life.
View details for DOI 10.1016/j.celrep.2014.04.018
View details for Web of Science ID 000336495700033
View details for PubMedID 24835991
Efficient Selection of Biomineralizing DNA Aptamers Using Deep Sequencing and Population Clustering
2014; 8 (1): 387-395
View details for DOI 10.1021/nn404448s
Identifying Stem Cell Gene Expression Patterns and Phenotypic Networks with AutoSOME.
Methods in molecular biology (Clifton, N.J.)
2014; 1150: 115-130
Stem cells have the unique property of differentiation and self-renewal and play critical roles in normal development, tissue repair, and disease. To promote systems-wide analysis of cells and tissues, we developed AutoSOME, a machine-learning method for identifying coordinated gene expression patterns and correlated cellular phenotypes in whole-transcriptome data, without prior knowledge of cluster number or structure. Here, we present a facile primer demonstrating the use of AutoSOME for identification and characterization of stem cell gene expression signatures and for visualization of transcriptome networks using Cytoscape. This protocol should serve as a general foundation for gene expression cluster analysis of stem cells, with applications for studying pluripotency, multi-lineage potential, and neoplastic disease.
View details for DOI 10.1007/978-1-4939-0512-6_6
View details for PubMedID 24743993
The genome sequence of the colonial chordate, Botryllus schlosseri.
Botryllus schlosseri is a colonial urochordate that follows the chordate plan of development following sexual reproduction, but invokes a stem cell-mediated budding program during subsequent rounds of asexual reproduction. As urochordates are considered to be the closest living invertebrate relatives of vertebrates, they are ideal subjects for whole genome sequence analyses. Using a novel method for high-throughput sequencing of eukaryotic genomes, we sequenced and assembled 580 Mbp of the B. schlosseri genome. The genome assembly is comprised of nearly 14,000 intron-containing predicted genes, and 13,500 intron-less predicted genes, 40% of which could be confidently parceled into 13 (of 16 haploid) chromosomes. A comparison of homologous genes between B. schlosseri and other diverse taxonomic groups revealed genomic events underlying the evolution of vertebrates and lymphoid-mediated immunity. The B. schlosseri genome is a community resource for studying alternative modes of reproduction, natural transplantation reactions, and stem cell-mediated regeneration.
View details for DOI 10.7554/eLife.00569
View details for PubMedID 23840927
Systems-level analysis of age-related macular degeneration reveals global biomarkers and phenotype-specific functional networks
Please see related commentary: http://www.biomedcentral.com/1741-7015/10/21/abstract
Age-related macular degeneration (AMD) is a leading cause of blindness that affects the central region of the retinal pigmented epithelium (RPE), choroid, and neural retina. Initially characterized by an accumulation of sub-RPE deposits, AMD leads to progressive retinal degeneration, and in advanced cases, irreversible vision loss. Although genetic analysis, animal models, and cell culture systems have yielded important insights into AMD, the molecular pathways underlying AMD's onset and progression remain poorly delineated. We sought to better understand the molecular underpinnings of this devastating disease by performing the first comparative transcriptome analysis of AMD and normal human donor eyes. RPE-choroid and retina tissue samples were obtained from a common cohort of 31 normal, 26 AMD, and 11 potential pre-AMD human donor eyes. Transcriptome profiles were generated for macular and extramacular regions, and statistical and bioinformatic methods were employed to identify disease-associated gene signatures and functionally enriched protein association networks. Selected genes of high significance were validated using an independent donor cohort. We identified over 50 annotated genes enriched in cell-mediated immune responses that are globally over-expressed in RPE-choroid AMD phenotypes. Using a machine learning model and a second donor cohort, we show that the top 20 global genes are predictive of AMD clinical diagnosis. We also discovered functionally enriched gene sets in the RPE-choroid that delineate the advanced AMD phenotypes, neovascular AMD and geographic atrophy. Moreover, we identified a graded increase of transcript levels in the retina related to wound response, complement cascade, and neurogenesis that strongly correlates with decreased levels of phototransduction transcripts and increased AMD severity. Based on our findings, we assembled protein-protein interactomes that highlight functional networks likely to be involved in AMD pathogenesis. We discovered new global biomarkers and gene expression signatures of AMD. These results are consistent with a model whereby cell-based inflammatory responses represent a central feature of AMD etiology, and depending on genetics, environment, or stochastic factors, may give rise to the advanced AMD phenotypes characterized by angiogenesis and/or cell death. Genes regulating these immunological activities, along with numerous other genes identified here, represent promising new targets for AMD-directed therapeutics and diagnostics.
View details for DOI 10.1186/PREACCEPT-1418491035586234
View details for Web of Science ID 000314566500002
View details for PubMedID 22364233
A proteomic approach for the identification of novel lysine methyltransferase substrates
EPIGENETICS & CHROMATIN
Signaling via protein lysine methylation has been proposed to play a central role in the regulation of many physiologic and pathologic programs. In contrast to other post-translational modifications such as phosphorylation, proteome-wide approaches to investigate lysine methylation networks do not exist. In the current study, we used the ProtoArray® platform, containing over 9,500 human proteins, and developed and optimized a system for proteome-wide identification of novel methylation events catalyzed by the protein lysine methyltransferase (PKMT) SETD6. This enzyme had previously been shown to methylate the transcription factor RelA, but it was not known whether SETD6 had other substrates. By using two independent detection approaches, we identified novel candidate substrates for SETD6, and verified that all targets tested in vitro and in cells were genuine substrates. We describe a novel proteome-wide methodology for the identification of new PKMT substrates. This technological advance may lead to a better understanding of the enzymatic activity and substrate specificity of the large number (more than 50) of PKMTs present in the human proteome, most of which are uncharacterized.
View details for DOI 10.1186/1756-8935-4-19
View details for Web of Science ID 000296832600001
View details for PubMedID 22024134
Global Analysis of Proline-Rich Tandem Repeat Proteins Reveals Broad Phylogenetic Diversity in Plant Secretomes
2011; 6 (8)
Cell walls, constructed by precisely choreographed changes in the plant secretome, play critical roles in plant cell physiology and development. Along with structural polysaccharides, secreted proline-rich Tandem Repeat Proteins (TRPs) are important for cell wall function, yet the evolutionary diversity of these structural TRPs remains virtually unexplored. Using a systems-level computational approach to analyze taxonomically diverse plant sequence data, we identified 31 distinct Pro-rich TRP classes targeted for secretion. This analysis expands upon the known phylogenetic diversity of extensins, the most widely studied class of wall structural proteins, and demonstrates that extensins evolved before plant vascularization. Our results also show that most Pro-rich TRP classes have unexpectedly restricted evolutionary distributions, revealing considerable differences in plant secretome signatures that define unexplored diversity.
View details for DOI 10.1371/journal.pone.0023167
View details for Web of Science ID 000293511900032
View details for PubMedID 21829715
clusterMaker: a multi-algorithm clustering plugin for Cytoscape
2011; 12: 436
View details for DOI 10.1186/1471-2105-12-436
AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number
Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry. We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four. By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at http://jimcooperlab.mcdb.ucsb.edu/autosome.
View details for DOI 10.1186/1471-2105-11-117
View details for Web of Science ID 000276296100002
View details for PubMedID 20202218
XSTREAM: A practical algorithm for identification and architecture modeling of tandem repeats in protein sequences
Biological sequence repeats arranged in tandem patterns are widespread in DNA and proteins. While many software tools have been designed to detect DNA tandem repeats (TRs), useful algorithms for identifying protein TRs with varied levels of degeneracy are still needed. To address limitations of current repeat identification methods, and to provide an efficient and flexible algorithm for the detection and analysis of TRs in protein sequences, we designed and implemented a new computational method called XSTREAM. Running time tests confirm the practicality of XSTREAM for analyses of multi-genome datasets. Each of the key capabilities of XSTREAM (e.g., merging, nesting, long-period detection, and TR architecture modeling) is demonstrated using anecdotal examples, and the utility of XSTREAM for identifying TR proteins was validated using data from a recently published paper. We show that XSTREAM is a practical and valuable tool for TR detection in protein and nucleotide sequences at the multi-genome scale, and an effective tool for modeling TR domains with diverse architectures and varied levels of degeneracy. Because of these useful features, XSTREAM has significant potential for the discovery of naturally-evolved modular proteins with applications for engineering novel biostructural and biomimetic materials, and identifying new vaccine and diagnostic targets.
View details for DOI 10.1186/1471-2105-8-382
View details for Web of Science ID 000252936900001
View details for PubMedID 17931424