Hanlee P. Ji

Professor of Medicine (Oncology) and, by courtesy, of Electrical Engineering

Medicine - Oncology

Practices at Stanford Health Care

Web page: http://dna-discovery.stanford.edu

Clinical Focus

Cancer > GI Oncology
Medical Oncology
Oncology (Cancer)
Gastrointestinal Neoplasms
Inherited Cancer Disorders
Immunotherapy in gastrointestinal cancers

Academic Appointments

Professor, Medicine - Oncology
Professor (By courtesy), Electrical Engineering
Member, Bio-X
Member, Stanford Cancer Institute

Administrative Appointments

Department of Medicine Team Science Division Representative, Department of Medicine, Stanford University (2022 - Present)
Senior Associate Director, Stanford Genome Technology Center (2008 - 2020)

Honors & Awards

Physician-Scientist Fellowship Award, Howard Hughes Medical Institute (1998)
American Association Cancer Research, Scholar-in-Training Award for Research Achievement (2005)
Merit Award for Research Achievement, American Society Clinical Oncology Foundation (2006)
Physician Scientist Early Career Award, Howard Hughes Medical Institute (2008)
Clinical Scientist Development Award, Doris Duke Charitable Foundation (2009)
Research Scholar Award, American Cancer Society (2013)

Professional Education

Fellowship: Stanford University Division of Oncology (2005) CA
Board Certification: American Board of Internal Medicine, Medical Oncology (2004)
Residency: University of Iowa Hospitals and Clinics (1996) IA
Medical Education: Johns Hopkins University School of Medicine (1994) MD
B.A., Reed College, Biology
M.D., Johns Hopkins University, Medicine

Contact

Academic
University - Faculty Department: Medicine - Med/Oncology Position: Professor

Alternate Contact Donna Galvez Administration Ji Research Group drgalvez@stanford.edu

Clinical (Primary) Medical Oncology 875 Blake Wilbur Dr Clinic A MC 6560 Stanford, CA 94305
(650) 498-6000 (office)
(650) 736-4167 (fax)

Additional Clinical Info

Stanford Health Care

Current Research and Scholarly Interests

Our research group integrates new molecular technology development, advanced computational methods and genome biology to identify targets for therapy in cancer. We are pursuing projects focused on developing new therapies for stomach, bile duct and colon cancer. We are also studying the basis of genomic instability by examining chromosome structure.

Ongoing projects include:

1) Immunogenomic approaches to study cancer's interaction with the immune system and improve our understanding of immunotherapy

2) Identification of kinase interactions which can improve targeted therapy strategies

3) Use of advanced genome sequencing technologies including nanopore sequencers to understand the role of cancer rearrangements in response to therapy

4) Identifying genes that increase the risk of developing cancer

5) Developing new approaches for monitoring cancer from circulating DNA

We are developing new technologies for data storage using DNA technologies.

Clinical Trials

Clinical and Pathological Studies of Upper Gastrointestinal Carcinoma Recruiting

Our research on the biology of upper gastrointestinal cancers involves the study of tissue samples and cells from biopsies of persons with gastric or esophageal cancer, as well as blood samples from upper gastrointestinal cancer patients and persons at high inherited risk for these cancers. We hope to learn the role genes and proteins play in the development of gastric and esophageal cancer.

The Gastric Cancer Registry Recruiting

The Gastric Cancer Registry will combine data acquired via an online questionnaire directly from patients with gastric cancer, persons with a family history of gastric cancer in a first- or second-degree relative, or persons with a known germline mutation in their CDH1 (E-cadherin) gene, with genomic data obtained from saliva, blood and tissue samples. The purpose of this registry is to gain a better understanding of the environmental and genetic causes of gastric cancer and to determine whether certain genomic data can predict outcomes of treatment and survival.

2025-26 Courses

Independent Studies (8)
- Biomedical Informatics Teaching Methods
  BMDS 295 (Aut, Win, Spr)
- Directed Reading
  BMDS 299 (Aut, Win, Spr)
- Directed Reading in Medicine
  MED 299 (Aut, Win, Spr, Sum)
- Early Clinical Experience in Medicine
  MED 280 (Aut, Win, Spr, Sum)
- Graduate Research
  MED 399 (Aut, Win, Spr, Sum)
- Medical Scholars Research
  BMDS 370 (Aut, Win, Spr)
- Medical Scholars Research
  MED 370 (Aut, Win, Spr, Sum)
- Undergraduate Research
  MED 199 (Aut, Win, Spr, Sum)
Prior Year Courses
2024-25 Courses
- Translational Genomics Methods
  MED 212C (Spr)
2023-24 Courses
- MTRAM Translational Technologies (TR): Translational genomics
  MED 212C (Spr)
2022-23 Courses
- MTRAM Translational Technologies (TR): Translational genomics
  MED 212C (Spr)

Stanford Advisees

Doctoral Dissertation Reader (AC)
Youlim Kim
Postdoctoral Faculty Sponsor
Xiangqi Bai, Junha Cha, Tianqi Chen, Ji In Kang, Dongin Lee, KyungTae Lee, Jason Liu, Huiyun Sun, Ignacio Wichmann Perez
Postdoctoral Research Mentor
Xiangqi Bai, Tianqi Chen

Graduate and Fellowship Programs

Biomedical Data Science (PhD Program)
Cancer Biology (PhD Program)
Medicine (Masters Program)

All Publications

The single-cell spatial landscape of stage III colorectal cancers. NPJ precision oncology Su, A., Lee, H., Tran, M., Dela Cruz, R. C., Sathe, A., Bai, X., Wichmann, I., Pflieger, L., Moulton, B., Barker, T., Haslem, D., Jones, D., Nadauld, L., Nguyen, Q., Ji, H. P., Rhodes, T. 2025; 9 (1): 101

Abstract

We conducted a spatial analysis of stage III colorectal adenocarcinomas using Hyperion Imaging Mass Cytometry, examining 52 tumors to assess the tumor microenvironment at the single-cell level. This approach identified 10 distinct cell phenotypes in the tumor microenvironment, including stromal and immune cells, with a subset showing a proliferative phenotype. By focusing on spatial neighborhood interactions and tissue niches, particularly regions with tumor-infiltrating lymphocytes, we investigated how cellular organization relates to clinicopathological and molecular features such as microsatellite instability (MSI) and recurrence. We determined that microsatellite stable (MSS) colorectal cancers had an increased risk of recurrence if they had the following features: 1) a low level of stromal tumor-infiltrating lymphocytes, and 2) low interactions between CD4+ T cells and stromal cells. Our results point to the utility of spatial single-cell interaction analysis in defining novel features of the tumor immune microenvironments and providing useful clinical cell-related spatial biomarkers.

View details for DOI 10.1038/s41698-025-00853-5

View details for PubMedID 40189697

View details for PubMedCentralID PMC11973205
A spatial transcriptomic signature of 26 genes resolved at single-cell resolution characterizes high-risk gastric cancer precursors. NPJ precision oncology Huang, R. J., Wichmann, I. A., Su, A., Sathe, A., Shum, M. V., Grimes, S. M., Meka, R., Almeda, A., Bai, X., Shen, J., Nguyen, Q., Luo, I., Han, S. S., Amieva, M. R., Hwang, J. H., Ji, H. P. 2025; 9 (1): 52

Abstract

Gastric cancer precursors demonstrate highly-variable rates of progression toward neoplasia. Certain high-risk precursors, such as gastric intestinal metaplasia with advanced histologic features, may be at up to 30-fold increased risk for progression compared to lower-risk intestinal metaplasia. The biological differences between high- and low-risk lesions have been incompletely explored. In this study, we use several clinical cohorts to characterize the microenvironment of advanced gastric cancer precursors relative to low-risk lesions using bulk, spatial, and single-cell gene expression assays. We identified a 26-gene panel which is associated with advanced lesions, localizes to metaplastic glands on histopathology, and is expressed in aberrant mature and immature intestinal cells not normally present in the healthy stomach. This gene expression signature suggests an important role of the immature intestinal lineages in promoting carcinogenesis in the metaplastic microenvironment. These findings may help to inform future biomarker development and strategies of gastric cancer prevention.

View details for DOI 10.1038/s41698-025-00816-w

View details for PubMedID 40000871

View details for PubMedCentralID 5879496
Distinct gene signatures define the epithelial cell features of mucinous appendiceal neoplasms and pseudomyxoma metastases. Frontiers in genetics Ayala, C., Sathe, A., Bai, X., Grimes, S. M., Shen, J., Poultsides, G. A., Lee, B., Ji, H. P. 2025; 16: 1536982

Abstract

Appendiceal mucinous neoplasms (AMN) are rare tumors of the gastrointestinal tract. They metastasize with widespread abdominal dissemination leading to pseudomyxoma peritonei (PMP), a disease with poor prognosis. There are many unknowns about the cellular features of origin, differentiation and progression of AMN and PMP. We characterized AMNs, PMPs and matched normal tissues using single-cell RNA-sequencing. We validated our findings with immunohistochemistry, mass spectrometry on malignant ascites from PMP patients and gene expression data from an independent set of PMP tumors. We identified previously undescribed cellular features and heterogeneity in AMN and PMP tumors. There were gene expression signatures specific to the tumor epithelial cells among AMN and PMP. These signatures included genes indicative of goblet cell differentiation and elevated mucin gene expression. Metastatic PMP cells had a distinct gene expression signature with increased lipid metabolism, inflammatory, JAK-STAT and RAS signaling pathway among others. We observed clonal heterogeneity in a single PMP tumor as well as PMP metastases from the same patient. Our study defined tumor cell gene signatures of AMN and PMP, successfully overcoming challenges of low cellularity and mucinous composition of these tumors. These gene expression signatures provide insights on tumor origin and differentiation, together with the identification of novel treatment targets. The heterogeneity observed within an individual tumor and between different tumors from the same patient represents a potential source of treatment resistance.

View details for DOI 10.3389/fgene.2025.1536982

View details for PubMedID 40018643

View details for PubMedCentralID PMC11865047
The Gastric Cancer Registry: A multi-omic cellular and molecular resource for cancer biomarker and therapeutic discovery. Ji, H. P., Meka, R., Perez, I., Grimes, S. M., Lee, H., Wang, Y., Sathe, A. LIPPINCOTT WILLIAMS & WILKINS. 2025: 491

View details for DOI 10.1200/JCO.2025.43.4_suppl.491

View details for Web of Science ID 001417516100019
A phase I clinical trial adding OX40 agonism to in situ therapeutic cancer vaccination in patients with low-grade B cell lymphoma highlights challenges in translation from mouse to human studies. Clinical cancer research : an official journal of the American Association for Cancer Research Shree, T., Czerwinski, D., Haebe, S., Sathe, A., Grimes, S., Martin, B., Ozawa, M., Hoppe, R., Ji, H., Levy, R. 2025

Abstract

Activating T cell costimulatory receptors is a promising approach for cancer immunotherapy. In preclinical work, adding an OX40 agonist to in situ vaccination (ISV) with SD101, a TLR9 agonist, was curative in a mouse model of lymphoma. We sought to test this combination in a Phase I clinical trial for patients with low-grade B cell lymphoma. We treated 14 patients with low-dose radiation, intratumoral SD101, and intratumoral and intravenous BMS986178, an agonistic anti-OX40 antibody. The primary outcome was safety. Secondary outcomes included overall response rate and progression-free survival. Adverse events were consistent with prior experience with low-dose radiation and SD101. No synergistic or dose-limiting toxicities were observed. One patient had a partial response, and 9 patients had stable disease, a result inferior to our experience with TLR9 agonism and low-dose radiation alone. Flow cytometry and single cell RNA sequencing of serial tumor biopsies revealed that T and NK cells were activated after treatment. However, high baseline OX40 expression on T follicular helper and T regulatory type 1 cells, as well as high post-treatment soluble OX40, shed from these T cells upon activation, associated with progression-free survival of less than 6 months. Clinical results of T cell costimulatory receptor agonism have now repeatedly been inferior to the motivating preclinical results. Our study highlights potential barriers to clinical translation, particularly differences in preclinical and clinical reagents and the complex biology of these coreceptors in heterogeneous T cell subpopulations, some of which may antagonize immunotherapy.

View details for DOI 10.1158/1078-0432.CCR-24-2770

View details for PubMedID 39745391
Altered chromatin landscape and 3D interactions associated with primary constitutional MLH1 epimutations. Clinical epigenetics Climent-Cantó, P., Subirana-Granés, M., Ramos-Rodríguez, M., Dámaso, E., Marín, F., Vara, C., Pérez-González, B., Raurell, H., Munté, E., Soto, J. L., Alonso, Á., Shin, G., Ji, H., Hitchins, M., Capellá, G., Pasquali, L., Pineda, M. 2024; 16 (1): 193

Abstract

Lynch syndrome (LS), characterised by an increased risk for cancer, is mainly caused by germline pathogenic variants affecting a mismatch repair gene (MLH1, MSH2, MSH6, PMS2). Occasionally, LS may be caused by constitutional MLH1 epimutation (CME) characterised by soma-wide methylation of one allele of the MLH1 promoter. Most of these are "primary" epimutations, arising de novo without any apparent underlying cis-genetic cause, and are reversible between generations. We aimed to characterise genetic and gene regulatory changes associated with primary CME to elucidate possible underlying molecular mechanisms. Four carriers of a primary CME and three non-methylated relatives carrying the same genetic haplotype were included. Genetic alterations were sought using linked-read WGS in blood DNA. Transcriptome (RNA-seq), chromatin landscape (ATAC-seq, H3K27ac CUT&Tag) and 3D chromatin interactions (UMI-4C) were studied in lymphoblastoid cell lines. The MLH1 promoter SNP (c.-93G>A, rs1800734) was used as a reporter in heterozygotes to assess allele-specific chromatin conformation states. MLH1 epimutant alleles presented a closed chromatin conformation and decreased levels of H3K27ac, as compared to the unmethylated allele. Moreover, the epimutant MLH1 promoter exhibited differential 3D chromatin contacts, including lost and gained interactions with distal regulatory elements. Of note, rare genetic alterations potentially affecting transcription factor binding sites were found in the promoter-contacting region of CME carriers. Primary CMEs present allele-specific differential interaction patterns with neighbouring genes and regulatory elements. The role of the identified cis-regulatory regions in the molecular mechanism underlying the origin and maintenance of CME requires further investigation.

View details for DOI 10.1186/s13148-024-01770-3

View details for PubMedID 39741348

View details for PubMedCentralID 6203851
The single-cell spatial landscape of stage III colorectal cancers. bioRxiv : the preprint server for biology Su, A., Lee, H., Tran, M., Cruz, R. D., Sathe, A., Bai, X., Wichmann, I., Pflieger, L., Moulton, B., Barker, T., Haslem, D., Jones, D., Nadauld, L., Nguyen, Q., Ji, H. P., Rhodes, T. 2024

Abstract

We conducted a spatial analysis using imaging mass cytometry applied to stage III colorectal adenocarcinomas. This study used multiplexed markers to distinguish individual cells and their spatial organization from 52 colorectal cancers. We determined the landscape features of cellular spatial features in the CRC tumor microenvironment. This spatial single-cell analysis identified 10 unique cell phenotypes in the tumor microenvironment that included stromal and immune cells with a subset which had a proliferative phenotype. These special features included spatial neighborhood interactions between single cells as well as different tissue niches, especially the tumor infiltrating lymphocyte regions. We applied a robust statistical analysis to identify significant correlations of cell features with phenotypes such as microsatellite instability or recurrence. We determined that microsatellite stable (MSS) colorectal cancers had an increased risk of recurrence if they had the following features: 1) a low level of stromal tumor-infiltrating lymphocytes, and 2) low interactions between CD4+ T cells and stromal cells. Our results point to the utility of spatial single-cell interaction analysis in defining novel features of the tumor immune microenvironments and providing useful clinical cell-related spatial biomarkers.

View details for DOI 10.1101/2024.11.07.622577

View details for PubMedID 39605367

View details for PubMedCentralID PMC11601238
Brain metastases from esophageal cancer: A retrospective review from a single institution. World neurosurgery Touponse, G. C., Li, G., Tai, J. W., Rodrigues, A. J., Granucci, M., Burnside, G., Bhambhvani, H. P., Han, S. S., Ji, H. P., Gephart, M. H. 2024

Abstract

Patients with brain metastases (BrM) from esophageal cancer have a poor prognosis, and the incidence of BrM is expected to rise due to improved survival from the primary tumor and increased neuroimaging. We aimed to identify patient and esophageal cancer characteristics associated with shorter survival in patients with BrM and, secondly, to compare the prognosis of patients with HER2 overexpression. We retrospectively reviewed patients with BrM from esophageal cancer at a single institution from 2008 to 2021. We collected patient demographics, primary tumor and BrM characteristics, and treatment. Our primary outcome was median survival from the time of BrM. The median age at primary diagnosis was 66.5 years and 86% were male. Of the 49 patients, 71% had adenocarcinoma, 20% squamous cell carcinoma and 8% other. 71% of patients presented with stage III or IV disease, including 16% with synchronous primary and BrM. The median time to BrM was 10.1 months (IQR 1.7-22.8) and the median survival from BrM was 8.4 months (95% CI 4.8-16.8). On multivariable analysis, treatment with stereotactic radiosurgery (HR=0.19; p=0.04), surgical resection (HR=0.24; p=0.03) and immunotherapy (HR=0.19; p=0.04) were associated with increased survival while KPS ≤70 (HR=13.2; p<0.001) was associated with decreased survival. HER2 overexpression was found in 22% of patients, but we noted no survival difference (5.2 months HER2+ versus 9.8 months HER2neg; p=0.95). The median survival from esophageal-to-brain metastasis was 8.4 months. A single lesion, KPS score >70, and treatment with surgical resection were correlated with improved survival. Further, HER2+ patients had distinct patient and BrM characteristics.

View details for DOI 10.1016/j.wneu.2024.09.085

View details for PubMedID 39321918
Resolving the 22q11.2 deletion using CTLR-Seq reveals chromosomal rearrangement mechanisms and individual variance in breakpoints. Proceedings of the National Academy of Sciences of the United States of America Zhou, B., Purmann, C., Guo, H., Shin, G., Huang, Y., Pattni, R., Meng, Q., Greer, S. U., Roychowdhury, T., Wood, R. N., Ho, M., Dohna, H. Z., Abyzov, A., Hallmayer, J. F., Wong, W. H., Ji, H. P., Urban, A. E. 2024; 121 (31): e2322834121

Abstract

We developed a generally applicable method, CRISPR/Cas9-targeted long-read sequencing (CTLR-Seq), to resolve, haplotype-specifically, the large and complex regions in the human genome that had been previously impenetrable to sequencing analysis, such as large segmental duplications (SegDups) and their associated genome rearrangements. CTLR-Seq combines in vitro Cas9-mediated cutting of the genome and pulsed-field gel electrophoresis to isolate intact large (i.e., up to 2,000 kb) genomic regions that encompass previously unresolvable genomic sequences. These targets are then sequenced (amplification-free) at high on-target coverage using long-read sequencing, allowing for their complete sequence assembly. We applied CTLR-Seq to the SegDup-mediated rearrangements that constitute the boundaries of, and give rise to, the 22q11.2 Deletion Syndrome (22q11DS), the most common human microdeletion disorder. We then performed de novo assembly to resolve, at base-pair resolution, the full sequence rearrangements and exact chromosomal breakpoints of 22q11DS (including all common subtypes). Across multiple patients, we found a high degree of variability for both the rearranged SegDup sequences and the exact chromosomal breakpoint locations, which coincide with various transposons within the 22q11.2 SegDups, suggesting that 22q11DS can be driven by transposon-mediated genome recombination. Guided by CTLR-Seq results from two 22q11DS patients, we performed three-dimensional chromosomal folding analysis for the 22q11.2 SegDups from patient-derived neurons and astrocytes and found chromosome interactions anchored within the SegDups to be both cell type-specific and patient-specific. Lastly, we demonstrated that CTLR-Seq enables cell-type specific analysis of DNA methylation patterns within the deletion haplotype of 22q11DS.

View details for DOI 10.1073/pnas.2322834121

View details for PubMedID 39042694
Single cell transcriptomic analysis reveals differences between primary appendiceal tumors. Ayala, C. I., Sathe, A., Bai, X., Grimes, S., Lee, B., Ji, H. P. SPRINGER. 2024: S230

View details for Web of Science ID 001185577500517
Niche-DE: niche-differential gene expression analysis in spatial transcriptomics data identifies context-dependent cell-cell interactions. Genome biology Mason, K., Sathe, A., Hess, P. R., Rong, J., Wu, C. Y., Furth, E., Susztak, K., Levinsohn, J., Ji, H. P., Zhang, N. 2024; 25 (1): 14

Abstract

Existing methods for analysis of spatial transcriptomic data focus on delineating the global gene expression variations of cell types across the tissue, rather than local gene expression changes driven by cell-cell interactions. We propose a new statistical procedure called niche-differential expression (niche-DE) analysis that identifies cell-type-specific niche-associated genes, which are differentially expressed within a specific cell type in the context of specific spatial niches. We further develop niche-LR, a method to reveal ligand-receptor signaling mechanisms that underlie niche-differential gene expression patterns. Niche-DE and niche-LR are applicable to low-resolution spot-based spatial transcriptomics data and data that is single-cell or subcellular in resolution.

View details for DOI 10.1186/s13059-023-03159-6

View details for PubMedID 38217002

View details for PubMedCentralID 6765259
Detection and analysis of complex structural variation in human genomes across populations and in brains of donors with psychiatric disorders. Cell Zhou, B., Arthur, J. G., Guo, H., et al. 2024; Published online September 30, 2024

View details for DOI 10.1016/j.cell.2024.09.014
GITR and TIGIT immunotherapy provokes divergent multicellular responses in the tumor microenvironment of gastrointestinal cancers. Genome medicine Sathe, A., Ayala, C., Bai, X., Grimes, S. M., Lee, B., Kin, C., Shelton, A., Poultsides, G., Ji, H. P. 2023; 15 (1): 100

Abstract

Understanding the mechanistic effects of novel immunotherapy agents is critical to improving their successful clinical translation. These effects need to be studied in preclinical models that maintain the heterogeneous tumor microenvironment (TME) and dysfunctional cell states found in a patient's tumor. We investigated immunotherapy perturbations targeting co-stimulatory molecule GITR and co-inhibitory immune checkpoint TIGIT in a patient-derived ex vivo system that maintains the TME in its near-native state. Leveraging single-cell genomics, we identified cell type-specific transcriptional reprogramming in response to immunotherapy perturbations. We generated ex vivo tumor slice cultures from fresh surgical resections of gastric and colon cancer and treated them with GITR agonist or TIGIT antagonist antibodies. We applied paired single-cell RNA and TCR sequencing to the original surgical resections, control, and treated ex vivo tumor slice cultures. We additionally confirmed target expression using multiplex immunofluorescence and validated our findings with RNA in situ hybridization. We confirmed that tumor slice cultures maintained the cell types, transcriptional cell states and proportions of the original surgical resection. The GITR agonist was limited to increasing effector gene expression only in cytotoxic CD8 T cells. Dysfunctional exhausted CD8 T cells did not respond to GITR agonist. In contrast, the TIGIT antagonist increased TCR signaling and activated both cytotoxic and dysfunctional CD8 T cells. This included cells corresponding to TCR clonotypes with features indicative of potential tumor antigen reactivity. The TIGIT antagonist also activated T follicular helper-like cells and dendritic cells, and reduced markers of immunosuppression in regulatory T cells. We identified novel cellular mechanisms of action of GITR and TIGIT immunotherapy in the patients' TME. Unlike the GITR agonist that generated a limited transcriptional response, TIGIT antagonist orchestrated a multicellular response involving CD8 T cells, T follicular helper-like cells, dendritic cells, and regulatory T cells. Our experimental strategy combining single-cell genomics with preclinical models can successfully identify mechanisms of action of novel immunotherapy agents. Understanding the cellular and transcriptional mechanisms of response or resistance will aid in prioritization of targets and their clinical translation.

View details for DOI 10.1186/s13073-023-01259-3

View details for PubMedID 38008725

View details for PubMedCentralID PMC10680277
A clinical trial of therapeutic vaccination in lymphoma with serial tumor sampling and single cell analysis. Blood advances Shree, T., Haebe, S. E., Czerwinski, D. K., Eckhert, E., Day, G., Sathe, A., Grimes, S. M., Frank, M. J., Maeda, L., Alizadeh, A. A., Advani, R. H., Hoppe, R. T., Long, S. R., Martin, B. A., Ozawa, M. G., Khodadoust, M. S., Ji, H. P., Levy, R. 2023

Abstract

In situ vaccination (ISV) triggers an immune response to tumor-associated antigens at one tumor site that can then tackle disease throughout the body. Here we report clinical and biological results of a phase I/II ISV trial in patients with low-grade lymphoma (NCT02927964) combining an intratumoral TLR9 agonist with local low-dose radiation, and ibrutinib (an inhibitor of B and T cell kinases). Adverse events were predominately low grade. The overall response rate was 50%, including one complete response. All patients experienced tumor reduction at distant sites. Single cell analyses of serial fine needle aspirates from injected and uninjected tumors revealed correlates of clinical response, such as lower CD47 and higher MHCII expression on tumor cells, enhanced T and NK cell effector function, and reduced immune suppression from TGFβ and inhibitory T regulatory 1 cells. While changes at the local injected site were more pronounced, changes at distant uninjected sites more often associated with clinical responses. Functional immune response assays and tracking of T cell receptor sequences provided evidence of treatment-induced tumor-specific T cell responses. Induction of immune effectors and reversal of negative regulators were both important in producing clinically meaningful tumor responses. NCT02927964.

View details for DOI 10.1182/bloodadvances.2023011589

View details for PubMedID 37939259
Co-Occurrence of Clonally Related Follicular Lymphoma and Histiocytic Sarcoma. Haebe, S., Czerwinski, D. K., Sathe, A., Grimes, S., Chen, T., Martin, B., Ji, H., Levy, R., Shree, T. AMER SOC HEMATOLOGY. 2023

View details for DOI 10.1182/blood-2023-189712

View details for Web of Science ID 001159900804058
A spatially mapped gene expression signature for intestinal stem-like cells identifies high-risk precursors of gastric cancer. bioRxiv : the preprint server for biology Huang, R. J., Wichmann, I. A., Su, A., Sathe, A., Shum, M. V., Grimes, S. M., Meka, R., Almeda, A., Bai, X., Shen, J., Nguyen, Q., Amieva, M. R., Hwang, J. H., Ji, H. P. 2023

Abstract

Gastric intestinal metaplasia (GIM) is a precancerous lesion that increases gastric cancer (GC) risk. The Operative Link on GIM (OLGIM) is a combined clinical-histopathologic system to risk-stratify patients with GIM. The identification of molecular biomarkers that are indicators for advanced OLGIM lesions may improve cancer prevention efforts. This study was based on clinical and genomic data from four cohorts: 1) GAPS, a GIM cohort with detailed OLGIM severity scoring (N=303 samples); 2) the Cancer Genome Atlas (N=198); 3) a collation of in-house and publicly available scRNA-seq data (N=40), and 4) a spatial validation cohort (N=5) consisting of annotated histology slides of patients with either GC or advanced GIM. We used a multi-omics pipeline to identify, validate and sequentially parse a highly-refined signature of 26 genes which characterize high-risk GIM. Using standard RNA-seq, we analyzed two separate, non-overlapping discovery (N=88) and validation (N=215) sets of GIM. In the discovery phase, we identified 105 upregulated genes specific for high-risk GIM (defined as OLGIM III-IV), of which 100 genes were independently confirmed in the validation set. Spatial transcriptomic profiling revealed 36 of these 100 genes to be expressed in metaplastic foci in GIM. Comparison with bulk GC sequencing data revealed 26 of these genes to be expressed in intestinal-type GC. Single-cell profiling resolved the 26-gene signature to both mature intestinal lineages (goblet cells, enterocytes) and immature intestinal lineages (stem-like cells). A subset of these genes was further validated using single-molecule multiplex fluorescence in situ hybridization. We found certain genes (TFF3 and ANPEP) to mark differentiated intestinal lineages, whereas others (OLFM4 and CPS1) localized to immature cells in the isthmic/crypt region of metaplastic glands, consistent with the findings from scRNAseq analysis. Using an integrated multi-omics approach, we identified a novel 26-gene expression signature for high-OLGIM precursors at increased risk for GC. We found this signature localizes to aberrant intestinal stem-like cells within the metaplastic microenvironment. These findings hold important translational significance for future prevention and early detection efforts.

View details for DOI 10.1101/2023.09.20.558462

View details for PubMedID 37786704

View details for PubMedCentralID PMC10541579
Direct measurement of engineered cancer mutations and their transcriptional phenotypes in single cells. Nature biotechnology Kim, H. S., Grimes, S. M., Chen, T., Sathe, A., Lau, B. T., Hwang, G. H., Bae, S., Ji, H. P. 2023

Abstract

Genome sequencing studies have identified numerous cancer mutations across a wide spectrum of tumor types, but determining the phenotypic consequence of these mutations remains a challenge. Here, we developed a high-throughput, multiplexed single-cell technology called TISCC-seq to engineer predesignated mutations in cells using CRISPR base editors, directly delineate their genotype among individual cells and determine each mutation's transcriptional phenotype. Long-read sequencing of the target gene's transcript identifies the engineered mutations, and the transcriptome profile from the same set of cells is simultaneously analyzed by short-read sequencing. Through integration, we determine the mutations' genotype and expression phenotype at single-cell resolution. Using cell lines, we engineer and evaluate the impact of >100 TP53 mutations on gene expression. Based on the single-cell gene expression, we classify the mutations as having a functionally significant phenotype.

View details for DOI 10.1038/s41587-023-01949-8

View details for PubMedID 37697151

View details for PubMedCentralID 8018281
Follicular lymphoma evolves with a surmountable dependency on acquired glycosylation motifs in the B cell receptor. Blood Haebe, S. E., Day, G., Czerwinski, D. K., Sathe, A., Grimes, S. M., Chen, T., Long, S. R., Martin, B. A., Ozawa, M. G., Ji, H. P., Shree, T., Levy, R. 2023

Abstract

An early event in the genesis of follicular lymphoma (FL) is the acquisition of new glycosylation motifs in the B cell receptor (BCR) due to gene rearrangement and/or somatic hypermutation. These N-linked glycosylation motifs (N-motifs) contain mannose-terminated glycans and can interact with lectins in the tumor microenvironment, activating the tumor BCR pathway. N-motifs are stable during FL evolution suggesting that FL tumor cells are dependent on them for their survival. Here, we investigated the dynamics and potential impact of N-motif prevalence in FL at the single cell level across distinct tumor sites and over time in 17 patients. While most patients had acquired at least one N-motif as an early event, we also found (i) cases without N-motifs in the heavy or light chains at any tumor site or timepoint and (ii) cases with discordant N-motif patterns across different tumor sites. Inferring phylogenetic trees for the patients with discordant patterns, we observed that both N-motif-positive and N-motif-negative tumor subclones could be selected and expanded during tumor evolution. Comparing N-motif-positive to N-motif-negative tumor cells within a patient revealed higher expression of genes involved in the BCR pathway and inflammatory response, while tumor cells without N-motifs had higher activity of pathways involved in energy metabolism. In conclusion, while acquired N-motifs likely support FL pathogenesis through antigen-independent BCR signaling in most FL patients, N-motif-negative tumor cells can also be selected and expanded and may depend more heavily on altered metabolism for competitive survival.

View details for DOI 10.1182/blood.2023020360

View details for PubMedID 37683139
Single-cell multi-gene identification of somatic mutations and gene rearrangements in cancer. NAR cancer Grimes, S. M., Kim, H. S., Roy, S., Sathe, A., Ayala, C. I., Bai, X., Almeda-Notestine, A. F., Haebe, S., Shree, T., Levy, R., Lau, B. T., Ji, H. P. 2023; 5 (3): zcad034

Abstract

In this proof-of-concept study, we developed a single-cell method that provides genotypes of somatic alterations found in coding regions of messenger RNAs and integrates these transcript-based variants with their matching cell transcriptomes. We used nanopore adaptive sampling on single-cell complementary DNA libraries to validate coding variants in target gene transcripts, and short-read sequencing to characterize cell types harboring the mutations. CRISPR edits for 16 targets were identified using a cancer cell line, and known variants in the cell line were validated using a 352-gene panel. Variants in primary cancer samples were validated using target gene panels ranging from 161 to 529 genes. A gene rearrangement was also identified in one patient, with the rearrangement occurring in two distinct tumor sites.

View details for DOI 10.1093/narcan/zcad034

View details for PubMedID 37435532

View details for PubMedCentralID PMC10331933
Pan-conserved segment tags identify ultra-conserved sequences across assemblies in the human pangenome. Cell reports methods Lee, H., Greer, S. U., Pavlichin, D. S., Zhou, B., Urban, A. E., Weissman, T., Ji, H. P. 2023; 3 (8): 100543

Abstract

The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere reference assembly. Our k-mer indexing approach enabled us to identify a valuable collection of universally conserved sequences across all assemblies, referred to as "pan-conserved segment tags" (PSTs). By examining intervals between these segments, we discerned highly conserved genomic segments and those with structurally related polymorphisms. We found 60,764 polymorphic intervals with unique geo-ethnic features in the pangenome reference. In this study, we utilized ultra-conserved sequences (PSTs) to forge a link between human pangenome assemblies and reference genomes. This methodology enables the examination of any sequence of interest within the pangenome, using the reference genome as a comparative framework.

View details for DOI 10.1016/j.crmeth.2023.100543

View details for PubMedID 37671027

View details for PubMedCentralID PMC10475782
Transitioning single-cell genomics into the clinic. Nature reviews. Genetics Lim, J., Chin, V., Fairfax, K., Moutinho, C., Suan, D., Ji, H., Powell, J. E. 2023

Abstract

The use of genomics is firmly established in clinical practice, resulting in innovations across a wide range of disciplines such as genetic screening, rare disease diagnosis and molecularly guided therapy choice. This new field of genomic medicine has led to improvements in patient outcomes. However, most clinical applications of genomics rely on information generated from bulk approaches, which do not directly capture the genomic variation that underlies cellular heterogeneity. With the advent of single-cell technologies, research is rapidly uncovering how genomic data at cellular resolution can be used to understand disease pathology and mechanisms. Both DNA-based and RNA-based single-cell technologies have the potential to improve existing clinical applications and open new application spaces for genomics in clinical practice, with oncology, immunology and haematology poised for initial adoption. However, challenges in translating cellular genomics from research to a clinical setting must first be overcome.

View details for DOI 10.1038/s41576-023-00613-w

View details for PubMedID 37258725

View details for PubMedCentralID 5835770
Magnetic DNA random access memory with nanopore readouts and exponentially-scaled combinatorial addressing. Scientific reports Lau, B., Chandak, S., Roy, S., Tatwawadi, K., Wootters, M., Weissman, T., Ji, H. P. 2023; 13 (1): 8514

Abstract

The storage of data in DNA typically involves encoding and synthesizing data into short oligonucleotides, followed by reading with a sequencing instrument. Major challenges include the molecular consumption of synthesized DNA, basecalling errors, and limitations with scaling up read operations for individual data elements. Addressing these challenges, we describe a DNA storage system called MDRAM (Magnetic DNA-based Random Access Memory) that enables repetitive and efficient readouts of targeted files with nanopore-based sequencing. By conjugating synthesized DNA to magnetic agarose beads, we enabled repeated data readouts while preserving the original DNA analyte and maintaining data readout quality. MDRAM utilizes an efficient convolutional coding scheme that leverages soft information in raw nanopore sequencing signals to achieve information reading costs comparable to Illumina sequencing despite higher error rates. Finally, we demonstrate a proof-of-concept DNA-based proto-filesystem that enables an exponentially-scalable data address space using only small numbers of targeting primers for assembly and readout.

View details for DOI 10.1038/s41598-023-29575-z

View details for PubMedID 37231057
Short Tandem Repeat DNA Profiling Using Perylene-Oligonucleotide Fluorescence Assay. Analytical chemistry Hernandez Bustos, A., Martiny, E., Bom Pedersen, N., Parvathaneni, R. P., Hansen, J., Ji, H. P., Astakhova, K. 2023

Abstract

We report an amplification-free genotyping method to determine the number of human short tandem repeats (STRs). DNA-based STR profiling is a robust method for genetic identification purposes such as forensics and biobanking and for identifying specific molecular subtypes of cancer. STR detection requires polymerase amplification, which introduces errors that obscure the correct genotype. We developed a new method that requires no polymerase. First, we synthesized perylene-nucleoside reagents and incorporated them into oligonucleotide probes that recognize five common human STRs. Using these probes and a bead-based hybridization approach, accurate STR detection was achieved in only 1.5 h, including DNA preparation steps, with up to a 1000-fold target DNA enrichment. This method was comparable to PCR-based assays. Using standard fluorometry, the limit of detection was 2.00 ± 0.07 pM for a given target. We used this assay to accurately identify STRs from 50 human subjects, achieving >98% consensus with sequencing data for STR genotyping.

View details for DOI 10.1021/acs.analchem.3c00063

View details for PubMedID 37183373
Pangenome graph construction from genome alignments with Minigraph-Cactus NATURE BIOTECHNOLOGY Hickey, G., Monlong, J., Ebler, J., Novak, A. M., Eizenga, J. M., Gao, Y., Marschall, T., Li, H., Paten, B., Abel, H. J., Antonacci-Fulton, L. L., Asri, M., Baid, G., Baker, C. A., Belyaeva, A., Billis, K., Bourque, G., Buonaiuto, S., Carroll, A., Chaisson, M. J. P., Chang, P., Chang, X. H., Cheng, H., Chu, J., Cody, S., Colonna, V., Cook, D. E., Cook-Deegan, R. M., Cornejo, O. E., Diekhans, M., Doerr, D., Ebert, P., Ebler, J., Eichler, E. E., Eizenga, J. M., Fairley, S., Fedrigo, O., Felsenfeld, A. L., Feng, X., Fischer, C., Flicek, P., Formenti, G., Frankish, A., Fulton, R. S., Gao, Y., Garg, S., Garrison, E., Garrison, N. A., Giron, C., Green, R. E., Groza, C., Guarracino, A., Haggerty, L., Hall, I. M., Harvey, W. T., Haukness, M., Haussler, D., Heumos, S., Hickey, G., Hoekzema, K., Hourlier, T., Howe, K., Jain, M., Jarvis, E. D., Ji, H. P., Kenny, E. E., Koenig, B. A., Kolesnikov, A., Korbel, J. O., Kordosky, J., Koren, S., Lee, H., Lewis, A. P., Liao, W., Lu, S., Lu, T., Lucas, J. K., Hugo, M., Santiago, M., Marijon, P., Markello, C., Marschall, T., Martin, F. J., McCartney, A., McDaniel, J., Miga, K. H., Mitchell, M. W., Monlong, J., Mountcastle, J., Munson, K. M., Mwaniki, M., Nattestad, M., Novak, A. M., Nurk, S., Olsen, H. E., Olson, N. D., Pesout, T., Phillippy, A. M., Popejoy, A. B., Porubsky, D., Prins, P., Puiu, D., Rautiainen, M., Regier, A. A., Rhie, A., Sacco, S., Sanders, A. D., Schneider, V. A., Schultz, B., Shafin, K., Sibbesen, J. A., Siren, J., Smith, M. W., Sofia, H. J., Abou Tayoun, A. N., Thibaud-Nissen, F., Tomlinson, C., Tricomi, F., Villani, F., Vollger, M. R., Wagner, J., Walenz, B., Wang, T., Wood, J. M. D., Zimin, A., Zook, J. M., Human Pangenome Reference 2023

Abstract

Pangenome references address biases of reference genomes by storing a representative set of diverse haplotypes and their alignment, usually as a graph. Alternate alleles determined by variant callers can be used to construct pangenome graphs, but advances in long-read sequencing are leading to widely available, high-quality phased assemblies. Constructing a pangenome graph directly from assemblies, as opposed to variant calls, leverages the graph's ability to represent variation at different scales. Here we present the Minigraph-Cactus pangenome pipeline, which creates pangenomes directly from whole-genome alignments, and demonstrate its ability to scale to 90 human haplotypes from the Human Pangenome Reference Consortium. The method builds graphs containing all forms of genetic variation while allowing use of current mapping and genotyping tools. We measure the effect of the quality and completeness of reference genomes used for analysis within the pangenomes and show that using the CHM13 reference from the Telomere-to-Telomere Consortium improves the accuracy of our methods. We also demonstrate construction of a Drosophila melanogaster pangenome.

View details for DOI 10.1038/s41587-023-01793-w

View details for Web of Science ID 000992565300001

View details for PubMedID 37165083

View details for PubMedCentralID 8006571
Single-molecule methylation profiles of cell-free DNA in cancer with nanopore sequencing. Genome medicine Lau, B. T., Almeda, A., Schauer, M., McNamara, M., Bai, X., Meng, Q., Partha, M., Grimes, S. M., Lee, H., Heestand, G. M., Ji, H. P. 2023; 15 (1): 33

Abstract

Epigenetic characterization of cell-free DNA (cfDNA) is an emerging approach for detecting and characterizing diseases such as cancer. We developed a strategy using nanopore-based single-molecule sequencing to measure cfDNA methylomes. This approach generated up to 200 million reads for a single cfDNA sample from cancer patients, an order of magnitude improvement over existing nanopore sequencing methods. We developed a single-molecule classifier to determine whether individual reads originated from a tumor or immune cells. Leveraging methylomes of matched tumors and immune cells, we characterized cfDNA methylomes of cancer patients for longitudinal monitoring during treatment.

View details for DOI 10.1186/s13073-023-01178-3

View details for PubMedID 37138315

View details for PubMedCentralID 1283450
A draft human pangenome reference. Nature Liao, W. W., Asri, M., Ebler, J., Doerr, D., Haukness, M., Hickey, G., Lu, S., Lucas, J. K., Monlong, J., Abel, H. J., Buonaiuto, S., Chang, X. H., Cheng, H., Chu, J., Colonna, V., Eizenga, J. M., Feng, X., Fischer, C., Fulton, R. S., Garg, S., Groza, C., Guarracino, A., Harvey, W. T., Heumos, S., Howe, K., Jain, M., Lu, T. Y., Markello, C., Martin, F. J., Mitchell, M. W., Munson, K. M., Mwaniki, M. N., Novak, A. M., Olsen, H. E., Pesout, T., Porubsky, D., Prins, P., Sibbesen, J. A., Sirén, J., Tomlinson, C., Villani, F., Vollger, M. R., Antonacci-Fulton, L. L., Baid, G., Baker, C. A., Belyaeva, A., Billis, K., Carroll, A., Chang, P. C., Cody, S., Cook, D. E., Cook-Deegan, R. M., Cornejo, O. E., Diekhans, M., Ebert, P., Fairley, S., Fedrigo, O., Felsenfeld, A. L., Formenti, G., Frankish, A., Gao, Y., Garrison, N. A., Giron, C. G., Green, R. E., Haggerty, L., Hoekzema, K., Hourlier, T., Ji, H. P., Kenny, E. E., Koenig, B. A., Kolesnikov, A., Korbel, J. O., Kordosky, J., Koren, S., Lee, H., Lewis, A. P., Magalhães, H., Marco-Sola, S., Marijon, P., McCartney, A., McDaniel, J., Mountcastle, J., Nattestad, M., Nurk, S., Olson, N. D., Popejoy, A. B., Puiu, D., Rautiainen, M., Regier, A. A., Rhie, A., Sacco, S., Sanders, A. D., Schneider, V. A., Schultz, B. I., Shafin, K., Smith, M. W., Sofia, H. J., Abou Tayoun, A. N., Thibaud-Nissen, F., Tricomi, F. F., Wagner, J., Walenz, B., Wood, J. M., Zimin, A. V., Bourque, G., Chaisson, M. J., Flicek, P., Phillippy, A. M., Zook, J. M., Eichler, E. E., Haussler, D., Wang, T., Jarvis, E. D., Miga, K. H., Garrison, E., Marschall, T., Hall, I. M., Li, H., Paten, B. 2023; 617 (7960): 312-324

Abstract

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.

View details for DOI 10.1038/s41586-023-05896-x

View details for PubMedID 37165242

View details for PubMedCentralID PMC10172123
Single cell and spatial alternative splicing analysis with long read sequencing. Research square Fu, Y., Kim, H., Adams, J. I., Grimes, S. M., Huang, S., Lau, B. T., Sathe, A., Hess, P., Ji, H. P., Zhang, N. R. 2023

Abstract

Long-read sequencing has become a powerful tool for alternative splicing analysis. However, technical and computational challenges have limited our ability to explore alternative splicing at single cell and spatial resolution. The higher sequencing error of long reads, especially high indel rates, has limited the accuracy of cell barcode and unique molecular identifier (UMI) recovery. Read truncation and mapping errors, the latter exacerbated by the higher sequencing error rates, can cause the false detection of spurious new isoforms. Downstream, there is yet no rigorous statistical framework to quantify splicing variation within and between cells/spots. In light of these challenges, we developed Longcell, a statistical framework and computational pipeline for accurate isoform quantification for single cell and spatial spot barcoded long read sequencing data. Longcell performs computationally efficient cell/spot barcode extraction, UMI recovery, and UMI-based truncation- and mapping-error correction. Through a statistical model that accounts for varying read coverage across cells/spots, Longcell rigorously quantifies the level of inter-cell/spot versus intra-cell/spot diversity in exon-usage and detects changes in splicing distributions between cell populations. Applying Longcell to single cell long-read data from multiple contexts, we found that intra-cell splicing heterogeneity, where multiple isoforms co-exist within the same cell, is ubiquitous for highly expressed genes. On matched single cell and Visium long read sequencing for a tissue of colorectal cancer metastasis to the liver, Longcell found concordant signals between the two data modalities. Finally, on a perturbation experiment for 9 splicing factors, Longcell identified regulatory targets that are validated by targeted sequencing.

View details for DOI 10.21203/rs.3.rs-2674892/v1

View details for PubMedID 36993612

View details for PubMedCentralID PMC10055662
GITR and TIGIT immunotherapy provokes divergent multi-cellular responses in the tumor microenvironment of gastrointestinal cancers. bioRxiv : the preprint server for biology Sathe, A., Ayala, C., Bai, X., Grimes, S. M., Lee, B., Kin, C., Shelton, A., Poultsides, G., Ji, H. P. 2023

Abstract

Understanding the cellular mechanisms of novel immunotherapy agents in the human tumor microenvironment (TME) is critical to their clinical success. We examined GITR and TIGIT immunotherapy in gastric and colon cancer patients using ex vivo tumor slice cultures derived from cancer surgical resections. This primary culture system maintains the original TME in a near-native state. We applied paired single-cell RNA and TCR sequencing to identify cell type specific transcriptional reprogramming. The GITR agonist was limited to increasing effector gene expression only in cytotoxic CD8 T cells. The TIGIT antagonist increased TCR signaling and activated both cytotoxic and dysfunctional CD8 T cells, including clonotypes indicative of potential tumor antigen reactivity. The TIGIT antagonist also activated T follicular helper-like cells and dendritic cells, and reduced markers of immunosuppression in regulatory T cells. Overall, we identified cellular mechanisms of action of these two immunotherapy targets in the patients' TME.

View details for DOI 10.1101/2023.03.13.532299

View details for PubMedID 36993756

View details for PubMedCentralID PMC10054933
Single Cell Transcriptomic Analysis of Human Extra- and Intra-Hepatic Cholangiocarcinoma Ayala, C. I., Sathe, A., Grimes, S., Bae, X., Dua, M., Poultsides, G., Visser, B., Ji, H. SPRINGER. 2023: S177-S178

View details for Web of Science ID 001046841200386
The Gastric Cancer Registry Genome Explorer: A tool for genomic discovery. Almeda, A., Grimes, S. M., Shin, G., Lee, H., Wichmann, I., Greer, S., Ji, H. P. LIPPINCOTT WILLIAMS & WILKINS. 2023: 434

View details for Web of Science ID 001093994600500
Tumor-associated microbiome features of metastatic colorectal cancer and clinical implications. Frontiers in oncology An, H. J., Partha, M. A., Lee, H., Lau, B. T., Pavlichin, D. S., Almeda, A., Hooker, A. C., Shin, G., Ji, H. P. 2023; 13: 1310054

Abstract

Colon microbiome composition contributes to the pathogenesis of colorectal cancer (CRC) and prognosis. We analyzed 16S rRNA sequencing data from tumor samples of patients with metastatic CRC and determined the clinical implications. We enrolled 133 patients with metastatic CRC at St. Vincent Hospital in Korea. The V3-V4 regions of the 16S rRNA gene from the tumor DNA were amplified, sequenced on an Illumina MiSeq, and analyzed using the DADA2 package. After excluding samples that retained <5% of the total reads after merging, 120 samples were analyzed. The median age of patients was 63 years (range, 34-82 years), and 76 patients (63.3%) were male. The primary cancer sites were the right colon (27.5%), left colon (30.8%), and rectum (41.7%). All subjects received 5-fluorouracil-based systemic chemotherapy. After removing genera with <1% of the total reads in each patient, 523 genera were identified. Rectal origin, high CEA level (≥10 ng/mL), and presence of lung metastasis showed higher richness. Survival analysis revealed that the presence of Prevotella (p = 0.052), Fusobacterium (p = 0.002), Selenomonas (p<0.001), Fretibacterium (p = 0.001), Porphyromonas (p = 0.007), Peptostreptococcus (p = 0.002), and Leptotrichia (p = 0.003) were associated with short overall survival (OS, <24 months), while the presence of Sphingomonas was associated with long OS (p = 0.070). From the multivariate analysis, the presence of Selenomonas (hazard ratio [HR], 6.35; 95% confidence interval [CI], 2.38-16.97; p<0.001) was associated with poor prognosis along with high CEA level. Tumor microbiome features may be useful prognostic biomarkers for metastatic CRC.

View details for DOI 10.3389/fonc.2023.1310054

View details for PubMedID 38304032

View details for PubMedCentralID PMC10833227
Large Cancer Pedigree Involving Multiple Cancer Genes including Likely Digenic MSH2 and MSH6 Lynch Syndrome (LS) and an Instance of Recombinational Rescue from LS. Cancers Vogelaar, I. P., Greer, S., Wang, F., Shin, G., Lau, B., Hu, Y., Haraldsdottir, S., Alvarez, R., Hazelett, D., Nguyen, P., Aguirre, F. P., Guindi, M., Hendifar, A., Balcom, J., Leininger, A., Fairbank, B., Ji, H., Hitchins, M. P. 2022; 15 (1)

Abstract

Lynch syndrome (LS), caused by heterozygous pathogenic variants affecting one of the mismatch repair (MMR) genes (MSH2, MLH1, MSH6, PMS2), confers moderate to high risks for colorectal, endometrial, and other cancers. We describe a four-generation, 13-branched pedigree in which multiple LS branches carry the MSH2 pathogenic variant c.2006G>T (p.Gly669Val), one branch has this and an additional novel MSH6 variant c.3936_4001+8dup (intronic), and other non-LS branches carry variants within other cancer-relevant genes (NBN, MC1R, PTPRJ). Both MSH2 c.2006G>T and MSH6 c.3936_4001+8dup caused aberrant RNA splicing in carriers, including out-of-frame exon-skipping, providing functional evidence of their pathogenicity. MSH2 and MSH6 are co-located on Chr2p21, but the two variants segregated independently (mapped in trans) within the digenic branch, with carriers of either or both variants. Thus, MSH2 c.2006G>T and MSH6 c.3936_4001+8dup independently confer LS with differing cancer risks among family members in the same branch. Carriers of both variants have near 100% risk of transmitting either one to offspring. Nevertheless, a female carrier of both variants did not transmit either to one son, due to a germline recombination within the intervening region. Genetic diagnosis, risk stratification, and counseling for cancer and inheritance were highly individualized in this family. The finding of multiple cancer-associated variants in this pedigree illustrates a need to consider offering multicancer gene panel testing, as opposed to targeted cascade testing, as additional cancer variants may be uncovered in relatives.

View details for DOI 10.3390/cancers15010228

View details for PubMedID 36612224
Activating Immune Effectors and Dampening Immune Suppressors Generates Successful Therapeutic Cancer Vaccination in Patients with Lymphoma Shree, T., Haebe, S., Czerwinski, D. K., Eckhert, E., Day, G., Sathe, A., Grimes, S. M., Frank, M. J., Maeda, L. S., Alizadeh, A. A., Advani, R. H., Hoppe, R., Long, S. R., Martin, B., Ozawa, M. G., Khodadoust, M. S., Ji, H. P., Levy, R. AMER SOC HEMATOLOGY. 2022: 6450-6451

View details for DOI 10.1182/blood-2022-167469

View details for Web of Science ID 000893223206208
Prevalence of Acquired N-Glycosylation Sites at the Single Cell Level in Follicular Lymphoma Haebe, S., Shree, T., Day, G., Czerwinski, D. K., Sathe, A., Grimes, S. M., Long, S. R., Martin, B., Ozawa, M. G., Ji, H. P., Levy, R. AMER SOC HEMATOLOGY. 2022: 9211-9212

View details for DOI 10.1182/blood-2022-164763

View details for Web of Science ID 000893230302097
Colorectal cancer metastases in the liver establish immunosuppressive spatial networking between tumor associated SPP1+ macrophages and fibroblasts. Clinical cancer research : an official journal of the American Association for Cancer Research Sathe, A., Mason, K., Grimes, S. M., Zhou, Z., Lau, B. T., Bai, X., Su, A., Tan, X., Lee, H., Suarez, C. J., Nguyen, Q., Poultsides, G., Zhang, N. R., Ji, H. P. 2022

Abstract

The liver is the most frequent metastatic site for colorectal cancer (CRC). Its microenvironment is modified to provide a niche that is conducive for CRC cell growth. This study focused on characterizing the cellular changes in the metastatic CRC (mCRC) liver tumor microenvironment (TME). We analyzed a series of microsatellite stable (MSS) mCRCs to the liver, paired normal liver tissue and peripheral blood mononuclear cells using single cell RNA-seq (scRNA-seq). We validated our findings using multiplexed spatial imaging and bulk gene expression with cell deconvolution. We identified TME-specific SPP1-expressing macrophages with altered metabolism features, foam cell characteristics and increased activity in extracellular matrix (ECM) organization. SPP1+ macrophages and fibroblasts expressed complementary ligand receptor pairs with the potential to mutually influence their gene expression programs. TME lacked dysfunctional CD8 T cells and contained regulatory T cells, indicative of immunosuppression. Spatial imaging validated these cell states in the TME. Moreover, TME macrophages and fibroblasts had close spatial proximity, which is a requirement for intercellular communication and networking. In an independent cohort of mCRCs in the liver, we confirmed the presence of SPP1+ macrophages and fibroblasts using gene expression data. An increased proportion of TME fibroblasts was associated with a worse prognosis in these patients. We demonstrated that mCRC in the liver is characterized by transcriptional alterations of macrophages in the TME. Intercellular networking between macrophages and fibroblasts supports CRC growth in the immunosuppressed metastatic niche in the liver. These features can be used to target immune checkpoint resistant MSS tumors.

View details for DOI 10.1158/1078-0432.CCR-22-2041

View details for PubMedID 36239989
RESOLVING THE EXACT BREAKPOINTS AND SEQUENCE REARRANGEMENTS OF LARGE NEUROPSYCHIATRIC COPY NUMBER VARIATIONS (CNVS) AT SINGLE BASE-PAIR RESOLUTION USING CRISPR-TARGETED ULTRALONG READ SEQUENCING (CTLR-SEQ) Zhou, B., Shin, G., Vervoort, L., Greer, S., Huang, Y., Roychowdhury, T., Pattni, R., Abyzov, A., Vermeesch, J., Ji, H., Urban, A. ELSEVIER. 2022: E88-E89

View details for DOI 10.1016/j.euroneuro.2022.07.166

View details for Web of Science ID 000886075100051
Predictive Model to Guide Brain Magnetic Resonance Imaging Surveillance in Patients With Metastatic Lung Cancer: Impact on Real-World Outcomes. JCO precision oncology Wu, J., Ding, V., Luo, S., Choi, E., Hellyer, J., Myall, N., Henry, S., Wood, D., Stehr, H., Ji, H., Nagpal, S., Hayden Gephart, M., Wakelee, H., Neal, J., Han, S. S. 2022; 6: e2200220

Abstract

Brain metastasis is common in lung cancer, and treatment of brain metastasis can lead to significant morbidity. Although early detection of brain metastasis may improve outcomes, there are no prediction models to identify high-risk patients for brain magnetic resonance imaging (MRI) surveillance. Our goal is to develop a machine learning-based clinicogenomic prediction model to estimate patient-level brain metastasis risk. A penalized regression competing risk model was developed using 330 patients diagnosed with lung cancer between January 2014 and June 2019 and followed through June 2021 at Stanford HealthCare. The main outcome was time from the diagnosis of distant metastatic disease to the development of brain metastasis, death, or censoring. Among the 330 patients, 84 (25%) developed brain metastasis over 627 person-years, with a 1-year cumulative brain metastasis incidence of 10.2% (95% CI, 6.8 to 13.6). Features selected for model inclusion were histology, cancer stage, age at diagnosis, primary site, and RB1 and ALK alterations. The prediction model yielded high discrimination (area under the curve 0.75). When the cohort was stratified by risk using a 1-year risk threshold of > 14.2% (85th percentile), the high-risk group had increased 1-year cumulative incidence of brain metastasis versus the low-risk group (30.8% v 6.1%, P < .01). Of 48 high-risk patients, 24 developed brain metastasis, and of these, 12 patients had brain metastasis detected more than 7 months after last brain MRI. Patients who missed this 7-month window had larger brain metastases (58% v 33% largest diameter > 10 mm; odds ratio, 2.80, CI, 0.51 to 13) versus those who had MRIs more frequently. The proposed model can identify high-risk patients, who may benefit from more intensive brain MRI surveillance to reduce morbidity of subsequent treatment through early detection.

View details for DOI 10.1200/PO.22.00220

View details for PubMedID 36201713
Exploratory genomic analysis of high grade neuroendocrine neoplasms across diverse primary sites. Endocrine-related cancer Sun, T. Y., Zhao, L., Van Hummelen, P., Martin, B., Hornbacker, K., Lee, H., Xia, L. C., Padda, S. K., Ji, H. P., Kunz, P. 2022

Abstract

High grade (grade 3) neuroendocrine neoplasms (G3 NENs) have poor survival outcomes. From a clinical standpoint, G3 NENs are usually grouped regardless of primary site and treated similarly. Little is known regarding the underlying genomics of these rare tumors, especially when compared across different primary sites. We performed whole transcriptome (n = 46), whole exome (n = 40) and gene copy number (n = 43) sequencing on G3 NEN FFPE samples from diverse organs (in total 17 were lung, 16 were gastroenteropancreatic, 13 other). G3 NENs despite arising from diverse primary sites did not have gene expression profiles that were easily segregated by organ of origin. Across all G3 NENs, TP53, APC, RB1 and CDKN2A were significantly mutated. The CDK4/6 cell cycling pathway was mutated in 95% of cases, with upregulation of oncogenes within this pathway. G3 NENs had high tumor mutation burden (mean 7.09 mutations/MB), with 20% having >10 mutations/MB. Two somatic copy number alterations were significantly associated with worse prognosis across tissue types: focal deletion 22q13.31 (HR, 7.82; p = 0.034) and arm amplification 19q (HR, 4.82; p = 0.032). This study is among the most diverse genomic studies of high-grade neuroendocrine neoplasms. We uncovered genomic features previously unrecognized for this rapidly fatal and rare cancer type that could have potential prognostic and therapeutic implications.

View details for DOI 10.1530/ERC-22-0015

View details for PubMedID 36165930
The Gastric Cancer Registry: A Genomic Translational Resource for Multidisciplinary Research in Gastric Cancer. Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology Almeda, A. F., Grimes, S. M., Lee, H., Greer, S., Shin, G., McNamara, M., Hooker, A. C., Arce, M. M., Kubit, M., Schauer, M. C., Van Hummelen, P., Ma, C., Mills, M. A., Huang, R. J., Hwang, J. H., Amieva, M. R., Han, S. S., Ford, J. M., Ji, H. P. 2022

Abstract

Gastric cancer (GC) is a leading cause of cancer morbidity and mortality. Developing information systems which integrate clinical and genomic data may accelerate discoveries to improve cancer prevention, detection, and treatment. To support translational research in GC, we developed the GC Registry (GCR), a North American repository of clinical and cancer genomics data. Participants self-enrolled online. Entry criteria into the GCR included the following: (1) diagnosis of GC, (2) history of GC in a first- or second-degree relative, or (3) known germline mutation in the gene CDH1. Participants provided demographic and clinical information through a detailed survey. Some participants provided specimens of saliva and tumor samples. Tumor samples underwent exome sequencing, whole genome sequencing and transcriptome sequencing. From 2011-2021, 567 individuals registered and returned the clinical questionnaire. For this cohort 65% had a personal history of GC, 36% reported a family history of GC and 14% had a germline CDH1 mutation. 89 GC patients provided tumor samples. For the initial study, 41 tumors were sequenced using next generation sequencing. The data was analyzed for cancer mutations, copy number variations, gene expression, microbiome, neoantigens, immune infiltrates, and other features. We developed a searchable, web-based interface (the GCR Genome Explorer) to enable researchers access to these datasets. The GCR is a unique, North American GC registry which integrates clinical and genomic annotation. Available for researchers through an open access, web-based explorer, the GCR Genome Explorer will accelerate collaborative GC research across the United States and world.

View details for DOI 10.1158/1055-9965.EPI-22-0308

View details for PubMedID 35771165
Germline variants of ATG7 in familial cholangiocarcinoma alter autophagy and p62. Scientific reports Greer, S. U., Chen, J., Ogmundsdottir, M. H., Ayala, C., Lau, B. T., Delacruz, R. G., Sandoval, I. T., Kristjansdottir, S., Jones, D. A., Haslem, D. S., Romero, R., Fulde, G., Bell, J. M., Jonasson, J. G., Steingrimsson, E., Ji, H. P., Nadauld, L. D. 2022; 12 (1): 10333

Abstract

Autophagy is a housekeeping mechanism tasked with eliminating misfolded proteins and damaged organelles to maintain cellular homeostasis. Autophagy deficiency results in increased oxidative stress, DNA damage and chronic cellular injury. Among the core genes in the autophagy machinery, ATG7 is required for autophagy initiation and autophagosome formation. Based on the analysis of an extended pedigree of familial cholangiocarcinoma, we determined that all affected family members had a novel germline mutation (c.2000C>T p.Arg659* (p.R659*)) in ATG7. Somatic deletions of ATG7 were identified in the tumors of affected individuals. We applied linked-read sequencing to one tumor sample and demonstrated that the ATG7 somatic deletion and germline mutation were located on distinct alleles, resulting in two hits to ATG7. From a parallel population genetic study, we identified a germline polymorphism of ATG7 (c.1591C>G p.Asp522Glu (p.D522E)) associated with increased risk of cholangiocarcinoma. To characterize the impact of these germline ATG7 variants on autophagy activity, we developed an ATG7-null cell line derived from the human bile duct. The mutant p.R659* ATG7 protein lacked the ability to lipidate its LC3 substrate, leading to complete loss of autophagy and increased p62 levels. Our findings indicate that germline ATG7 variants have the potential to impact autophagy function with implications for cholangiocarcinoma development.

View details for DOI 10.1038/s41598-022-13569-4

View details for PubMedID 35725745
Reconstructing the spatial evolution of cancer through subclone detection on copy number profiles in tumor sequencing data. Wu, C., Hess, P. R., Sathe, A., Rong, J., Lau, B. T., Grimes, S. M., Ji, H. P., Zhang, N. R. AMER ASSOC CANCER RESEARCH. 2022

View details for Web of Science ID 000892509502259
A single-cell solution for solid tumors to detect mutations and quantify copy number variations. Wu, C., Hess, P. R., Sathe, A., Rong, J., Lau, B. T., Grimes, S. M., Ji, H. P., Zhang, N. R. AMER ASSOC CANCER RESEARCH. 2022

View details for Web of Science ID 000892509502260
Reconstructing the spatial evolution of cancer through subclone detection on copy number profiles in tumor sequencing data Wu, C., Hess, P. R., Sathe, A., Rong, J., Lau, B. T., Grimes, S. M., Ji, H. P., Zhang, N. R. AMER ASSOC CANCER RESEARCH. 2022

View details for Web of Science ID 000892509500207
ALTEN: A High-Fidelity Primary Tissue-Engineering Platform to Assess Cellular Responses Ex Vivo. Advanced science (Weinheim, Baden-Wurttemberg, Germany) Law, A. M., Chen, J., Colino-Sanguino, Y., Fuente, L. R., Fang, G., Grimes, S. M., Lu, H., Huang, R. J., Boyle, S. T., Venhuizen, J., Castillo, L., Tavakoli, J., Skhinas, J. N., Millar, E. K., Beretov, J., Rossello, F. J., Tipper, J. L., Ormandy, C. J., Samuel, M. S., Cox, T. R., Martelotto, L., Jin, D., Valdes-Mora, F., Ji, H. P., Gallego-Ortega, D. 2022: e2103332

Abstract

To fully investigate cellular responses to stimuli and perturbations within tissues, it is essential to replicate the complex molecular interactions within the local microenvironment of cellular niches. Here, the authors introduce Alginate-based tissue engineering (ALTEN), a biomimetic tissue platform that allows ex vivo analysis of explanted tissue biopsies. This method preserves the original characteristics of the source tissue's cellular milieu, allowing multiple and diverse cell types to be maintained over an extended period of time. As a result, ALTEN enables rapid and faithful characterization of perturbations across specific cell types within a tissue. Importantly, using single-cell genomics, this approach provides integrated cellular responses at the resolution of individual cells. ALTEN is a powerful tool for the analysis of cellular responses upon exposure to cytotoxic agents and immunomodulators. Additionally, ALTEN's scalability using automated microfluidic devices for tissue encapsulation and subsequent transport, to enable centralized high-throughput analysis of samples gathered by large-scale multicenter studies, is shown.

View details for DOI 10.1002/advs.202103332

View details for PubMedID 35611998
Mucinous Epithelial Cell Secretion Drives Mucinous Ascites Formation in Pseudomyxoma Peritonei Patients Ayala, C., Sathe, A., Grimes, S., Zhao, L., Bai, X., Poultsides, G., Lee, B., Ji, H. SPRINGER. 2022: 520-521

View details for Web of Science ID 000789811800437
KmerKeys: a web resource for searching indexed genome assemblies and variants. Nucleic acids research Pavlichin, D. S., Lee, H., Greer, S. U., Grimes, S. M., Weissman, T., Ji, H. P. 2022

Abstract

K-mers are short DNA sequences that are used for genome sequence analysis. Applications that use k-mers include genome assembly and alignment. However, the wider bioinformatic use of these short sequences has challenges related to the massive scale of genomic sequence data. A single human genome assembly has billions of k-mers. As a result, the computational requirements for analyzing k-mer information are enormous, particularly when involving complete genome assemblies. To address these issues, we developed a new indexing data structure based on a hash table tuned for the lookup of short sequence keys. This web application, referred to as KmerKeys, provides performant, rapid query speeds for cloud computation on genome assemblies. We enable fuzzy as well as exact sequence searches of assemblies. To enable robust and speedy performance, the website implements cache-friendly hash tables, memory mapping and massive parallel processing. Our method employs a scalable and efficient data structure that can be used to jointly index and search a large collection of human genome assembly information. One can include variant databases and their associated metadata such as the gnomAD population variant catalogue. This feature enables the incorporation of future genomic information into sequencing analysis. KmerKeys is freely accessible at https://kmerkeys.dgi-stanford.org.
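
As an illustration of the idea in the abstract above, the following is a minimal sketch of a hash-table k-mer index with exact and fuzzy lookup. It is not the KmerKeys implementation: the function names (`build_kmer_index`, `exact_search`, `fuzzy_search`), the toy sequences, and the choice of "fuzzy" as at most one substitution are all assumptions made for the example, and a plain Python dictionary stands in for the cache-friendly, memory-mapped tables the abstract describes.

```python
from collections import defaultdict
from itertools import product


def build_kmer_index(sequences, k):
    """Map every k-mer to the (sequence name, 0-based offset) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in sequences.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def exact_search(index, kmer):
    """Exact lookup: a single hash-table probe."""
    return list(index.get(kmer.upper(), []))


def fuzzy_search(index, kmer, alphabet="ACGT"):
    """Lookup allowing at most one substitution (an assumed definition of 'fuzzy')."""
    kmer = kmer.upper()
    hits = {kmer: exact_search(index, kmer)} if kmer in index else {}
    for pos, base in product(range(len(kmer)), alphabet):
        if base == kmer[pos]:
            continue
        # Enumerate each single-substitution neighbor and probe the table for it.
        neighbor = kmer[:pos] + base + kmer[pos + 1:]
        if neighbor in index:
            hits[neighbor] = list(index[neighbor])
    return hits


# Hypothetical toy "assemblies"; real inputs would be whole genome assemblies.
toy_assemblies = {"asm1": "ACGTACGA", "asm2": "TTACGTTT"}
toy_index = build_kmer_index(toy_assemblies, k=4)
print(exact_search(toy_index, "ACGT"))
print(sorted(fuzzy_search(toy_index, "ACGT")))
```

Enumerating neighbors keeps the fuzzy query at a fixed number of hash probes per k-mer (3k for one substitution over a four-letter alphabet) instead of scanning the index, which is one plausible reason a hash table suits short sequence keys.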

View details for DOI 10.1093/nar/gkac266

View details for PubMedID 35474383
The Human Pangenome Project: a global resource to map genomic diversity. Nature Wang, T., Antonacci-Fulton, L., Howe, K., Lawson, H. A., Lucas, J. K., Phillippy, A. M., Popejoy, A. B., Asri, M., Carson, C., Chaisson, M. J., Chang, X., Cook-Deegan, R., Felsenfeld, A. L., Fulton, R. S., Garrison, E. P., Garrison, N. A., Graves-Lindsay, T. A., Ji, H., Kenny, E. E., Koenig, B. A., Li, D., Marschall, T., McMichael, J. F., Novak, A. M., Purushotham, D., Schneider, V. A., Schultz, B. I., Smith, M. W., Sofia, H. J., Weissman, T., Flicek, P., Li, H., Miga, K. H., Paten, B., Jarvis, E. D., Hall, I. M., Eichler, E. E., Haussler, D., Human Pangenome Reference Consortium 2022; 604 (7906): 437-446

Abstract

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.

View details for DOI 10.1038/s41586-022-04601-8

View details for PubMedID 35444317
A deep learning model for molecular label transfer that enables cancer cell identification from histopathology images. NPJ precision oncology Su, A., Lee, H., Tan, X., Suarez, C. J., Andor, N., Nguyen, Q., Ji, H. P. 2022; 6 (1): 14

Abstract

Deep-learning classification systems have the potential to improve cancer diagnosis. However, development of these computational approaches so far depends on prior pathological annotations and large training datasets. The manual annotation is low-resolution, time-consuming, highly variable and subject to observer variance. To address this issue, we developed a method, H&E Molecular neural network (HEMnet). HEMnet utilizes immunohistochemistry as an initial molecular label for cancer cells on a H&E image and trains a cancer classifier on the overlapping clinical histopathological images. Using this molecular transfer method, HEMnet successfully generated and labeled 21,939 tumor and 8782 normal tiles from ten whole-slide images for model training. After building the model, HEMnet accurately identified colorectal cancer regions, which achieved 0.84 and 0.73 of ROC AUC values compared to p53 staining and pathological annotations, respectively. Our validation study using histopathology images from TCGA samples accurately estimated tumor purity, which showed a significant correlation (regression coefficient of 0.8) with the estimation based on genomic sequencing data. Thus, HEMnet contributes to addressing two main challenges in cancer deep-learning analysis, namely the need to have a large number of images for training and the dependence on manual labeling by a pathologist. HEMnet also predicts cancer cells at a much higher resolution compared to manual histopathologic evaluation. Overall, our method provides a path towards a fully automated delineation of any type of tumor so long as there is a cancer-oriented molecular stain available for subsequent learning. Software, tutorials and interactive tools are available at: https://github.com/BiomedicalMachineLearning/HEMnet.

View details for DOI 10.1038/s41698-022-00252-0

View details for PubMedID 35236916
Analysis of 16S rRNA sequencing in advanced colorectal cancer tissue samples An, H., Partha, M. A., Lee, H., Lau, B., Shin, G., Almeda, A., Ji, H. P. LIPPINCOTT WILLIAMS & WILKINS. 2022

View details for DOI 10.1200/JCO.2022.40.4_suppl.163

View details for Web of Science ID 000770995900159
Single-cell characterization of CRISPR-modified transcript isoforms with nanopore sequencing. Genome biology Kim, H. S., Grimes, S. M., Hooker, A. C., Lau, B. T., Ji, H. P. 2021; 22 (1): 331

Abstract

We developed a single-cell approach to detect CRISPR-modified mRNA transcript structures. This method assesses how genetic variants at splicing sites and splicing factors contribute to alternative mRNA isoforms. We determine how alternative splicing is regulated by editing target exon-intron segments or splicing factors by CRISPR-Cas9 and their consequences on transcriptome profile. Our method combines long-read sequencing to characterize the transcript structure and short-read sequencing to match the single-cell gene expression profiles and gRNA sequence and therefore provides targeted genomic edits and transcript isoform structure detection at single-cell resolution.

View details for DOI 10.1186/s13059-021-02554-1

View details for PubMedID 34872615
Characterization of the consensus mucosal microbiome of colorectal cancer. NAR cancer Zhao, L., Grimes, S. M., Greer, S. U., Kubit, M., Lee, H., Nadauld, L. D., Ji, H. P. 2021; 3 (4): zcab049

Abstract

Dysbiosis is an imbalance of an organ's microbiome and plays a role in colorectal cancer pathogenesis. Characterizing the bacteria in the microenvironment of a cancer through genome sequencing has advantages compared to culture-based profiling. However, there are notable technical and analytical challenges in characterizing universal features of tumor microbiomes. Colorectal tumors demonstrate microbiome variation among different studies and across individual patients. To address these issues, we conducted a computational study to determine a consensus microbiome for colorectal cancer, analyzing 924 tumors from eight independent RNA-Seq data sets. A standardized meta-transcriptomic analysis pipeline was established with quality control metrics. Microbiome profiles across different cohorts were compared and recurrently altered microbial shifts specific to colorectal cancer were determined. We identified a cancer-specific set of 114 microbial species associated with tumors that were found among all investigated studies. Firmicutes, Bacteroidetes, Proteobacteria and Actinobacteria were the four most abundant phyla for the colorectal cancer microbiome. Member species of Clostridia were depleted and Fusobacterium nucleatum was one of the most enriched bacterial species in tumors. Associations between the consensus species and specific immune cell types were noted. Our results are available as a web data resource for other researchers to explore (https://crc-microbiome.stanford.edu).

View details for DOI 10.1093/narcan/zcab049

View details for PubMedID 34988460
In Situ Vaccination Induces Changes in Follicular Lymphoma Tumor Cells That Correlate with Abscopal Clinical Regressions Haebe, S., Shree, T., Day, G., Sathe, A., Czerwinski, D. K., Grimes, S. M., Long, S. R., Martin, B., Hoppe, R., Ji, H. P., Levy, R. AMER SOC HEMATOLOGY. 2021

View details for DOI 10.1182/blood-2021-154115

View details for Web of Science ID 000736413901138
Therapeutic and Immunologic Responses Elicited By in Situ Vaccination with CpG, Ibrutinib, and Low-Dose Radiation Shree, T., Haebe, S., Czerwinski, D. K., Day, G., Sathe, A., Khodadoust, M. S., Frank, M. J., Beygi, S., Hoppe, R., Long, S. R., Martin, B., Ji, H. P., Levy, R. AMER SOC HEMATOLOGY. 2021

View details for DOI 10.1182/blood-2021-154017

View details for Web of Science ID 000736413906106
Single-Cell Transcriptomic Analysis of a Patient with Metastatic Appendiceal Adenocarcinoma: A Stem or Crypt Cell-Like Neoplasm? Ayala, C., Grimes, S. M., Lee, B., Ji, H. ELSEVIER SCIENCE INC. 2021: S240-S241

View details for Web of Science ID 000718303100456
A Predictive Model to Guide Brain MRI Surveillance in Patients With Metastatic Lung Cancer: Impact on Real World Outcomes Wu, J., Ding, V., Luo, S., Choi, E., Hellyer, J., Myall, N., Henry, S., Wood, D., Stehr, H., Ji, H., Nagpal, S., Gephart, M., Wakelee, H., Neal, J., Han, S. ELSEVIER SCIENCE INC. 2021: S1177

View details for Web of Science ID 000709606500645
Profiling diverse sequence tandem repeats in colorectal cancer reveals co-occurrence of microsatellite and chromosomal instability involving Chromosome 8. Genome medicine Shin, G., Greer, S. U., Hopmans, E., Grimes, S. M., Lee, H., Zhao, L., Miotke, L., Suarez, C., Almeda, A. F., Haraldsdottir, S., Ji, H. P. 2021; 13 (1): 145

Abstract

We developed a sensitive sequencing approach that simultaneously profiles microsatellite instability, chromosomal instability, and subclonal structure in cancer. We assessed diverse repeat motifs across 225 microsatellites on colorectal carcinomas. Our study identified elevated alterations at both selected tetranucleotide and conventional mononucleotide repeats. Many colorectal carcinomas had a mix of genomic instability states that are normally considered exclusive. An MSH3 mutation may have contributed to the mixed states. Increased copy number of chromosome arm 8q was most prevalent among tumors with microsatellite instability, including a case of translocation involving 8q. Subclonal analysis identified co-occurring driver mutations previously known to be exclusive.

View details for DOI 10.1186/s13073-021-00958-z

View details for PubMedID 34488871
Patient-derived ex vivo TME-models and single-cell sequencing reveal transcriptional responses to immunotherapy. Sathe, A., Chen, J., Grimes, S. M., Ayala, C. I., Poultsides, G., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2021

View details for Web of Science ID 000680263504448
New Approaches to Moderate CRISPR-Cas9 Activity: Addressing Issues of Cellular Uptake and Endosomal Escape. Molecular therapy : the journal of the American Society of Gene Therapy van Hees, M., Slott, S., Hansen, A. H., Kim, H. S., Ji, H. P., Astakhova, K. 2021

Abstract

CRISPR-Cas9 is rapidly entering molecular biology and biomedicine as a promising gene-editing tool. A unique feature of CRISPR-Cas9 is a single guide RNA directing a Cas9 nuclease towards its genomic target. Herein, we highlight new approaches for improving cellular uptake and endosomal escape of CRISPR-Cas9. As opposed to other recently published works, this review is focused on non-viral carriers as a means to facilitate the cellular uptake of CRISPR-Cas9 through endocytosis. The majority of non-viral carriers, such as gold nanoparticles, polymer nanoparticles, lipid nanoparticles and nanoscale zeolitic imidazole frameworks, are developed with a focus towards optimizing the endosomal escape of CRISPR-Cas9 by taking advantage of the acidic environment in the late endosomes. Among the most broadly used methods for in vitro and ex vivo ribonucleotide protein transfection are electroporation and microinjection. Thus, other delivery formats are warranted for in vivo delivery of CRISPR-Cas9. Herein, we specifically revise the use of peptide and nanoparticle-based systems as platforms for CRISPR-Cas9 delivery in vivo. Finally, we highlight future perspectives of the CRISPR-Cas9 gene-editing tool and the prospects of using non-viral vectors to improve its bioavailability and therapeutic potential.

View details for DOI 10.1016/j.ymthe.2021.06.003

View details for PubMedID 34091053
Integrative single-cell analysis of allele-specific copy number alterations and chromatin accessibility in cancer. Nature biotechnology Wu, C., Lau, B. T., Kim, H. S., Sathe, A., Grimes, S. M., Ji, H. P., Zhang, N. R. 2021

Abstract

Cancer progression is driven by both somatic copy number aberrations (CNAs) and chromatin remodeling, yet little is known about the interplay between these two classes of events in shaping the clonal diversity of cancers. We present Alleloscope, a method for allele-specific copy number estimation that can be applied to single-cell DNA- and/or transposase-accessible chromatin-sequencing (scDNA-seq, ATAC-seq) data, enabling combined analysis of allele-specific copy number and chromatin accessibility. On scDNA-seq data from gastric, colorectal and breast cancer samples, with validation using matched linked-read sequencing, Alleloscope finds pervasive occurrence of highly complex, multiallelic CNAs, in which cells that carry varying allelic configurations adding to the same total copy number coevolve within a tumor. On scATAC-seq from two basal cell carcinoma samples and a gastric cancer cell line, Alleloscope detected multiallelic copy number events and copy-neutral loss-of-heterozygosity, enabling dissection of the contributions of chromosomal instability and chromatin remodeling to tumor evolution.

View details for DOI 10.1038/s41587-021-00911-w

View details for PubMedID 34017141
Profiling SARS-CoV-2 mutation fingerprints that range from the viral pangenome to individual infection quasispecies. Genome medicine Lau, B. T., Pavlichin, D., Hooker, A. C., Almeda, A., Shin, G., Chen, J., Sahoo, M. K., Huang, C. H., Pinsky, B. A., Lee, H. J., Ji, H. P. 2021; 13 (1): 62

Abstract

BACKGROUND: The genome of SARS-CoV-2 is susceptible to mutations during viral replication due to the errors generated by RNA-dependent RNA polymerases. These mutations enable the SARS-CoV-2 to evolve into new strains. Viral quasispecies emerge from de novo mutations that occur in individual patients. In combination, these sets of viral mutations provide distinct genetic fingerprints that reveal the patterns of transmission and have utility in contact tracing. METHODS: Leveraging thousands of sequenced SARS-CoV-2 genomes, we performed a viral pangenome analysis to identify conserved genomic sequences. We used a rapid and highly efficient computational approach that relies on k-mers, short tracts of sequence, instead of conventional sequence alignment. Using this method, we annotated viral mutation signatures that were associated with specific strains. Based on these highly conserved viral sequences, we developed a rapid and highly scalable targeted sequencing assay to identify mutations, detect quasispecies variants, and identify mutation signatures from patients. These results were compared to the pangenome genetic fingerprints. RESULTS: We built a k-mer index for thousands of SARS-CoV-2 genomes and identified conserved genomics regions and landscape of mutations across thousands of virus genomes. We delineated mutation profiles spanning common genetic fingerprints (the combination of mutations in a viral assembly) and a combination of mutations that appear in only a small number of patients. We developed a targeted sequencing assay by selecting primers from the conserved viral genome regions to flank frequent mutations. Using a cohort of 100 SARS-CoV-2 clinical samples, we identified genetic fingerprints consisting of strain-specific mutations seen across populations and de novo quasispecies mutations localized to individual infections. We compared the mutation profiles of viral samples undergoing analysis with the features of the pangenome. CONCLUSIONS: We conducted an analysis for viral mutation profiles that provide the basis of genetic fingerprints. Our study linked pangenome analysis with targeted deep sequenced SARS-CoV-2 clinical samples. We identified quasispecies mutations occurring within individual patients and determined their general prevalence when compared to over 70,000 other strains. Analysis of these genetic fingerprints may provide a way of conducting molecular contact tracing.

View details for DOI 10.1186/s13073-021-00882-2

View details for PubMedID 33875001
An expanded universe of cancer targets. Cell Hahn, W. C., Bader, J. S., Braun, T. P., Califano, A., Clemons, P. A., Druker, B. J., Ewald, A. J., Fu, H., Jagu, S., Kemp, C. J., Kim, W., Kuo, C. J., McManus, M., B Mills, G., Mo, X., Sahni, N., Schreiber, S. L., Talamas, J. A., Tamayo, P., Tyner, J. W., Wagner, B. K., Weiss, W. A., Gerhard, D. S., Cancer Target Discovery and Development Network, Dancik, V., Gill, S., Hua, B., Sharifnia, T., Viswanathan, V., Zou, Y., Dela Cruz, F., Kung, A., Stockwell, B., Boehm, J., Dempster, J., Manguso, R., Vazquez, F., Cooper, L. A., Du, Y., Ivanov, A., Lonial, S., Moreno, C. S., Niu, Q., Owonikoko, T., Ramalingam, S., Reyna, M., Zhou, W., Grandori, C., Shmulevich, I., Swisher, E., Cai, J., Chan, I. S., Dunworth, M., Ge, Y., Georgess, D., Grasset, E. M., Henriet, E., Knutsdottir, H., Lerner, M. G., Padmanaban, V., Perrone, M. C., Suhail, Y., Tsehay, Y., Warrier, M., Morrow, Q., Nechiporuk, T., Long, N., Saultz, J., Kaempf, A., Minnier, J., Tognon, C. E., Kurtz, S. E., Agarwal, A., Brown, J., Watanabe-Smith, K., Vu, T. Q., Jacob, T., Yan, Y., Robinson, B., Lind, E. F., Kosaka, Y., Demir, E., Estabrook, J., Grzadkowski, M., Nikolova, O., Chen, K., Deneen, B., Liang, H., Bassik, M. C., Bhattacharya, A., Brennan, K., Curtis, C., Gevaert, O., Ji, H. P., Karlsson, K. A., Karagyozova, K., Lo, Y., Liu, K., Nakano, M., Sathe, A., Smith, A. R., Spees, K., Wong, W. H., Yuki, K., Hangauer, M., Kaufman, D. S., Balmain, A., Bollam, S. R., Chen, W., Fan, Q., Kersten, K., Krummel, M., Li, Y. R., Menard, M., Nasholm, N., Schmidt, C., Serwas, N. K., Yoda, H. 2021; 184 (5): 1142–55

Abstract

The characterization of cancer genomes has provided insight into somatically altered genes across tumors, transformed our understanding of cancer biology, and enabled tailoring of therapeutic strategies. However, the function of most cancer alleles remains mysterious, and many cancer features transcend their genomes. Consequently, tumor genomic characterization does not influence therapy for most patients. Approaches to understand the function and circuitry of cancer genes provide complementary approaches to elucidate both oncogene and non-oncogene dependencies. Emerging work indicates that the diversity of therapeutic targets engendered by non-oncogene dependencies is much larger than the list of recurrently mutated genes. Here we describe a framework for this expanded list of cancer targets, providing novel opportunities for clinical translation.

View details for DOI 10.1016/j.cell.2021.02.020

View details for PubMedID 33667368
Goblet Cell Origins of Human Appendiceal Mucinous Neoplasms and Pseudomyxoma Peritonei Tumors Ayala-Navarro, C., Grimes, S., Sathe, A., Bai, X., Poultsides, G., Lee, B., Ji, H. SPRINGER. 2021: S30–S31

View details for Web of Science ID 000627388000068
Single Cell Analysis Can Define Distinct Evolution of Tumor Sites in Follicular Lymphoma. Blood Haebe, S. E., Shree, T., Sathe, A., Day, G., Czerwinski, D. K., Grimes, S., Lee, H., Binkley, M. S., Long, S. R., Martin, B. A., Ji, H. P., Levy, R. 2021

Abstract

Tumor heterogeneity complicates biomarker development and fosters drug resistance in solid malignancies. In lymphoma, our knowledge of site-to-site heterogeneity and its clinical implications is still limited. Here, we profiled two nodal, synchronously-acquired tumor samples from ten follicular lymphoma patients using single cell RNA, B cell receptor (BCR) and T cell receptor sequencing, and flow cytometry. By following the rapidly mutating tumor immunoglobulin genes, we discovered that BCR subclones were shared between the two tumor sites in some patients, but in many patients the disease had evolved separately with limited tumor cell migration between the sites. Patients exhibiting divergent BCR evolution also exhibited divergent tumor gene expression and cell surface protein profiles. While the overall composition of the tumor microenvironment did not differ significantly between sites, we did detect a specific correlation between site-to-site tumor heterogeneity and T follicular helper (Tfh) cell abundance. We further observed enrichment of particular ligand-receptor pairs between tumor and Tfh cells, including CD40 and CD40LG, and a significant correlation between tumor CD40 expression and Tfh proliferation. Our study may explain discordant responses to systemic therapies, underscores the difficulty of capturing a patient's disease with a single biopsy, and furthers our understanding of tumor-immune networks in follicular lymphoma.

View details for DOI 10.1182/blood.2020009855

View details for PubMedID 33728464
Pepsinogens and Gastrin Demonstrate Low Discrimination for Gastric Precancerous Lesions in a Multi-Ethnic United States Cohort. Clinical gastroenterology and hepatology : the official clinical practice journal of the American Gastroenterological Association Huang, R., Park, S., Shen, J., Longacre, T., Ji, H., Hwang, J. H. 2021

View details for DOI 10.1016/j.cgh.2021.01.009

View details for PubMedID 33434656
Unique k-mer sequences for validating cancer-related substitution, insertion and deletion mutations. NAR cancer Lee, H., Shuaibi, A., Bell, J. M., Pavlichin, D. S., Ji, H. P. 2020; 2 (4): zcaa034

Abstract

Cancer genome sequencing has led to important discoveries such as the identification of cancer genes. However, challenges remain in the analysis of cancer genome sequencing. One significant issue is that mutations identified by multiple variant callers are frequently discordant even when using the same genome sequencing data. For insertion and deletion mutations, oftentimes there is no agreement among different callers. Identifying somatic mutations involves read mapping and variant calling, a complicated process that uses many parameters and model tuning. To validate the identification of true mutations, we developed a method using k-mer sequences. First, we characterized the landscape of unique versus non-unique k-mers in the human genome. Second, we developed a software package, KmerVC, to validate the given somatic mutations from sequencing data. Our program validates the occurrence of a mutation based on a statistically significant difference in the frequency of k-mers with and without a mutation from matched normal and tumor sequences. Third, we tested our method on both simulated and cancer genome sequencing data. Counting k-mers involving mutations effectively validated true positive mutations including insertions and deletions across different individual samples in a reproducible manner. Thus, we demonstrated a straightforward approach for rapidly validating mutations from cancer genome sequencing data.
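
The validation idea in the abstract above can be sketched roughly as follows: count reads carrying k-mers that exist only with the mutation versus only without it, in tumor and in matched normal reads, then test whether the difference is significant. This is not KmerVC's code. The helper names, the toy sequences and read counts, and the use of a one-sided Fisher exact test are assumptions for illustration; the abstract does not state which statistical test the package applies.

```python
from math import comb


def mutation_kmers(ref_context, alt_context, k):
    """Return (k-mers only in the reference context, k-mers only in the mutated context)."""
    def kmers(s):
        return {s[i:i + k] for i in range(len(s) - k + 1)}
    ref, alt = kmers(ref_context), kmers(alt_context)
    return ref - alt, alt - ref


def count_support(reads, kmer_set, k):
    """Number of reads that contain at least one k-mer from kmer_set."""
    return sum(
        any(read[i:i + k] in kmer_set for i in range(len(read) - k + 1))
        for read in reads
    )


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] under the hypergeometric null."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, x) * comb(n - row1, col1 - x) for x in range(a, upper + 1))
    return tail / comb(n, col1)


# Hypothetical example: a single C>T substitution in an 11-base context.
K = 5
ref_only, alt_only = mutation_kmers("GATTACAGGCT", "GATTATAGGCT", K)
tumor_reads = ["GATTATAGGCT"] * 6 + ["GATTACAGGCT"] * 4
normal_reads = ["GATTACAGGCT"] * 10

tumor_alt = count_support(tumor_reads, alt_only, K)
tumor_ref = count_support(tumor_reads, ref_only, K)
normal_alt = count_support(normal_reads, alt_only, K)
normal_ref = count_support(normal_reads, ref_only, K)
p_value = fisher_one_sided(tumor_alt, tumor_ref, normal_alt, normal_ref)
print(tumor_alt, tumor_ref, normal_alt, normal_ref, p_value)
```

Because the comparison uses only k-mer counts, it needs no read mapping, which is the property the abstract emphasizes; in practice the k-mers would also need to be unique in the genome, as the first step of the abstract describes.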

View details for DOI 10.1093/narcan/zcaa034

View details for PubMedID 33345188
SPATIAL SINGLE-CELL ANALYSIS OF COLORECTAL CANCER TUMOUR USING MULTIPLEXED IMAGING MASS CYTOMETRY Minh Tran, Su, A., Lee, H., Cruz, R., Pflieger, L., Dean, A., Quan Nguyen, Ji, H., Rhodes, T. BMJ PUBLISHING GROUP. 2020: A399

View details for DOI 10.1136/jitc-2020-SITC2020.0665

View details for Web of Science ID 000616665301184
IDENTIFY IMMUNE CELL TYPES AND BIOMARKERS ASSOCIATED WITH IMMUNE-RELATED ADVERSE EVENTS USING SINGLE CELL RNA SEQUENCING Chen, J., Pflieger, L., Grimes, S., Baker, T., Brems, M., Fulde, G., Snow, S., Howe, P., Sathe, A., Christensen, B., Ji, H., Rhodes, T. BMJ PUBLISHING GROUP. 2020: A39

View details for DOI 10.1136/jitc-2020-SITC2020.0062

View details for Web of Science ID 000616665300062
The COVID-19 XPRIZE and the need for scalable, fast, and widespread testing. Nature biotechnology MacKay, M. J., Hooker, A. C., Afshinnekoo, E., Salit, M., Kelly, J., Feldstein, J. V., Haft, N., Schenkel, D., Nambi, S., Cai, Y., Zhang, F., Church, G., Dai, J., Wang, C. L., Levy, S., Huber, J., Ji, H. P., Kriegel, A., Wyllie, A. L., Mason, C. E. 2020

View details for DOI 10.1038/s41587-020-0655-4

View details for PubMedID 32820257
A Summary of the 2020 Gastric Cancer Summit at Stanford University. Gastroenterology Huang, R. J., Koh, H., Hwang, J. H., Summit Leaders, Abnet, C. C., Alarid-Escudero, F., Amieva, M. R., Bruce, M. G., Camargo, M. C., Chan, A. T., Choi, I. J., Corvalan, A., Davis, J. L., Deapen, D., Epplein, M., Greenwald, D. A., Hamashima, C., Hur, C., Inadomi, J. M., Ji, H. P., Jung, H., Lee, E., Lin, B., Palaniappan, L. P., Parsonnet, J., Peek, R. M., Piazuelo, M. B., Rabkin, C. S., Shah, S. C., Smith, A., So, S., Stoffel, E. M., Umar, A., Wilson, K. T., Woo, Y., Yeoh, K. G. 2020

View details for DOI 10.1053/j.gastro.2020.05.100

View details for PubMedID 32707045
CRISPRpic: fast and precise analysis for CRISPR-induced mutations via prefixed index counting. NAR genomics and bioinformatics Lee, H., Chang, H. Y., Cho, S. W., Ji, H. P. 2020; 2 (2): lqaa012

Abstract

Analysis of CRISPR-induced mutations at a targeted locus can be achieved by polymerase chain reaction amplification followed by massively parallel sequencing. We developed a novel algorithm, named CRISPRpic, to analyze the sequencing reads from CRISPR experiments via exact-match counting and pattern searching. Compared to other methods based on sequence alignment, CRISPRpic provides precise mutation calling and ultrafast analysis of the sequencing results. The Python script of CRISPRpic is available at https://github.com/compbio/CRISPRpic.

View details for DOI 10.1093/nargab/lqaa012

View details for PubMedID 32118203
Entire landscape of epitopes from all possible missense mutations in human coding sequences. Lee, H., Greer, S., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2020: 118–19

View details for Web of Science ID 000522837200195
Identify biomarkers associated with immunotoxicities using single-cell RNAseq. Chen, J., Pflieger, L., Sathe, A., Grimes, S., Brems, M., Pattison, T., Christensen, B., Rhodes, T., Ji, H. AMER ASSOC CANCER RESEARCH. 2020: 32

View details for Web of Science ID 000518188200037
Comparative Genomic Analysis of High Grade Neuroendocrine Neoplasms across Diverse Organs Sun, T. Y., Van Hummelen, P., Martin, B., Xia, C., Zhao, L., Hornbacker, K., Lee, H., Ji, H., Kunz, P. KARGER. 2020: 51

View details for Web of Science ID 000522166100052
Comprehensive genomic sequencing of high-grade neuroendocrine neoplasms Sun, T., Van Hummelen, P., Martin, B., Xia, C., Lee, H., Zhao, L., Hornbacker, K., Ji, H., Kunz, P. L. AMER SOC CLINICAL ONCOLOGY. 2020

View details for Web of Science ID 000530922700602
Gastric Cancer Registry: A comprehensive patient-reported resource for multidisciplinary and translational genomic approaches to gastric cancer Almeda, A., Hooker, A., Lee, H., Mills, M., Van Hummelen, P., Ford, J. M., Ji, H. AMER SOC CLINICAL ONCOLOGY. 2020

View details for Web of Science ID 000530922700413
Strain-resolved microbiome sequencing reveals mobile elements that drive bacterial competition on a clinical timescale. Genome medicine Zlitni, S., Bishara, A., Moss, E. L., Tkachenko, E., Kang, J. B., Culver, R. N., Andermann, T. M., Weng, Z., Wood, C., Handy, C., Ji, H. P., Batzoglou, S., Bhatt, A. S. 2020; 12 (1): 50

Abstract

Populations of closely related microbial strains can be simultaneously present in bacterial communities such as the human gut microbiome. We recently developed a de novo genome assembly approach that uses read cloud sequencing to provide more complete microbial genome drafts, enabling precise differentiation and tracking of strain-level dynamics across metagenomic samples. In this case study, we present a proof-of-concept using read cloud sequencing to describe bacterial strain diversity in the gut microbiome of one hematopoietic cell transplantation patient over a 2-month time course and highlight temporal strain variation of gut microbes during therapy. The treatment was accompanied by diet changes and administration of multiple immunosuppressants and antimicrobials. We conducted short-read and read cloud metagenomic sequencing of DNA extracted from four longitudinal stool samples collected during the course of treatment of one hematopoietic cell transplantation (HCT) patient. After applying read cloud metagenomic assembly to discover strain-level sequence variants in these complex microbiome samples, we performed metatranscriptomic analysis to investigate differential expression of antibiotic resistance genes. Finally, we validated predictions from the genomic and metatranscriptomic findings through in vitro antibiotic susceptibility testing and whole genome sequencing of isolates derived from the patient stool samples. During the 56-day longitudinal time course that was studied, the patient's microbiome was profoundly disrupted and eventually dominated by Bacteroides caccae. Comparative analysis of B. caccae genomes obtained using read cloud sequencing together with metagenomic RNA sequencing allowed us to identify differences in substrain populations over time. Based on this, we predicted that particular mobile element integrations likely resulted in increased antibiotic resistance, which we further supported using in vitro antibiotic susceptibility testing. We find read cloud assembly to be useful in identifying key structural genomic strain variants within a metagenomic sample. These strains have fluctuating relative abundance over relatively short time periods in human microbiomes. We also find specific structural genomic variations that are associated with increased antibiotic resistance over the course of clinical treatment.

View details for DOI 10.1186/s13073-020-00747-0

View details for PubMedID 32471482
One Size Does Not Fit All: Marked Heterogeneity in Incidence of and Survival from Gastric Cancer among Asian American Subgroups. Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology Huang, R. J., Sharp, N., Talamoa, R. O., Ji, H. P., Hwang, J. H., Palaniappan, L. P. 2020

Abstract

Asian Americans are at higher risk for non-cardia gastric cancers (NCGCs) relative to non-Hispanic Whites (NHWs). Asian Americans are genetically, linguistically, and culturally heterogeneous, yet have mostly been treated as a single population in prior studies. This aggregation may obscure important subgroup-specific cancer patterns. We utilized data from 13 regional United States cancer registries from 1990-2014 to determine secular trends in incidence and survivorship from NCGC. Data were analyzed for NHWs and the six largest Asian American subgroups: Chinese, Japanese, Filipino, Korean, Vietnamese, and South Asian (Indian/Pakistani). There exists substantial heterogeneity in NCGC incidence between Asian subgroups, with Koreans (48.6 per 100,000 person-years) having seven-fold higher age-adjusted incidence than South Asians (7.4 per 100,000 person-years). Asians had generally earlier stages of diagnosis and higher rates of surgical resection compared to NHWs. All Asian subgroups also demonstrated higher five-year observed survival compared to NHWs, with Koreans (41.3%) and South Asians (42.8%) having survival double that of NHWs (20.1%, p<0.001). In multivariable regression, differences in stage of diagnosis and rates of resection partially explained the difference in survivorship between Asian subgroups. We find substantial differences in incidence, staging, histology, treatment, and survivorship from NCGC between Asian subgroups, data which challenge our traditional perceptions about gastric cancer in Asians. Both biological heterogeneity and cultural/environmental differences may underlie these findings. These data are relevant to the national discourse regarding the appropriate role of gastric cancer screening, and identify high-risk racial/ethnic subgroups who may benefit from customized risk attenuation programs.

View details for DOI 10.1158/1055-9965.EPI-19-1482

View details for PubMedID 32152216
Single cell genomic characterization reveals the cellular reprogramming of the gastric tumor microenvironment. Clinical cancer research : an official journal of the American Association for Cancer Research Sathe, A., Grimes, S. M., Lau, B. T., Chen, J., Suarez, C., Huang, R. J., Poultsides, G. A., Ji, H. P. 2020

Abstract

The tumor microenvironment (TME) consists of a heterogenous cellular milieu that can influence cancer cell behavior. Its characteristics have an impact on treatments such as immunotherapy. These features can be revealed with single-cell RNA sequencing (scRNA-seq). We hypothesized that scRNA-seq analysis of gastric cancer (GC) together with paired normal tissue and peripheral blood mononuclear cells (PBMCs) would identify critical elements of cellular deregulation not apparent with other approaches. scRNA-seq was conducted on seven patients with GC and one patient with intestinal metaplasia. We sequenced 56,167 cells comprising GC (32,407 cells), paired normal tissue (18,657 cells) and PBMCs (5,103 cells). Protein expression was validated by multiplex immunofluorescence. Tumor epithelium had copy number alterations, a distinct gene expression program from normal, with intra-tumor heterogeneity. GC TME was significantly enriched for stromal cells, macrophages, dendritic cells (DCs) and Tregs. TME-exclusive stromal cells expressed distinct extracellular matrix components than normal. Macrophages were transcriptionally heterogenous and did not conform to a binary M1/M2 paradigm. Tumor-DCs had a unique gene expression program compared to PBMC DCs. TME-specific cytotoxic T cells were exhausted with two heterogenous subsets. Helper, cytotoxic T, Treg and NK cells expressed multiple immune checkpoint or costimulatory molecules. Receptor-ligand analysis revealed TME-exclusive inter-cellular communication. Single-cell gene expression studies revealed widespread reprogramming across multiple cellular elements in the GC TME. Cellular remodeling was delineated by changes in cell numbers, transcriptional states and inter-cellular interactions. This characterization facilitates understanding of tumor biology and enables identification of novel targets including for immunotherapy.

View details for DOI 10.1158/1078-0432.CCR-19-3231

View details for PubMedID 32060101
Joint single cell DNA-seq and RNA-seq of gastric cancer cell lines reveals rules of in vitro evolution. NAR genomics and bioinformatics Andor, N., Lau, B. T., Catalanotti, C., Sathe, A., Kubit, M., Chen, J., Blaj, C., Cherry, A., Bangs, C. D., Grimes, S. M., Suarez, C. J., Ji, H. P. 2020; 2 (2): lqaa016

Abstract

Cancer cell lines are not homogeneous nor are they static in their genetic state and biological properties. Genetic, transcriptional and phenotypic diversity within cell lines contributes to the lack of experimental reproducibility frequently observed in tissue-culture-based studies. While cancer cell line heterogeneity has been generally recognized, there are no studies which quantify the number of clones that coexist within cell lines and their distinguishing characteristics. We used a single-cell DNA sequencing approach to characterize the cellular diversity within nine gastric cancer cell lines and integrated this information with single-cell RNA sequencing. Overall, we sequenced the genomes of 8824 cells, identifying between 2 and 12 clones per cell line. Using the transcriptomes of more than 28 000 single cells from the same cell lines, we independently corroborated 88% of the clonal structure determined from single cell DNA analysis. For one of these cell lines, we identified cell surface markers that distinguished two subpopulations and used flow cytometry to sort these two clones. We identified substantial proportions of replicating cells in each cell line, assigned these cells to subclones detected among the G0/G1 population and used the proportion of replicating cells per subclone as a surrogate of each subclone's growth rate.

View details for DOI 10.1093/nargab/lqaa016

View details for PubMedID 32215369

View details for PubMedCentralID PMC7079336
OVERCOMING HIGH NANOPORE BASECALLER ERROR RATES FOR DNA STORAGE VIA BASECALLER-DECODER INTEGRATION AND CONVOLUTIONAL CODES Chandak, S., Neu, J., Tatwawadi, K., Mardia, J., Lau, B., Kubit, M., Hulett, R., Griffin, P., Wootters, M., Weissman, T., Ji, H. IEEE. 2020: 8822–26

View details for Web of Science ID 000615970409020
Whole genome analysis identifies the association of TP53 genomic deletions with lower survival in Stage III colorectal cancer. Scientific reports Xia, L. C., Van Hummelen, P., Kubit, M., Lee, H., Bell, J. M., Grimes, S. M., Wood-Bouwens, C., Greer, S. U., Barker, T., Haslem, D. S., Ford, J. M., Fulde, G., Ji, H. P., Nadauld, L. D. 2020; 10 (1): 5009

Abstract

DNA copy number aberrations (CNA) are frequently observed in colorectal cancers (CRC). There is an urgent need for CNA-based biomarkers in clinics. For Stage III CRC, if combined with imaging or pathologic evidence, these markers promise more precise care. We conducted this Stage III specific biomarker discovery with a cohort of 134 CRCs, and with a newly developed high-efficiency CNA profiling protocol. Specifically, we developed the profiling protocol for tumor-normal matched tissue samples based on low-coverage clinical whole-genome sequencing (WGS). We demonstrated the protocol's accuracy and robustness by a systematic benchmark with microarray, high-coverage whole-exome and -genome approaches, where the low-coverage WGS-derived CNA segments were highly accordant (PCC >0.95) with those derived from microarray, and they were substantially less variable if compared to exome-derived segments. A lasso-based model and multivariate Cox regression analysis identified a chromosome 17p loss, containing the TP53 tumor suppressor gene, that was significantly associated with reduced survival (P = 0.0139, HR = 1.688, 95% CI = [1.112-2.562]), which was validated by an independent cohort of 187 Stage III CRCs. In summary, this low-coverage WGS protocol has high sensitivity, high resolution and low cost and the identified 17p-loss is an effective poor prognosis marker for Stage III patients.

View details for DOI 10.1038/s41598-020-61643-6

View details for PubMedID 32193467
Ultra-fast detection and quantification of nucleic acids by amplification-free fluorescence assay. The Analyst Uhd, J., Miotke, L., Ji, H. P., Dunaeva, M., Pruijn, G. J., Jørgensen, C. D., Kristoffersen, E. L., Birkedal, V., Yde, C. W., Nielsen, F. C., Hansen, J., Astakhova, K. 2020

Abstract

Two types of clinically important nucleic acid biomarkers, microRNA (miRNA) and circulating tumor DNA (ctDNA), were detected and quantified from human serum using an amplification-free fluorescence hybridization assay. Specifically, miRNAs hsa-miR-223-3p and hsa-miR-486-5p with relevance for rheumatoid arthritis and cancer-related mutations BRAF and KRAS of ctDNA were directly measured. The required oligonucleotide probes for the assay were rationally designed and synthesized through a novel "clickable" approach which is time- and cost-effective. With no need for isolating nucleic acid components from serum, the fluorescence-based assay took only 1 hour. Detection and absolute quantification of targets was successfully achieved despite their notoriously low abundance, with a precision down to individual nucleotides. Obtained miRNA and ctDNA amounts showed overall a good correlation with current techniques. With appropriate probes, our novel assay and signal boosting approach could become a useful tool for point-of-care measurement of other low abundance nucleic acid biomarkers.

View details for DOI 10.1039/d0an00676a

View details for PubMedID 32648858
Site to Site Comparison of Follicular Lymphoma Biopsies By Single Cell RNA Sequencing Haebe, S., Shree, T., Sathe, A., Day, G., Lee, H., Czerwinski, D. K., Grimes, S., Ji, H., Levy, R. AMER SOC HEMATOLOGY. 2019

View details for DOI 10.1182/blood-2019-129445

View details for Web of Science ID 000518218500526
Dynamic Immune Modulation Seen By Single Cell RNA-Sequencing of Serial Lymphoma Biopsies in Patients Undergoing in Situ Vaccination Shree, T., Haebe, S., Sathe, A., Day, G., Lee, H., Czerwinski, D. K., Grimes, S., Ji, H., Levy, R. AMER SOC HEMATOLOGY. 2019

View details for DOI 10.1182/blood-2019-131684

View details for Web of Science ID 000577160402034
Structural variant analysis for linked-read sequencing data with gemtools BIOINFORMATICS Greer, S. U., Ji, H. P. 2019; 35 (21): 4397–99

View details for DOI 10.1093/bioinformatics/btz239

View details for Web of Science ID 000499323900027
Single cell RNA sequencing of serial tumor and blood biopsies from lymphoma patients undergoing in situ vaccination Shree, T., Sathe, A., Ji, H., Levy, R. AMER ASSOC CANCER RESEARCH. 2019

View details for DOI 10.1158/1538-7445.AM2019-4045

View details for Web of Science ID 000488279403441
Comprehensive characterization of gastric cancer at single-cell resolution Chen, J., Sathe, A., Grimes, S., Greer, S., Lau, B., Renschler, A., Poultsides, G., Suarez, C., Ji, H. AMER ASSOC CANCER RESEARCH. 2019

View details for DOI 10.1158/1538-7445.SABCS18-151

View details for Web of Science ID 000488129901333
iGRAMMy: Cloud-based characterization of microbial landscape in colorectal cancers Xia, L. C., Ai, D., Guo, M., Ji, H. AMER ASSOC CANCER RESEARCH. 2019

View details for DOI 10.1158/1538-7445.AM2019-5022

View details for Web of Science ID 000488279405371
Single cell RNA sequencing reveals multiple adaptive resistance mechanisms to regorafenib in colon cancer Sathe, A., Lau, B. T., Grimes, S., Greer, S., Ji, H. AMER ASSOC CANCER RESEARCH. 2019

View details for DOI 10.1158/1538-7445.SABCS18-2105

View details for Web of Science ID 000488279400102
A functional CRISPR/Cas9 screen identifies kinases that modulate FGFR inhibitor response in gastric cancer ONCOGENESIS Chen, J., Bell, J., Lau, B. T., Whittaker, T., Stapleton, D., Ji, H. P. 2019; 8

View details for DOI 10.1038/s41389-019-0145-z

View details for Web of Science ID 000467678200003
Structural variant analysis for linked-read sequencing data with gemtools. Bioinformatics (Oxford, England) Greer, S. U., Ji, H. P. 2019

Abstract

SUMMARY: Linked-read sequencing generates synthetic long reads which are useful for the detection and analysis of structural variants (SVs). The software associated with 10X Genomics linked-read sequencing, Long Ranger, generates the essential output files (BAM, VCF, SV BEDPE) necessary for downstream analyses. However, performing downstream analyses requires users to customize their own tools to handle the unique features of linked-read sequencing data. Here, we describe gemtools, a collection of tools for the downstream and in-depth analysis of structural variants from linked-read data. Gemtools uses the barcoded aligned reads and the Megabase-scale phase blocks to determine haplotypes of structural variant breakpoints and delineate complex breakpoint configurations at the resolution of single DNA molecules. The gemtools package is a suite of tools that provides the user with the flexibility to perform basic functions on their linked-read sequencing output in order to address even more questions. AVAILABILITY AND IMPLEMENTATION: The gemtools package is freely available for download at: https://github.com/sgreer77/gemtools. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

View details for PubMedID 30938757
Single-cell transcriptome analysis identifies distinct cell types and niche signaling in a primary gastric organoid model. Scientific reports Chen, J., Lau, B. T., Andor, N., Grimes, S. M., Handy, C., Wood-Bouwens, C., Ji, H. P. 2019; 9 (1): 4536

Abstract

The diverse cellular milieu of the gastric tissue microenvironment plays a critical role in normal tissue homeostasis and tumor development. However, few cell culture models can recapitulate the tissue microenvironment and intercellular signaling in vitro. We used a primary tissue culture system to generate a murine p53 null gastric tissue model containing both epithelium and mesenchymal stroma. To characterize the microenvironment and niche signaling, we used single cell RNA sequencing (scRNA-Seq) to determine the transcriptomes of 4,391 individual cells. Based on specific markers, we identified epithelial cells, fibroblasts and macrophages in initial tissue explants during organoid formation. The majority of macrophages were polarized towards the wound-healing and tumor-promoting M2 type. Over time, the organoids maintained both epithelial and fibroblast lineages with the features of immature mouse stomach. We detected a subset of cells in both lineages expressing Lgr5, one of the stem cell markers. We examined the lineage-specific Wnt signaling activation, and identified that Rspo3 was specifically expressed in the fibroblast lineage, providing an endogenous source of the R-spondin to activate Wnt signaling. Our studies demonstrate that this primary tissue culture system enables one to study gastric tissue niche signaling and immune response in vitro.

View details for PubMedID 30872643
Single-cell transcriptome analysis identifies distinct cell types and niche signaling in a primary gastric organoid model SCIENTIFIC REPORTS Chen, J., Lau, B. T., Andor, N., Grimes, S. M., Handy, C., Wood-Bouwens, C., Ji, H. P. 2019; 9

View details for DOI 10.1038/s41598-019-40809-x

View details for Web of Science ID 000461159600013
Haplotype-resolved and integrated genome analysis of the cancer cell line HepG2. Nucleic acids research Zhou, B., Ho, S. S., Greer, S. U., Spies, N., Bell, J. M., Zhang, X., Zhu, X., Arthur, J. G., Byeon, S., Pattni, R., Saha, I., Huang, Y., Song, G., Perrin, D., Wong, W. H., Ji, H. P., Abyzov, A., Urban, A. E. 2019

Abstract

HepG2 is one of the most widely used human cancer cell lines in biomedical research and one of the main cell lines of ENCODE. Although the functional genomic and epigenomic characteristics of HepG2 are extensively studied, its genome sequence has never been comprehensively analyzed and higher order genomic structural features are largely unknown. The high degree of aneuploidy in HepG2 renders traditional genome variant analysis methods challenging and partially ineffective. Correct and complete interpretation of the extensive functional genomics data from HepG2 requires an understanding of the cell line's genome sequence and genome structure. Using a variety of sequencing and analysis methods, we identified a wide spectrum of genome characteristics in HepG2: copy numbers of chromosomal segments at high resolution, SNVs and Indels (corrected for aneuploidy), regions with loss of heterozygosity, phased haplotypes extending to entire chromosome arms, retrotransposon insertions and structural variants (SVs) including complex and somatic genomic rearrangements. A large number of SVs were phased, sequence assembled and experimentally validated. We re-analyzed published HepG2 datasets for allele-specific expression and DNA methylation and assembled an allele-specific CRISPR/Cas9 targeting map. We demonstrate how deeper insights into genomic regulatory complexity are gained by adopting a genome-integrated framework.

View details for PubMedID 30864654
Single-cell RNA-Seq of follicular lymphoma reveals malignant B-cell types and coexpression of T-cell immune checkpoints BLOOD Andor, N., Simonds, E. F., Czerwinski, D. K., Chen, J., Grimes, S. M., Wood-Bouwens, C., Zheng, G. X. Y., Kubit, M. A., Greer, S., Weiss, W. A., Levy, R., Ji, H. P. 2019; 133 (10): 1119–29

View details for DOI 10.1182/blood-2018-08-862292

View details for Web of Science ID 000461506600016
Comprehensive, integrated, and phased whole-genome analysis of the primary ENCODE cell line K562 GENOME RESEARCH Zhou, B., Ho, S. S., Greer, S. U., Zhu, X., Bell, J. M., Arthur, J. G., Spies, N., Zhang, X., Byeon, S., Pattni, R., Ben-Efraim, N., Haney, M. S., Haraksingh, R. R., Song, G., Ji, H. P., Perrin, D., Wong, W. H., Abyzov, A., Urban, A. E. 2019; 29 (3): 472–84

View details for DOI 10.1101/gr.234948.118

View details for Web of Science ID 000460119900014
Comprehensive, integrated, and phased whole-genome analysis of the primary ENCODE cell line K562. Genome research Zhou, B., Ho, S. S., Greer, S. U., Zhu, X., Bell, J. M., Arthur, J. G., Spies, N., Zhang, X., Byeon, S., Pattni, R., Ben-Efraim, N., Haney, M. S., Haraksingh, R. R., Song, G., Ji, H. P., Perrin, D., Wong, W. H., Abyzov, A., Urban, A. E. 2019

Abstract

K562 is widely used in biomedical research. It is one of three tier-one cell lines of ENCODE and also most commonly used for large-scale CRISPR/Cas9 screens. Although its functional genomic and epigenomic characteristics have been extensively studied, its genome sequence and genomic structural features have never been comprehensively analyzed. Such information is essential for the correct interpretation and understanding of the vast troves of existing functional genomics and epigenomics data for K562. We performed and integrated deep-coverage whole-genome (short-insert), mate-pair, and linked-read sequencing as well as karyotyping and array CGH analysis to identify a wide spectrum of genome characteristics in K562: copy numbers (CN) of aneuploid chromosome segments at high-resolution, SNVs and indels (both corrected for CN in aneuploid regions), loss of heterozygosity, megabase-scale phased haplotypes often spanning entire chromosome arms, structural variants (SVs), including small and large-scale complex SVs and nonreference retrotransposon insertions. Many SVs were phased, assembled, and experimentally validated. We identified multiple allele-specific deletions and duplications within the tumor suppressor gene FHIT. Taking aneuploidy into account, we reanalyzed K562 RNA-seq and whole-genome bisulfite sequencing data for allele-specific expression and allele-specific DNA methylation. We also show examples of how deeper insights into regulatory complexity are gained by integrating genomic variant information and structural context with functional genomics and epigenomics data. Furthermore, using K562 haplotype information, we produced an allele-specific CRISPR targeting map. This comprehensive whole-genome analysis serves as a resource for future studies that utilize K562 as well as a framework for the analysis of other cancer genomes.

View details for PubMedID 30737237
Targeted short read sequencing and assembly of re-arrangements and candidate gene loci provide megabase diplotypes. Nucleic acids research Shin, G., Greer, S. U., Xia, L. C., Lee, H., Zhou, J., Boles, T. C., Ji, H. P. 2019

Abstract

The human genome is composed of two haplotypes, otherwise called diplotypes, which denote phased polymorphisms and structural variations (SVs) that are derived from both parents. Diplotypes place genetic variants in the context of cis-related variants from a diploid genome. As a result, they provide valuable information about hereditary transmission, context of SV, regulation of gene expression and other features which are informative for understanding human genetics. Successful diplotyping with short read whole genome sequencing generally requires either a large population or parent-child trio samples. To overcome these limitations, we developed a targeted sequencing method for generating megabase (Mb)-scale haplotypes with short reads. One selects specific 0.1-0.2 Mb high molecular weight DNA targets with custom-designed Cas9-guide RNA complexes followed by sequencing with barcoded linked reads. To test this approach, we designed three assays, targeting the BRCA1 gene, the entire 4-Mb major histocompatibility complex locus and 18 well-characterized SVs, respectively. Using an integrated alignment- and assembly-based approach, we generated comprehensive variant diplotypes spanning the entirety of the targeted loci and characterized SVs with exact breakpoints. Our results were comparable in quality to long read sequencing.

View details for DOI 10.1093/nar/gkz661

View details for PubMedID 31350896
Improved read/write cost tradeoff in DNA-based data storage using LDPC codes Chandak, S., Tatwawadi, K., Lau, B., Mardia, J., Kubit, M., Neu, J., Griffin, P., Wootters, M., Weissman, T., Ji, H. IEEE. 2019: 147–56

View details for Web of Science ID 000535355700022
scPred: accurate supervised method for cell-type classification from single-cell RNA-seq data. Genome biology Alquicira-Hernandez, J., Sathe, A., Ji, H. P., Nguyen, Q., Powell, J. E. 2019; 20 (1): 264

Abstract

Single-cell RNA sequencing has enabled the characterization of highly specific cell types in many tissues, as well as both primary and stem cell-derived cell lines. An important facet of these studies is the ability to identify the transcriptional signatures that define a cell type or state. In theory, this information can be used to classify an individual cell based on its transcriptional profile. Here, we present scPred, a new generalizable method that is able to provide highly accurate classification of single cells, using a combination of unbiased feature selection from a reduced-dimension space and a machine-learning probability-based prediction method. We apply scPred to scRNA-seq data from pancreatic tissue, mononuclear cells, colorectal tumor biopsies, and circulating dendritic cells and show that scPred is able to classify individual cells with high accuracy. The generalized method is available at https://github.com/powellgenomicslab/scPred/.

View details for DOI 10.1186/s13059-019-1862-5

View details for PubMedID 31829268
Therapeutic Monitoring of Circulating DNA Mutations in Metastatic Cancer with Personalized Digital PCR. The Journal of molecular diagnostics : JMD Wood-Bouwens, C. M., Haslem, D., Moulton, B., Almeda, A. F., Lee, H., Heestand, G. M., Nadauld, L. D., Ji, H. P. 2019

Abstract

As a high-performance solution for longitudinal monitoring of patients being treated for metastatic cancer, we developed a single-color digital PCR (dPCR) assay that detects and quantifies specific cancer mutations present in circulating tumor DNA (ctDNA). This customizable assay has a high sensitivity of detection. One can detect a mutation allelic fraction of 0.1%, equivalent to three mutation-bearing DNA molecules among 3,000 genome equivalents. The objective of this study was to validate the use of personalized dPCR mutation assays to monitor patients with metastatic cancer. We compared our digital PCR results to serum biomarkers indicating disease progression or response. Patients had metastatic colorectal, biliary, breast, lung and melanoma cancers. Mutations occurred in essential cancer drivers such as BRAF, KRAS and PIK3CA. We monitored patients over multiple cycles of treatment up to a year. All patients had detectable ctDNA mutations. Our results correlated with serum markers of metastatic cancer burden including CEA, CA-19-9, and CA-15-3, and corresponded qualitatively to imaging studies. We observed corresponding trends among these patients receiving active treatment with chemotherapy or targeted agents. For example, in one patient under active treatment, we detected increasing quantities of ctDNA molecules over time, indicating recurrence of tumor. Our study demonstrates that personalized digital PCR enables longitudinal monitoring of patients with metastatic cancer and may be a useful indicator for treatment response.

View details for DOI 10.1016/j.jmoldx.2019.10.008

View details for PubMedID 31837432
Modeling the Evolution of Ploidy in a Resource Restricted Environment Kimmel, G., Barnholtz-Sloan, J., Ji, H., Altrock, P., Andor, N. edited by Bebis, G., Benos, T., Chen, K., Jahn, K., Lima, E. SPRINGER INTERNATIONAL PUBLISHING AG. 2019: 29–34

View details for DOI 10.1007/978-3-030-35210-3_2

View details for Web of Science ID 000611467600002
Covalent 'click chemistry'-based attachment of DNA onto solid phase enables iterative molecular analysis. Analytical chemistry Lau, B. T., Ji, H. P. 2019

Abstract

Molecular analysis of DNA samples with limited quantities can be challenging. Repeatedly sequencing the original DNA molecules from a given sample would overcome many issues related to accurate genetic analysis and mitigate issues with processing small amounts of DNA analyte. Moreover, an iterative, replicated analysis of the same DNA molecule has the potential to improve genetic characterization. Herein, we demonstrate that the use of 'click'-based attachment of DNA sequencing libraries onto an agarose bead support enables repetitive primer extension assays for specific genomic DNA targets such as gene exons. We validated the performance of this assay for evaluating specific genetic alterations in both normal and cancer reference standard DNA samples. We demonstrate the stability of conjugated DNA libraries and related sequencing results over the course of independent serial assays spanning several months from the same set of samples. Finally, we applied this method to DNA derived from a tumor sample and demonstrated improved mutation detection accuracy.

View details for PubMedID 30652472
Single-cell RNA-Seq of lymphoma cancers reveals malignant B cell types and co-expression of T cell immune checkpoints. Blood Andor, N., Simonds, E. F., Czerwinski, D. K., Chen, J., Grimes, S. M., Wood-Bouwens, C., Zheng, G. X., Kubit, M. A., Greer, S., Weiss, W. A., Levy, R., Ji, H. P. 2018

Abstract

Follicular lymphoma (FL) is a low-grade B cell malignancy that transforms into a highly aggressive and lethal disease at a rate of 2% per year. Perfect isolation of the malignant B cell population from a surgical biopsy is a significant challenge, masking important FL biology, such as immune checkpoint co-expression patterns. To resolve the underlying transcriptional networks of follicular B cell lymphomas we analyzed the transcriptomes of 34,188 cells derived from six primary FL tumors. For each tumor, we identified normal immune subpopulations and malignant B cells based on gene expression. We used multicolor flow cytometry analysis of the same tumors to confirm our assignments of cellular lineages and validate our predictions of expressed proteins. Comparison of gene expression between matched malignant and normal B cells from the same patient revealed tumor-specific features. Malignant B cells exhibited restricted immunoglobulin light chain expression (either Ig Kappa or Ig Lambda), as well as the expected upregulation of the BCL2 gene, but also down-regulation of the FCER2, CD52 and MHC class II genes. By analyzing thousands of individual cells per patient tumor, we identified the mosaic of malignant B cell subclones that coexist within a FL and examined the characteristics of tumor-infiltrating T cells. We identified genes co-expressed with immune checkpoint molecules, such as CEBPA and B2M in Tregs, providing a better understanding of the gene networks involved in immune regulation. In summary, parallel measurement of single-cell expression in thousands of tumor cells and tumor-infiltrating lymphocytes can be used to obtain a systems-level view of the tumor microenvironment and identify new avenues for therapeutic development.

View details for PubMedID 30591526
Single Cell RNA Sequencing of Serial Tumor and Blood Biopsies from Lymphoma Patients on an in Situ Vaccination Clinical Trial Shree, T., Sathe, A., Czerwinski, D. K., Long, S. R., Ji, H., Levy, R. AMER SOC HEMATOLOGY. 2018

View details for DOI 10.1182/blood-2018-99-119925

View details for Web of Science ID 000454842803143
Multi-patient Longitudinal Monitoring of Cancer Mutations from Circulating DNA of using Personalized Single Color Digital PCR Assays Wood-Bouwens, C. M., Haslem, D., Lau, B. T., Almeda, A., Moulton, B., Romero, R., Nadauld, L., Ji, H. P. ELSEVIER SCIENCE INC. 2018: 1039

View details for Web of Science ID 000448637200466
SVEngine: an efficient and versatile simulator of genome structural variations with features of cancer clonal evolution. GigaScience Xia, L. C., Ai, D., Lee, H., Andor, N., Li, C., Zhang, N. R., Ji, H. P. 2018

Abstract

Background: Simulating genome sequence data with variant features facilitates the development and benchmarking of structural variant analysis programs. However, there are only a few data simulators that provide structural variants in silico and even fewer that provide variants with different allelic fraction and haplotypes. Findings: We developed SVEngine, an open source tool to address this need. SVEngine simulates next generation sequencing data with embedded structural variations. As input, SVEngine takes template haploid sequences (FASTA) and an external variant file, a variant distribution file and/or a clonal phylogeny tree file (NEWICK). Subsequently, it simulates and outputs sequence contigs (FASTAs), sequence reads (FASTQs) and/or post-alignment files (BAMs). All of the files contain the desired variants, along with BED files containing the ground truth. SVEngine's flexible design process enables one to specify size, position, and allelic fraction for deletions, insertions, duplications, inversions and translocations. Finally, SVEngine simulates sequence data that replicates the characteristics of a sequencing library with mixed sizes of DNA insert molecules. SVEngine is highly parallelized to reduce the simulation time. Conclusions: We demonstrated the versatile features of SVEngine and its improved runtime compared with other available simulators. SVEngine's features include the simulation of locus-specific variant frequency designed to mimic the phylogeny of cancer clonal evolution. We validated SVEngine's accuracy by simulating genome-wide structural variants of NA12878 and a heterogeneous cancer genome. Our evaluation included checking various sequencing mapping features such as coverage change, read clipping, insert size shift and neighbouring hanging read pairs for representative variant types. Structural variant callers Lumpy and Manta and tumor heterogeneity estimator THetA2 were able to perform realistically on the simulated data. SVEngine is implemented as a standard Python package and is freely available for academic use at: https://bitbucket.org/charade/svengine.

View details for PubMedID 29982625
SVEngine: an efficient and versatile simulator of genome structural variations with features of cancer clonal evolution GIGASCIENCE Xia, L., Ai, D., Lee, H., Andor, N., Li, C., Zhang, N. R., Ji, H. P. 2018; 7 (7)

View details for DOI 10.1093/gigascience/giy081

View details for Web of Science ID 000440187400001
Integrated single-cell DNA and RNA analysis of intratumoral heterogeneity and immune lineages in colorectal and gastric tumor biopsies Lau, B., Andor, N., Sathe, A., Wood-Bouwens, C., Poultsides, G., Ji, H. AMER ASSOC CANCER RESEARCH. 2018

View details for DOI 10.1158/1538-7445.AM2018-4347

View details for Web of Science ID 000468819503011
Characterization of colorectal liver metastasis at single-cell resolution reveals dynamic interplay in the tumor microenvironment Sathe, A., Chen, J., Wood-Bouwens, C., Almeda, A., Lau, B., Grimes, S. M., Poultsides, G. A., Ji, H. AMER ASSOC CANCER RESEARCH. 2018

View details for DOI 10.1158/1538-7445.AM2018-2126

View details for Web of Science ID 000468818904508
Chromosome-scale haplotyping enables comprehensive discovery of cancer rearrangements and germline-related susceptibility mutations Greer, S. U., Lau, B. T., Nadauld, L. D., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2018

View details for DOI 10.1158/1538-7445.AM2018-1280

View details for Web of Science ID 000468818903252
Highly sensitive digital detection of circulating DNA cancer mutations using synthetic genome standards Wood-Bouwens, C. M., St Onge, R. P., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2018

View details for DOI 10.1158/1538-7445.AM2018-1604

View details for Web of Science ID 000468818904003
Linked read whole genome sequencing reveals pervasive chromosomal level instability and novel rearrangements in brain metastases from colorectal cancer Xia, L. C., Bell, J. M., Wood-Bouwens, C., King, D. A., Shin, G., Greer, S., Connolly, I. D., Gephart, M. H., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2018

View details for DOI 10.1158/1538-7445.AM2018-4334

View details for Web of Science ID 000468819502525
Improved detection and identification of microsatellite instability features in colorectal cancer: Implications for immunotherapy Shin, G., Lee, H., Grimes, S. M., Kubit, M. A., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2018

View details for DOI 10.1158/1538-7445.AM2018-421

View details for Web of Science ID 000468818901486
High-quality CNV segments from low-coverage whole genome sequencing from FFPE cancer biopsies based on an evaluation of multiple CNV tools Lee, H., Xia, L., Greer, S., Bell, J., Grimes, S. M., Bouwens, C., Shin, G., Lau, B. T. C., Johnson, L., Andor, N., Day, K., Miller, M., Escobar, H., Nadauld, L., Ji, H. P., Van Hummelen, P. AMER ASSOC CANCER RESEARCH. 2018

View details for DOI 10.1158/1538-7445.AM2018-438

View details for Web of Science ID 000468818901502
Mapping the comprehensive landscape of missense-mutation neoantigens across the human genome Lee, H., Greer, S. U., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2018

View details for DOI 10.1158/1538-7445.AM2018-1298

View details for Web of Science ID 000468818903270
Loss of TP53 as a prognostic biomarker of poor survival in stage III colorectal cancer patients. Nadauld, L., Van Hummelen, P., Xia, L., Day, K., Lee, H., Bell, J., Grimes, S. M., Kubit, M., Miller, M., Shin, G., Wood, C., Greer, S., Escobar, H., Haslem, D. S., Ji, H. AMER SOC CLINICAL ONCOLOGY. 2018

View details for DOI 10.1200/JCO.2018.36.15_suppl.e15588

View details for Web of Science ID 000442916005065
Identification of large rearrangements in cancer genomes with barcode linked reads. Nucleic acids research Xia, L. C., Bell, J. M., Wood-Bouwens, C., Chen, J. J., Zhang, N. R., Ji, H. P. 2018; 46 (4): e19

Abstract

Large genomic rearrangements involve inversions, deletions and other structural changes that span megabase segments of the human genome. This category of genetic aberration is the cause of many hereditary genetic disorders and contributes to pathogenesis of diseases like cancer. We developed a new algorithm called ZoomX for analysing barcode-linked sequence reads; these sequences can be traced to individual high molecular weight DNA molecules (>50 kb). To generate barcode linked sequence reads, we employ a library preparation technology (10X Genomics) that uses droplets to partition and barcode DNA molecules. Using linked read data from whole genome sequencing, we identify large genomic rearrangements, typically greater than 200kb, even when they are only present in low allelic fractions. Our algorithm uses a Poisson scan statistic to identify genomic rearrangement junctions, determines counts of junction-spanning molecules and applies Fisher's exact test to determine statistical significance for somatic aberrations. Utilizing a well-characterized human genome, we benchmarked this approach to accurately identify large rearrangements. Subsequently, we demonstrated that our algorithm identifies somatic rearrangements when present in lower allelic fractions as occurs in tumors. We characterized a set of complex cancer rearrangements with multiple classes of structural aberrations and with possible roles in oncogenesis.
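
The last step described in this abstract, testing whether junction-spanning molecule counts differ between samples, can be illustrated with a small standard-library sketch. This is not ZoomX code: the 2x2 table layout (spanning versus non-spanning molecules in tumor versus matched normal) and the function name `fisher_exact_greater` are assumptions made for illustration, and the Poisson scan step is not shown.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test on the 2x2 table

                    spanning   not spanning
        tumor          a            b
        normal         c            d

    Returns the probability, with all margins fixed, of seeing `a` or more
    junction-spanning molecules in the tumor row (hypergeometric upper tail).
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1 = a + b            # molecules sampled from tumor
    col1 = a + c            # junction-spanning molecules overall
    n = a + b + c + d       # all molecules
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)
```

For a hypothetical junction spanned by 8 of 100 tumor molecules and 0 of 100 normal molecules, `fisher_exact_greater(8, 92, 0, 100)` returns a p-value below 0.01, whereas equal counts in both samples give no evidence of a somatic event.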

View details for PubMedID 29186506
Single Color Multiplexed ddPCR Copy Number Measurements and Single Nucleotide Variant Genotyping DIGITAL PCR: METHODS AND PROTOCOLS Wood-Bouwens, C. M., Ji, H. P. edited by KarlinNeumann, G., Bizouarn, F. 2018; 1768: 323–33

View details for DOI 10.1007/978-1-4939-7778-9_18

View details for Web of Science ID 000443084300019
Robust Multiplexed Clustering and Denoising of Digital PCR Assays by Data Gridding ANALYTICAL CHEMISTRY Lau, B. T., Wood-Bouwens, C., Ji, H. P. 2017; 89 (22): 11913–17

Abstract

Digital PCR (dPCR) relies on the analysis of individual partitions to accurately quantify nucleic acid species. The most widely used analysis method requires manual clustering through individual visual inspection. Some automated analysis methods have emerged but do not robustly account for multiplexed targets, low target concentration, and assay noise. In this study, we describe an open source analysis software called Calico that uses "data gridding" to increase the sensitivity of clustering toward small clusters. Our workflow also generates quality score metrics in order to gauge and filter individual assay partitions by how well they were classified. We applied our analysis algorithm to multiplexed droplet-based digital PCR data sets in both EvaGreen and probes-based schemes, and targeted the oncogenic BRAF V600E and KRAS G12D mutations. We demonstrate an automated clustering sensitivity of down to 0.1% mutant fraction and filtering of artifactual assay partitions from low quality DNA samples. Overall, we demonstrate a vastly improved approach to analyzing ddPCR data that can be applied to clinical use, where automation and reproducibility are critical.
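
One plausible reading of "data gridding" is that droplet amplitudes are binned onto a coarse two-dimensional grid, so that a large negative cluster and a small mutant cluster each reduce to a handful of occupied cells before clustering. The sketch below illustrates only that idea. It is not the Calico implementation, and the function name `grid_points` and the `cell_size` parameter are hypothetical.

```python
def grid_points(points, cell_size):
    """Collapse 2-D points onto a square grid.

    Returns (centres, counts): `centres` holds one representative point (the
    cell centre) per occupied cell, and `counts` maps each occupied cell index
    to the number of original points that fell inside it.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    counts = {}
    for x, y in points:
        key = (int(x // cell_size), int(y // cell_size))
        counts[key] = counts.get(key, 0) + 1
    centres = [((i + 0.5) * cell_size, (j + 0.5) * cell_size) for i, j in sorted(counts)]
    return centres, counts


# Hypothetical data: 1,000 negative droplets packed together and 3 mutant droplets.
negatives = [(1000 + i % 10, 1000 + (i // 10) % 10) for i in range(1000)]
mutants = [(5000 + i, 5000) for i in range(3)]
centres, counts = grid_points(negatives + mutants, cell_size=100)
```

After gridding, both clusters are represented by one cell each, so a downstream clustering step sees them with equal weight, while `counts` retains the droplet numbers needed for quantification.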

View details for PubMedID 29083143
Chromosome-scale mega-haplotypes enable digital karyotyping of cancer aneuploidy NUCLEIC ACIDS RESEARCH Bell, J. M., Lau, B. T., Greer, S. U., Wood-Bouwens, C., Xia, L. C., Connolly, I. D., Gephart, M. H., Ji, H. P. 2017; 45 (19): e162

Abstract

Genomic instability is a frequently occurring feature of cancer that involves large-scale structural alterations. These somatic changes in chromosome structure include duplication of entire chromosome arms and aneuploidy where chromosomes are duplicated beyond normal diploid content. However, the accurate determination of aneuploidy events in cancer genomes is a challenge. Recent advances in sequencing technology allow the characterization of haplotypes that extend megabases along the human genome using high molecular weight (HMW) DNA. For this study, we employed a library preparation method in which sequence reads have barcodes linked to single HMW DNA molecules. Barcode-linked reads are used to generate extended haplotypes on the order of megabases. We developed a method that leverages haplotypes to identify chromosomal segmental alterations in cancer and uses this information to join haplotypes together, thus extending the range of phased variants. With this approach, we identified mega-haplotypes that encompass entire chromosome arms. We characterized the chromosomal arm changes and aneuploidy events in a manner that offers similar information as a traditional karyotype but with the benefit of DNA sequence resolution. We applied this approach to characterize aneuploidy and chromosomal alterations from a series of primary colorectal cancers.

View details for PubMedID 28977555

View details for PubMedCentralID PMC5737808
High Performance Detection of Cancer Mutations from Circulating DNA Using Single Color Digital PCR Lau, B. T., Handy, C. M., Lee, H., Wood-Bouwens, C. M., Ji, H. P. ELSEVIER SCIENCE INC. 2017: 1064

View details for Web of Science ID 000414275900512
Synthetic lethality screen identifies novel druggable targets in the MYC pathway Li, Y., Deutzmann, A., Bell, J., Ji, H., Felsher, D. AMER ASSOC CANCER RESEARCH. 2017

View details for DOI 10.1158/1538-8514.SYNTHLETH-PR02

View details for Web of Science ID 000412270800076
Single molecule counting and assessment of random molecular tagging errors with transposable giga-scale error-correcting barcodes BMC GENOMICS Lau, B. T., Ji, H. P. 2017; 18: 745

Abstract

RNA-Seq measures gene expression by counting sequence reads belonging to unique cDNA fragments. Molecular barcodes commonly in the form of random nucleotides were recently introduced to improve gene expression measures by detecting amplification duplicates, but are susceptible to errors generated during PCR and sequencing. This results in false positive counts, leading to inaccurate transcriptome quantification especially at low input and single-cell RNA amounts where the total number of molecules present is minuscule. To address this issue, we demonstrated the systematic identification of molecular species using transposable error-correcting barcodes that are exponentially expanded to tens of billions of unique labels. We experimentally showed random-mer molecular barcodes suffer from substantial and persistent errors that are difficult to resolve. To assess our method's performance, we applied it to the analysis of known reference RNA standards. By including an inline random-mer molecular barcode, we systematically characterized the presence of sequence errors in random-mer molecular barcodes. We observed that such errors are extensive and become more dominant at low input amounts. We described the first study to use transposable molecular barcodes and their use for studying random-mer molecular barcode errors. Extensive errors found in random-mer molecular barcodes may warrant the use of error correcting barcodes for transcriptome analysis as input amounts decrease.
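
The contrast between random-mer and error-correcting barcodes can be sketched in a few lines. With a known barcode set whose members differ at several positions, a read carrying a single sequencing error can be mapped back to its source barcode, whereas the same error in a random-mer creates an apparently new molecule. The code below is a generic nearest-barcode decoder and not the scheme used in the paper; the whitelist sequences and function names are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed: str, whitelist, max_distance: int = 1):
    """Return the unique whitelist barcode within `max_distance` of `observed`,
    or None when there is no match or the match is ambiguous."""
    matches = [bc for bc in whitelist
               if len(bc) == len(observed) and hamming(observed, bc) <= max_distance]
    return matches[0] if len(matches) == 1 else None


# Hypothetical whitelist; every pair of barcodes differs at all six positions.
WHITELIST = ["ACGTAC", "TGCATG", "GATCGA"]

reads = ["ACGTAC", "ACGTAA", "TGCATG"]  # second read has one substitution error
naive_count = len(set(reads))                                          # 3
corrected_count = len({correct_barcode(r, WHITELIST) for r in reads})  # 2
```

Counting distinct raw sequences reports three molecules, while decoding against the whitelist recovers the true count of two.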

View details for PubMedID 28934929
Single-Color Digital PCR Provides High-Performance Detection of Cancer Mutations from Circulating DNA. The Journal of molecular diagnostics : JMD Wood-Bouwens, C., Lau, B. T., Handy, C. M., Lee, H., Ji, H. P. 2017; 19 (5): 697-710

Abstract

We describe a single-color digital PCR assay that detects and quantifies cancer mutations directly from circulating DNA collected from the plasma of cancer patients. This approach relies on a double-stranded DNA intercalator dye and paired allele-specific DNA primer sets to determine an absolute count of both the mutation and wild-type-bearing DNA molecules present in the sample. The cell-free DNA assay uses an input of 1 ng of nonamplified DNA, approximately 300 genome equivalents, and has a molecular limit of detection of three mutation DNA genome-equivalent molecules per assay reaction. When using more genome equivalents as input, we demonstrated a sensitivity of 0.10% for detecting the BRAF V600E and KRAS G12D mutations. We developed several mutation assays specific to the cancer driver mutations of patients' tumors and detected these same mutations directly from the nonamplified, circulating cell-free DNA. This rapid and high-performance digital PCR assay can be configured to detect specific cancer mutations unique to an individual cancer, making it a potentially valuable method for patient-specific longitudinal monitoring.
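
The relationship between DNA input and achievable sensitivity in this abstract can be worked through numerically. The sketch below assumes roughly 3.3 pg of DNA per haploid human genome, a commonly used approximation that is not stated in the abstract; the function names are hypothetical and the three-molecule detection limit is the figure quoted above.

```python
PG_PER_HAPLOID_GENOME = 3.3  # assumption: approximate mass of one haploid human genome, in pg


def genome_equivalents(mass_ng: float, pg_per_genome: float = PG_PER_HAPLOID_GENOME) -> float:
    """Approximate number of haploid genome copies in `mass_ng` nanograms of DNA."""
    if mass_ng < 0 or pg_per_genome <= 0:
        raise ValueError("mass must be non-negative and pg_per_genome positive")
    return mass_ng * 1000.0 / pg_per_genome


def lowest_detectable_fraction(mass_ng: float, lod_molecules: int = 3) -> float:
    """Smallest mutant fraction that still yields `lod_molecules` mutant copies."""
    return lod_molecules / genome_equivalents(mass_ng)
```

Under this assumption 1 ng corresponds to about 303 genome equivalents, consistent with the "approximately 300" in the abstract, and three molecules out of that input is a fraction of roughly 1%. Reaching the reported 0.10% sensitivity therefore requires about ten times more input, which is why the abstract ties that figure to "more genome equivalents".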

View details for DOI 10.1016/j.jmoldx.2017.05.003

View details for PubMedID 28818432
Intestinal Enteroendocrine Lineage Cells Possess Homeostatic and Injury-Inducible Stem Cell Activity. Cell stem cell Yan, K. S., Gevaert, O., Zheng, G. X., Anchang, B., Probert, C. S., Larkin, K. A., Davies, P. S., Cheng, Z. F., Kaddis, J. S., Han, A., Roelf, K., Calderon, R. I., Cynn, E., Hu, X., Mandleywala, K., Wilhelmy, J., Grimes, S. M., Corney, D. C., Boutet, S. C., Terry, J. M., Belgrader, P., Ziraldo, S. B., Mikkelsen, T. S., Wang, F., von Furstenberg, R. J., Smith, N. R., Chandrakesan, P., May, R., Chrissy, M. A., Jain, R., Cartwright, C. A., Niland, J. C., Hong, Y. K., Carrington, J., Breault, D. T., Epstein, J., Houchen, C. W., Lynch, J. P., Martin, M. G., Plevritis, S. K., Curtis, C., Ji, H. P., Li, L., Henning, S. J., Wong, M. H., Kuo, C. J. 2017; 21 (1): 78-90.e6

Abstract

Several cell populations have been reported to possess intestinal stem cell (ISC) activity during homeostasis and injury-induced regeneration. Here, we explored inter-relationships between putative mouse ISC populations by comparative RNA-sequencing (RNA-seq). The transcriptomes of multiple cycling ISC populations closely resembled Lgr5+ISCs, the most well-defined ISC pool, but Bmi1-GFP+cells were distinct and enriched for enteroendocrine (EE) markers, including Prox1. Prox1-GFP+cells exhibited sustained clonogenic growth in vitro, and lineage-tracing of Prox1+cells revealed long-lived clones during homeostasis and after radiation-induced injury in vivo. Single-cell mRNA-seq revealed two subsets of Prox1-GFP+cells, one of which resembled mature EE cells while the other displayed low-level EE gene expression but co-expressed tuft cell markers, Lgr5 and Ascl2, reminiscent of label-retaining secretory progenitors. Our data suggest that the EE lineage, including mature EE cells, comprises a reservoir of homeostatic and injury-inducible ISCs, extending our understanding of cellular plasticity and stemness.

View details for DOI 10.1016/j.stem.2017.06.014

View details for PubMedID 28686870

View details for PubMedCentralID PMC5642297
Precision Oncology Strategy in Trastuzumab-Resistant Human Epidermal Growth Factor Receptor 2-Positive Colon Cancer: Case Report of Durable Response to Ado-Trastuzumab Emtansine. JCO precision oncology Haslem, D. S., Ji, H. P., Ford, J. M., Nadauld, L. D. 2017; 1

View details for DOI 10.1200/PO.16.00055

View details for PubMedID 32913966

View details for PubMedCentralID PMC7446358
Genomic Instability in Cancer: Teetering on the Limit of Tolerance CANCER RESEARCH Andor, N., Maley, C. C., Ji, H. P. 2017; 77 (9): 2179-2185

Abstract

Cancer genomic instability contributes to the phenomenon of intratumoral genetic heterogeneity, provides the genetic diversity required for natural selection, and enables the extensive phenotypic diversity that is frequently observed among patients. Genomic instability has previously been associated with poor prognosis. However, we have evidence that for solid tumors of epithelial origin, extreme levels of genomic instability, where more than 75% of the genome is subject to somatic copy number alterations, are associated with a potentially better prognosis compared with intermediate levels under this threshold. This has been observed in clonal subpopulations of larger size, especially when genomic instability is shared among a limited number of clones. We hypothesize that cancers with extreme levels of genomic instability may be teetering on the brink of a threshold where so much of their genome is adversely altered that cells rarely replicate successfully. Another possibility is that tumors with high levels of genomic instability are more immunogenic than other cancers with a less extensive burden of genetic aberrations. Regardless of the exact mechanism, but hinging on our ability to quantify how a tumor's burden of genetic aberrations is distributed among coexisting clones, genomic instability has important therapeutic implications. Herein, we explore the possibility that a high genomic instability could be the basis for a tumor's sensitivity to DNA-damaging therapies. We primarily focus on studies of epithelial-derived solid tumors. Cancer Res; 77(9); 2179-85. ©2017 AACR.
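
The 75% threshold in this abstract refers to the share of the genome covered by somatic copy number alterations. A minimal sketch of that calculation follows, assuming copy-number segments are given as `(start, end, copy_number)` tuples and that a copy number of 2 is the unaltered baseline; both the input format and the function names are illustrative assumptions, not the authors' method.

```python
def fraction_genome_altered(segments, baseline_copy_number: int = 2) -> float:
    """Length-weighted fraction of segments whose copy number differs from baseline."""
    total = 0
    altered = 0
    for start, end, copy_number in segments:
        length = end - start
        if length <= 0:
            raise ValueError("each segment must have end > start")
        total += length
        if copy_number != baseline_copy_number:
            altered += length
    if total == 0:
        raise ValueError("no segments supplied")
    return altered / total


def instability_category(fraction: float, extreme_threshold: float = 0.75) -> str:
    """Label a tumor relative to the 75% threshold discussed in the abstract."""
    return "extreme" if fraction > extreme_threshold else "below threshold"


# Hypothetical toy genome of 1,000 bases in three segments.
toy_segments = [(0, 100, 2), (100, 400, 3), (400, 1000, 1)]
toy_fraction = fraction_genome_altered(toy_segments)  # 900 of 1,000 bases altered
```

Here 90% of the toy genome is altered, so `instability_category(toy_fraction)` returns "extreme".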

View details for DOI 10.1158/0008-5472.CAN-16-1553

View details for Web of Science ID 000400270100001

View details for PubMedID 28432052

View details for PubMedCentralID PMC5413432
Tandem Oligonucleotide Probe Annealing and Elongation To Discriminate Viral Sequence ANALYTICAL CHEMISTRY Taskova, M., Uhd, J., Miotke, L., Kubit, M., Bell, J., Ji, H. P., Astakhova, K. 2017; 89 (8): 4363-4366

Abstract

New approaches for genomic DNA/RNA detection are in high demand in order to provide controls for existing enzymatic technologies and to create alternatives for emerging applications. In particular, there is an unmet need for rapid, reliable detection of short RNA regions, which could open up new opportunities in transcriptome analysis, virology, and other fields. Herein, we report for the first time a "click" chemistry approach to oligonucleotide probe elongation as a novel approach to specifically detect a viral sequence. We hybridized a library of short, terminally labeled probes to Ebola virus RNA followed by click assembly and analysis of the read sequence by various techniques. As we demonstrate in this paper, using our new approach, a viral RNA sequence can be detected in less than 2 h without the need for cDNA synthesis or any other enzymatic reactions and with a sensitivity of <10 pM target RNA.

View details for DOI 10.1021/acs.analchem.7b00646

View details for Web of Science ID 000399858800008

View details for PubMedID 28382823
A Targeted Resequencing Approach to Identify Actionable Somatic Copy Number Alterations with High Sensitivity Alongside SNVs and Indels from Clinical Tumor Specimens De La Vega, F. M., Mendoza, D., Bouhlai, Y., Vilborg, A., Koehler, R., Pouliot, Y., Irvine, S., Trig, L., Goodsaid, F., Ji, H. P. ELSEVIER SCIENCE INC. 2017: S48

View details for Web of Science ID 000568304400100
CRISPR-Cas9-targeted fragmentation and selective sequencing enable massively parallel microsatellite analysis NATURE COMMUNICATIONS Shin, G., Grimes, S. M., Lee, H., Lau, B. T., Xia, L. C., Ji, H. P. 2017; 8

Abstract

Microsatellites are multi-allelic and composed of short tandem repeats (STRs) with individual motifs composed of mononucleotides, dinucleotides or higher including hexamers. Next-generation sequencing approaches and other STR assays rely on a limited number of PCR amplicons, typically in the tens. Here, we demonstrate STR-Seq, a next-generation sequencing technology that analyses over 2,000 STRs in parallel, and provides the accurate genotyping of microsatellites. STR-Seq employs in vitro CRISPR-Cas9-targeted fragmentation to produce specific DNA molecules covering the complete microsatellite sequence. Amplification-free library preparation provides single molecule sequences without unique molecular barcodes. STR-selective primers enable massively parallel, targeted sequencing of large STR sets. Overall, STR-Seq has higher throughput, improved accuracy and provides a greater number of informative haplotypes compared with other microsatellite analysis approaches. With these new features, STR-Seq can identify a 0.1% minor genome fraction in a DNA mixture composed of different, unrelated samples.

View details for DOI 10.1038/ncomms14291

View details for PubMedID 28169275
Linked read sequencing resolves complex genomic rearrangements in gastric cancer metastases. Genome medicine Greer, S. U., Nadauld, L. D., Lau, B. T., Chen, J., Wood-Bouwens, C., Ford, J. M., Kuo, C. J., Ji, H. P. 2017; 9 (1): 57

Abstract

Genome rearrangements are critical oncogenic driver events in many malignancies. However, the identification and resolution of the structure of cancer genomic rearrangements remain challenging even with whole genome sequencing. To identify oncogenic genomic rearrangements and resolve their structure, we analyzed linked read sequencing. This approach relies on a microfluidic droplet technology to produce libraries derived from single, high molecular weight DNA molecules, 50 kb in size or greater. After sequencing, the barcoded sequence reads provide long range genomic information, identify individual high molecular weight DNA molecules, determine the haplotype context of genetic variants that occur across contiguous megabase-length segments of the genome and delineate the structure of complex rearrangements. We applied linked read sequencing of whole genomes to the analysis of a set of synchronous metastatic diffuse gastric cancers that occurred in the same individual. When comparing metastatic sites, our analysis implicated a complex somatic rearrangement that was present in the metastatic tumor. The oncogenic event associated with the identified complex rearrangement resulted in an amplification of the known cancer driver gene FGFR2. With further investigation using these linked read data, the FGFR2 copy number alteration was determined to be a deletion-inversion motif that underwent tandem duplication, with unique breakpoints in each metastasis. Using a three-dimensional organoid tissue model, we functionally validated the metastatic potential of an FGFR2 amplification in gastric cancer. Our study demonstrates that linked read sequencing is useful in characterizing oncogenic rearrangements in cancer metastasis.

View details for PubMedID 28629429
Precision Oncology Strategy in Trastuzumab-Resistant Human Epidermal Growth Factor Receptor 2-Positive Colon Cancer: Case Report of Durable Response to Ado-Trastuzumab Emtansine JCO PRECISION ONCOLOGY Haslem, D. S., Ji, H. P., Ford, J. M., Nadauld, L. D. 2017; 1

View details for DOI 10.1200/PO.16.00055

View details for Web of Science ID 000462058200017
Massively Parallel Single Cell RNA-Seq of Primary Lymphomas Reveals Distinct Cellular Lineages and Diverse, Intratumoral Transcriptional States Andor, N., Simonds, E., Chen, J., Grimes, S., Wood, C., Czerwinski, D. K., Handy, C., Levy, R., Ji, H. P. AMER SOC HEMATOLOGY. 2016

View details for Web of Science ID 000394446803089
A genome-wide approach for detecting novel insertion-deletion variants of mid-range size. Nucleic acids research Xia, L. C., Sakshuwong, S., Hopmans, E. S., Bell, J. M., Grimes, S. M., Siegmund, D. O., Ji, H. P., Zhang, N. R. 2016; 44 (15)

Abstract

We present SWAN, a statistical framework for robust detection of genomic structural variants in next-generation sequencing data and an analysis of mid-range size insertions and deletions (<10 Kb) for whole genome analysis and DNA mixtures. To identify these mid-range size events, SWAN collectively uses information from read-pair, read-depth and one end mapped reads through statistical likelihoods based on Poisson field models. SWAN also uses soft-clip/split read remapping to supplement the likelihood analysis and determine variant boundaries. The accuracy of SWAN is demonstrated by in silico spike-ins and by identification of known variants in the NA12878 genome. We used SWAN to identify a set of novel mid-range insertions/deletions that were confirmed by targeted deep re-sequencing. An R package implementation of SWAN is open source and freely available.
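
To give a flavor of likelihood-based detection from read depth, the sketch below compares two Poisson models for the read count in a single window: one with the expected depth, and one with half the expected depth as would be seen under a heterozygous deletion. It is a deliberately simplified, single-signal illustration and not SWAN's Poisson field model, which also combines read-pair and one-end-mapped evidence; the function names and the `copy_ratio` parameter are hypothetical.

```python
import math


def poisson_log_likelihood(k: int, lam: float) -> float:
    """Log-probability of observing k events under a Poisson(lam) model."""
    if lam <= 0 or k < 0:
        raise ValueError("lam must be positive and k non-negative")
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def deletion_log_likelihood_ratio(observed_reads: int,
                                  expected_reads: float,
                                  copy_ratio: float = 0.5) -> float:
    """Log-likelihood ratio of 'deletion' (depth scaled by copy_ratio)
    versus 'no variant' for one window. Positive values favor a deletion."""
    return (poisson_log_likelihood(observed_reads, expected_reads * copy_ratio)
            - poisson_log_likelihood(observed_reads, expected_reads))
```

With 100 reads expected, a window containing 50 reads yields a positive ratio (evidence for a deletion), while a window containing 100 reads yields a negative one.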

View details for DOI 10.1093/nar/gkw481

View details for PubMedID 27325742

View details for PubMedCentralID PMC5009736
Haplotyping germline and cancer genomes with high-throughput linked-read sequencing. Nature biotechnology Zheng, G. X., Lau, B. T., Schnall-Levin, M., Jarosz, M., Bell, J. M., Hindson, C. M., Kyriazopoulou-Panagiotopoulou, S., Masquelier, D. A., Merrill, L., Terry, J. M., Mudivarti, P. A., Wyatt, P. W., Bharadwaj, R., Makarewicz, A. J., Li, Y., Belgrader, P., Price, A. D., Lowe, A. J., Marks, P., Vurens, G. M., Hardenbol, P., Montesclaros, L., Luo, M., Greenfield, L., Wong, A., Birch, D. E., Short, S. W., Bjornson, K. P., Patel, P., Hopmans, E. S., Wood, C., Kaur, S., Lockwood, G. K., Stafford, D., Delaney, J. P., Wu, I., Ordonez, H. S., Grimes, S. M., Greer, S., Lee, J. Y., Belhocine, K., Giorda, K. M., Heaton, W. H., McDermott, G. P., Bent, Z. W., Meschi, F., Kondov, N. O., Wilson, R., Bernate, J. A., Gauby, S., Kindwall, A., Bermejo, C., Fehr, A. N., Chan, A., Saxonov, S., Ness, K. D., Hindson, B. J., Ji, H. P. 2016; 34 (3): 303-311

Abstract

Haplotyping of human chromosomes is a prerequisite for cataloguing the full repertoire of genetic variation. We present a microfluidics-based, linked-read sequencing technology that can phase and haplotype germline and cancer genomes using nanograms of input DNA. This high-throughput platform prepares barcoded libraries for short-read sequencing and computationally reconstructs long-range haplotype and structural variant information. We generate haplotype blocks in a nuclear trio that are concordant with expected inheritance patterns and phase a set of structural variants. We also resolve the structure of the EML4-ALK gene fusion in the NCI-H2228 cancer cell line using phased exome sequencing. Finally, we assign genetic aberrations to specific megabase-scale haplotypes generated from whole-genome sequencing of a primary colorectal adenocarcinoma. This approach resolves haplotype information using up to 100 times less genomic DNA than some methods and enables the accurate detection of structural variants.

View details for DOI 10.1038/nbt.3432

View details for PubMedID 26829319

View details for PubMedCentralID PMC4786454
Pan-cancer analysis of the extent and consequences of intratumor heterogeneity. Nature medicine Andor, N., Graham, T. A., Jansen, M., Xia, L. C., Aktipis, C. A., Petritsch, C., Ji, H. P., Maley, C. C. 2016; 22 (1): 105-113

Abstract

Intratumor heterogeneity (ITH) drives neoplastic progression and therapeutic resistance. We used the bioinformatics tools 'expanding ploidy and allele frequency on nested subpopulations' (EXPANDS) and PyClone to detect clones that are present at a ≥10% frequency in 1,165 exome sequences from tumors in The Cancer Genome Atlas. 86% of tumors across 12 cancer types had at least two clones. ITH in the morphology of nuclei was associated with genetic ITH (Spearman's correlation coefficient, ρ = 0.24-0.41; P < 0.001). Mutation of a driver gene that typically appears in smaller clones was a survival risk factor (hazard ratio (HR) = 2.15, 95% confidence interval (CI): 1.71-2.69). The risk of mortality also increased when >2 clones coexisted in the same tumor sample (HR = 1.49, 95% CI: 1.20-1.87). In two independent data sets, copy-number alterations affecting either <25% or >75% of a tumor's genome predicted reduced risk (HR = 0.15, 95% CI: 0.08-0.29). Mortality risk also declined when >4 clones coexisted in the sample, suggesting a trade-off between the costs and benefits of genomic instability. ITH and genomic instability thus have the potential to be useful measures that can universally be applied to all cancers.

View details for DOI 10.1038/nm.3984

View details for PubMedID 26618723
Pan-cancer analysis of the etiology and consequences of intra-tumor heterogeneity Andor, N., Graham, T. A., Petritsch, C., Ji, H. P., Maley, C. C. AMER ASSOC CANCER RESEARCH. 2015

View details for DOI 10.1158/1538-7445.TRANSCAGEN-PR03

View details for Web of Science ID 000370972600117
Pan-cancer analysis of the etiology and consequences of intratumor heterogeneity Andor, N., Graham, T. A., Petritsch, C., Ji, H. P., Maley, C. C. AMER ASSOC CANCER RESEARCH. 2015

View details for DOI 10.1158/1538-7445.TRANSCAGEN-A1-54

View details for Web of Science ID 000370972600044
The Cancer Genome Atlas Clinical Explorer: a web and mobile interface for identifying clinical-genomic driver associations GENOME MEDICINE Lee, H., Palm, J., Grimes, S. M., Ji, H. P. 2015; 7

Abstract

The Cancer Genome Atlas (TCGA) project has generated genomic data sets covering over 20 malignancies. These data provide valuable insights into the underlying genetic and genomic basis of cancer. However, exploring the relationship among TCGA genomic results and clinical phenotype remains a challenge, particularly for individuals lacking formal bioinformatics training. Overcoming this hurdle is an important step toward the wider clinical translation of cancer genomic/proteomic data and implementation of precision cancer medicine. Several websites such as the cBio portal or University of California Santa Cruz genome browser make TCGA data accessible but lack interactive features for querying clinically relevant phenotypic associations with cancer drivers. To enable exploration of the clinical-genomic driver associations from TCGA data, we developed the Cancer Genome Atlas Clinical Explorer. The Cancer Genome Atlas Clinical Explorer interface provides a straightforward platform to query TCGA data using one of the following methods: (1) searching for clinically relevant genes, micro RNAs, and proteins by name, cancer types, or clinical parameters; (2) searching for genomic/proteomic profile changes by clinical parameters in a cancer type; or (3) testing two-hit hypotheses. SQL queries run in the background and results are displayed on our portal in an easy-to-navigate interface according to the user's input. To derive these associations, we relied on elastic-net estimates of optimal multiple linear regularized regression and clinical parameters in the space of multiple genomic/proteomic features provided by TCGA data. Moreover, we identified and ranked gene/micro RNA/protein predictors of each clinical parameter for each cancer. The robustness of the results was estimated by bootstrapping.
Overall, we identify associations of potential clinical relevance among genes/micro RNAs/proteins using our statistical analysis from 25 cancer types and 18 clinical parameters that include clinical stage or smoking history. The Cancer Genome Atlas Clinical Explorer enables the cancer research community and others to explore clinically relevant associations inferred from TCGA data. With its accessible web and mobile interface, users can examine queries and test hypotheses regarding genomic/proteomic alterations across a broad spectrum of malignancies.

View details for DOI 10.1186/s13073-015-0226-3

View details for Web of Science ID 000363619100002

View details for PubMedID 26507825

View details for PubMedCentralID PMC4624593
Enzyme-Free Detection of Mutations in Cancer DNA Using Synthetic Oligonucleotide Probes and Fluorescence Microscopy PLOS ONE Miotke, L., Maity, A., Ji, H., Brewer, J., Astakhova, K. 2015; 10 (8)

Abstract

Rapid reliable diagnostics of DNA mutations are highly desirable in research and clinical assays. Current development in this field goes simultaneously in two directions: 1) high-throughput methods, and 2) portable assays. Non-enzymatic approaches are attractive for both types of methods since they would allow rapid and relatively inexpensive detection of nucleic acids. Modern fluorescence microscopy is having a huge impact on detection of biomolecules at previously unachievable resolution. However, no straightforward methods to detect DNA in a non-enzymatic way using fluorescence microscopy and nucleic acid analogues have been proposed so far. Here we report a novel enzyme-free approach to efficiently detect cancer mutations. This assay includes gene-specific target enrichment followed by annealing to oligonucleotides containing locked nucleic acids (LNAs) and finally, detection by fluorescence microscopy. The LNA-containing probes display high binding affinity and specificity to DNA containing mutations, which allows for the detection of mutation abundance with an intercalating EvaGreen dye. We used a second probe, which increases the overall number of base pairs in order to produce a higher fluorescence signal by incorporating more dye molecules. Indeed we show here that using EvaGreen dye and LNA probes, genomic DNA containing BRAF V600E mutation could be detected by fluorescence microscopy at low femtomolar concentrations. Notably, this was at least 1000-fold above the potential detection limit. Overall, the novel assay we describe could become a new approach to rapid, reliable and enzyme-free diagnostics of cancer or other associated DNA targets. Importantly, stoichiometry of wild type and mutant targets is conserved in our assay, which allows for an accurate estimation of mutant abundance when the detection limit requirement is met.
Using fluorescence microscopy, this approach presents the opportunity to detect DNA at single-molecule resolution and directly in the biological sample of choice.

View details for DOI 10.1371/journal.pone.0136720

View details for Web of Science ID 000360144000090

View details for PubMedCentralID PMC4552304
A new multiple feature approach for rapid and highly accurate somatic structural variation discovery from whole cancer genome sequencing Xia, L. C., Bell, J., Chen, J., Zhang, N. R., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2015

View details for DOI 10.1158/1538-7445.AM2015-4871

View details for Web of Science ID 000371597105056
Identification of novel tumor suppressor candidates and characterizing their potential driver role in familial cholangiocarcinoma Greer, S., Nadauld, L. D., Lau, B., Miotke, L., Hopmans, E., Wood, C. M., Bell, J. M., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2015

View details for DOI 10.1158/1538-7445.AM2015-3901

View details for Web of Science ID 000371597102425
Megabase-scale phased haplotypes of genetic aberrations from whole cancer genome sequencing of primary colorectal tumors Lau, B., Bell, J. M., Schnall-Levin, M., Jarosz, M., Hopmans, E., Wood, C. M., Zheng, G. X., Giorda, K., Ji, H. P. AMER ASSOC CANCER RESEARCH. 2015

View details for DOI 10.1158/1538-7445.AM2015-4882

View details for Web of Science ID 000371597105067
Clonal structure analysis of cancer genomes at single molecule resolution Lau, B., Ji, H. AMER ASSOC CANCER RESEARCH. 2015

View details for DOI 10.1158/1538-7445.AM2015-4889

View details for Web of Science ID 000371597105074
Pan-cancer analysis of the causes and consequences of Intra-tumor heterogeneity Andor, N., Graham, T. A., Aktipis, A. C., Petritsch, C., Ji, H. P., Maley, C. C. AMER ASSOC CANCER RESEARCH. 2015

View details for DOI 10.1158/1538-7445.AM2015-LB-163

View details for Web of Science ID 000371597100263
Allele-specific copy number profiling by next-generation DNA sequencing. Nucleic acids research Chen, H., Bell, J. M., Zavala, N. A., Ji, H. P., Zhang, N. R. 2015; 43 (4)

Abstract

The progression and clonal development of tumors often involve amplifications and deletions of genomic DNA. Estimation of allele-specific copy number, which quantifies the number of copies of each allele at each variant locus rather than the total number of chromosome copies, is an important step in the characterization of tumor genomes and the inference of their clonal history. We describe a new method, falcon, for finding somatic allele-specific copy number changes by next generation sequencing of tumors with matched normals. falcon is based on a change-point model on a bivariate mixed Binomial process, which explicitly models the copy numbers of the two chromosome haplotypes and corrects for local allele-specific coverage biases. By using the Binomial distribution rather than a normal approximation, falcon more effectively pools evidence from sites with low coverage. A modified Bayesian information criterion is used to guide model selection for determining the number of copy number events. falcon is evaluated on in silico spike-in data and applied to the analysis of a pre-malignant colon tumor sample and late-stage colorectal adenocarcinoma from the same individual. The allele-specific copy number estimates obtained by falcon allow us to draw detailed conclusions regarding the clonal history of the individual's colon cancer.
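The idea of a Binomial change-point model selected by an information criterion can be sketched in a heavily simplified form. The code below is not falcon: it assumes a single allele-count series rather than the bivariate mixed process, tests for at most one change point, and uses the ordinary BIC instead of the modified criterion. All function names and the toy counts are hypothetical.

```python
import math

def segment_loglik(alt, depth):
    """Binomial log-likelihood of a segment at its maximum-likelihood allele fraction.

    The binomial coefficient is omitted because it is the same for every
    segmentation and cancels when models are compared.
    """
    p = sum(alt) / sum(depth)
    p = min(max(p, 1e-12), 1 - 1e-12)  # keep log() finite at the boundaries
    return sum(a * math.log(p) + (d - a) * math.log(1 - p) for a, d in zip(alt, depth))

def best_change_point(alt, depth):
    """Return the index of a single change point if BIC supports it, else None."""
    n = len(alt)
    null_ll = segment_loglik(alt, depth)
    best_k, best_ll = None, -math.inf
    for k in range(1, n):
        ll = segment_loglik(alt[:k], depth[:k]) + segment_loglik(alt[k:], depth[k:])
        if ll > best_ll:
            best_k, best_ll = k, ll
    bic_null = -2 * null_ll + 1 * math.log(n)  # one allele fraction
    bic_alt = -2 * best_ll + 3 * math.log(n)   # two fractions plus the location
    return best_k if bic_alt < bic_null else None

# Toy data: ten balanced sites followed by ten sites with allelic imbalance.
alt = [15] * 10 + [25] * 10
depth = [30] * 20
change = best_change_point(alt, depth)
```

Working with raw Binomial counts means that low-coverage sites contribute in proportion to their depth, which is the property the abstract contrasts with a normal approximation.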

View details for DOI 10.1093/nar/gku1252

View details for PubMedID 25477383
Enzyme-Free Detection of Mutations in Cancer DNA Using Synthetic Oligonucleotide Probes and Fluorescence Microscopy. PloS one Miotke, L., Maity, A., Ji, H., Brewer, J., Astakhova, K. 2015; 10 (8)

View details for DOI 10.1371/journal.pone.0136720

View details for PubMedID 26312489
Emergence of Hemagglutinin Mutations During the Course of Influenza Infection. Scientific reports Cushing, A., Kamali, A., Winters, M., Hopmans, E. S., Bell, J. M., Grimes, S. M., Xia, L. C., Zhang, N. R., Moss, R. B., Holodniy, M., Ji, H. P. 2015; 5: 16178-?

Abstract

Influenza remains a significant cause of disease mortality. The ongoing threat of influenza infection is partly attributable to the emergence of new mutations in the influenza genome. Among the influenza viral gene products, the hemagglutinin (HA) glycoprotein plays a critical role in influenza pathogenesis, is the target for vaccines and accumulates new mutations that may alter the efficacy of immunization. To study the emergence of HA mutations during the course of infection, we employed a deep-targeted sequencing method. We used samples from 17 patients with active H1N1 or H3N2 influenza infections. These patients were not treated with antivirals. In addition, we had samples from five patients who were analyzed longitudinally. Thus, we determined the quantitative changes in the fractional representation of HA mutations during the course of infection. Across individuals in the study, a series of novel HA mutations that directly altered the HA coding sequence were identified. Serial viral sampling revealed HA mutations that either were stable, expanded or were reduced in representation during the course of the infection. Overall, we demonstrated the emergence of unique mutations specific to an infected individual and temporal genetic variation during infection.

View details for DOI 10.1038/srep16178

View details for PubMedID 26538451
The Cancer Genome Atlas Clinical Explorer: a web and mobile interface for identifying clinical-genomic driver associations. Genome medicine Lee, H., Palm, J., Grimes, S. M., Ji, H. P. 2015; 7 (1): 112-?

View details for DOI 10.1186/s13073-015-0226-3

View details for PubMedID 26507825
Single-Color, Multiplexed, Droplet Digital PCR Analysis of the Clinical Significance of Hemizygous Loss of WRN Gene in Colorectal Cancer Lee, H., Lau, B., Zavala, N. A., Ji, H. P. ELSEVIER SCIENCE INC. 2014: 768

View details for Web of Science ID 000343794200314
A robust and rapid targeted sequencing technology for iterative multiple genomic features in cancer Lau, B., Cushing, A., Ji, H. AMER ASSOC CANCER RESEARCH. 2014

View details for DOI 10.1158/1538-7445.AM2014-3566

View details for Web of Science ID 000349910201063
Highly sensitive and specific digital quantification of cancer genetic aberrations Miotke, L. K., Lau, B., Rumma, R., Ji, H. AMER ASSOC CANCER RESEARCH. 2014

View details for DOI 10.1158/1538-7445.AM2014-1507

View details for Web of Science ID 000349906901236
Oncogenic transformation of diverse gastrointestinal tissues in primary organoid culture NATURE MEDICINE Li, X., Nadauld, L., Ootani, A., Corney, D. C., Pai, R. K., Gevaert, O., Cantrell, M. A., Rack, P. G., Neal, J. T., Chan, C. W., Yeung, T., Gong, X., Yuan, J., Wilhelmy, J., Robine, S., Attardi, L. D., Plevritis, S. K., Hung, K. E., Chen, C., Ji, H. P., Kuo, C. J. 2014; 20 (7): 769-777

Abstract

The application of primary organoid cultures containing epithelial and mesenchymal elements to cancer modeling holds promise for combining the accurate multilineage differentiation and physiology of in vivo systems with the facile in vitro manipulation of transformed cell lines. Here we used a single air-liquid interface culture method without modification to engineer oncogenic mutations into primary epithelial and mesenchymal organoids from mouse colon, stomach and pancreas. Pancreatic and gastric organoids exhibited dysplasia as a result of expression of Kras carrying the G12D mutation (Kras(G12D)), p53 loss or both and readily generated adenocarcinoma after in vivo transplantation. In contrast, primary colon organoids required combinatorial Apc, p53, Kras(G12D) and Smad4 mutations for progressive transformation to invasive adenocarcinoma-like histology in vitro and tumorigenicity in vivo, recapitulating multi-hit models of colorectal cancer (CRC), as compared to the more promiscuous transformation of small intestinal organoids. Colon organoid culture functionally validated the microRNA miR-483 as a dominant driver oncogene at the IGF2 (insulin-like growth factor-2) 11p15.5 CRC amplicon, inducing dysplasia in vitro and tumorigenicity in vivo. These studies demonstrate the general utility of a highly tractable primary organoid system for cancer modeling and driver oncogene validation in diverse gastrointestinal tissues.

View details for DOI 10.1038/nm.3585

View details for Web of Science ID 000338689500021
A programmable method for massively parallel targeted sequencing. Nucleic acids research Hopmans, E. S., Natsoulis, G., Bell, J. M., Grimes, S. M., Sieh, W., Ji, H. P. 2014; 42 (10)

Abstract

We have developed a targeted resequencing approach referred to as Oligonucleotide-Selective Sequencing. In this study, we report a series of significant improvements and novel applications of this method whereby the surface of a sequencing flow cell is modified in situ to capture specific genomic regions of interest from a sample and then sequenced. These improvements include a fully automated targeted sequencing platform through the use of a standard Illumina cBot fluidics station. Targeting optimization increased the yield of total on-target sequencing data 2-fold compared to the previous iteration, while simultaneously increasing the percentage of reads that could be mapped to the human genome. The described assays cover up to 1421 genes with a total coverage of 5.5 Megabases (Mb). We demonstrate a 10-fold abundance uniformity of greater than 90% in 1 log distance from the median and a targeting rate of up to 95%. We also sequenced continuous genomic loci up to 1.5 Mb while simultaneously genotyping SNPs and genes. Variants with low minor allele fraction were sensitively detected at levels of 5%. Finally, we determined the exact breakpoint sequence of cancer rearrangements. Overall, this approach has high performance for selective sequencing of genome targets, configuration flexibility and variant calling accuracy.

View details for DOI 10.1093/nar/gku282

View details for PubMedID 24782526
Oncogenic transformation of diverse gastrointestinal tissues in primary organoid culture. Nature medicine Li, X., Nadauld, L., Ootani, A., Corney, D. C., Pai, R. K., Gevaert, O., Cantrell, M. A., Rack, P. G., Neal, J. T., Chan, C. W., Yeung, T., Gong, X., Yuan, J., Wilhelmy, J., Robine, S., Attardi, L. D., Plevritis, S. K., Hung, K. E., Chen, C. Z., Ji, H. P., Kuo, C. J. 2014

View details for DOI 10.1038/nm.3585

View details for PubMedID 24859528
High sensitivity detection and quantitation of DNA copy number and single nucleotide variants with single color droplet digital PCR. Analytical chemistry Miotke, L., Lau, B. T., Rumma, R. T., Ji, H. P. 2014; 86 (5): 2618-2624

Abstract

In this study, we present a highly customizable method for quantifying copy number and point mutations utilizing a single-color, droplet digital PCR platform. Droplet digital polymerase chain reaction (ddPCR) is rapidly replacing real-time quantitative PCR (qRT-PCR) as an efficient method of independent DNA quantification. Compared to quantitative PCR, ddPCR eliminates the need for traditional standards; instead, it measures target and reference DNA within the same well. The applications for ddPCR are widespread including targeted quantitation of genetic aberrations, which is commonly achieved with a two-color fluorescent oligonucleotide probe (TaqMan) design. However, the overall cost and need for optimization can be greatly reduced with an alternative method of distinguishing between target and reference products using the nonspecific DNA binding properties of EvaGreen (EG) dye. By manipulating the length of the target and reference amplicons, we can distinguish between their fluorescent signals and quantify each independently. We demonstrate the effectiveness of this method by examining copy number in the proto-oncogene FLT3 and the common V600E point mutation in BRAF. Using a series of well-characterized control samples and cancer cell lines, we confirmed the accuracy of our method in quantifying mutation percentage and integer value copy number changes. As another novel feature, our assay was able to detect a mutation comprising less than 1% of an otherwise wild-type sample, as well as copy number changes from cancers even in the context of significant dilution with normal DNA. This flexible and cost-effective method of independent DNA quantification proves to be a robust alternative to the commercialized TaqMan assay.
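Digital PCR quantification conventionally rests on a Poisson correction of the positive-droplet fraction. The sketch below shows that arithmetic for copy number and mutant fraction. It is not the authors' analysis code: the function names and droplet counts are hypothetical, and it assumes target and reference droplets have already been separated (for example by amplitude) and that the reference locus is present at two copies.

```python
import math

def copies_per_droplet(positive: int, total: int) -> float:
    """Mean template copies per droplet from the fraction of positive droplets.

    Poisson correction: lambda = -ln(1 - positive / total).
    """
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)

def copy_number(target_pos: int, ref_pos: int, total: int, ref_copies: int = 2) -> float:
    """Target copy number relative to a reference assumed to have `ref_copies` copies."""
    ref = copies_per_droplet(ref_pos, total)
    if ref == 0.0:
        raise ValueError("no reference-positive droplets")
    return ref_copies * copies_per_droplet(target_pos, total) / ref

def mutant_fraction(mut_pos: int, wt_pos: int, total: int) -> float:
    """Fraction of mutant templates among mutant plus wild-type templates."""
    m = copies_per_droplet(mut_pos, total)
    w = copies_per_droplet(wt_pos, total)
    return m / (m + w)

# Hypothetical well with 15,000 accepted droplets.
cn = copy_number(target_pos=4200, ref_pos=2300, total=15000)
frac = mutant_fraction(mut_pos=40, wt_pos=5200, total=15000)
```

Droplet volume does not appear because it cancels in both ratios; it would be needed only to report absolute concentrations.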

View details for DOI 10.1021/ac403843j

View details for PubMedID 24483992
A phase II study of capecitabine, carboplatin, and bevacizumab for metastatic or unresectable gastroesophageal junction and gastric adenocarcinoma. Kunz, P. L., Nandoskar, P., Koontz, M., Ji, H., Ford, J. M., Balise, R. R., Kamaya, A., Rubin, D., Fisher, G. A. AMER SOC CLINICAL ONCOLOGY. 2014

View details for DOI 10.1200/jco.2014.32.3_suppl.115

View details for Web of Science ID 000333682100120
Metastatic tumor evolution and organoid modeling implicate TGFBR2 as a cancer driver in diffuse gastric cancer GENOME BIOLOGY Nadauld, L. D., Garcia, S., Natsoulis, G., Bell, J. M., Miotke, L., Hopmans, E. S., Xu, H., Pai, R. K., Palm, C., Regan, J. F., Chen, H., Flaherty, P., Ootani, A., Zhang, N. R., Ford, J. M., Kuo, C. J., Ji, H. P. 2014; 15 (8)

Abstract

Gastric cancer is the second-leading cause of global cancer deaths, with metastatic disease representing the primary cause of mortality. To identify candidate drivers involved in oncogenesis and tumor evolution, we conduct an extensive genome sequencing analysis of metastatic progression in a diffuse gastric cancer. This involves a comparison between a primary tumor from a hereditary diffuse gastric cancer syndrome proband and its recurrence as an ovarian metastasis. Both the primary tumor and ovarian metastasis have common biallelic loss-of-function of both the CDH1 and TP53 tumor suppressors, indicating a common genetic origin. While the primary tumor exhibits amplification of the Fibroblast growth factor receptor 2 (FGFR2) gene, the metastasis notably lacks FGFR2 amplification but rather possesses unique biallelic alterations of Transforming growth factor-beta receptor 2 (TGFBR2), indicating the divergent in vivo evolution of a TGFBR2-mutant metastatic clonal population in this patient. As TGFBR2 mutations have not previously been functionally validated in gastric cancer, we modeled the metastatic potential of TGFBR2 loss in a murine three-dimensional primary gastric organoid culture. The Tgfbr2 shRNA knockdown within Cdh1-/-; Tp53-/- organoids generates invasion in vitro and robust metastatic tumorigenicity in vivo, confirming Tgfbr2 metastasis suppressor activity. We document the metastatic differentiation and genetic heterogeneity of diffuse gastric cancer and reveal the potential metastatic role of TGFBR2 loss-of-function. In support of this study, we apply a murine primary organoid culture method capable of recapitulating in vivo metastatic gastric cancer. Overall, we describe an integrated approach to identify and functionally validate putative cancer drivers involved in metastasis.

View details for DOI 10.1186/s13059-014-0428-9

View details for Web of Science ID 000346604100009

View details for PubMedID 25315765

View details for PubMedCentralID PMC4145231
MendeLIMS: a web-based laboratory information management system for clinical genome sequencing. BMC bioinformatics Grimes, S. M., Ji, H. P. 2014; 15 (1): 290-?

View details for DOI 10.1186/1471-2105-15-290

View details for PubMedID 25159034
Identification of Insertion Deletion Mutations from Deep Targeted Resequencing. Journal of data mining in genomics & proteomics Natsoulis, G., Zhang, N., Welch, K., Bell, J., Ji, H. P. 2013; 4 (3)

Abstract

Taking advantage of the deep targeted sequencing capabilities of next generation sequencers, we have developed a novel two-step insertion deletion (indel) detection algorithm (IDA) that can determine indels from single read sequences with high computational efficiency and sensitivity when indels are present at a low fraction relative to wild type reference sequence. First, it identifies candidate indel positions utilizing specific sequence alignment artifacts produced by rapid alignment programs. Second, it confirms the location of the candidate indel by using the Smith-Waterman (SW) algorithm on a restricted subset of sequence reads. We demonstrate that IDA is applicable to indels of varying sizes from deep targeted sequencing data at low fractions where the indel is diluted by wild type sequence. Our algorithm is useful in detecting indel variants present at variable allelic frequencies such as may occur in heterozygotes and mixed normal-tumor tissue.
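The confirmation step relies on Smith-Waterman local alignment, which can place a gap exactly where a read and the reference diverge. A minimal sketch of the algorithm follows. It is not the IDA implementation: the scoring values (match +2, mismatch -1, linear gap -2) and the toy sequences are assumptions chosen for illustration.

```python
def smith_waterman(ref: str, read: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Local alignment with a linear gap penalty.

    Returns (best score, aligned reference, aligned read), with '-' marking gaps.
    """
    rows, cols = len(ref) + 1, len(read) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if ref[i - 1] == read[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best-scoring cell until the score drops to zero.
    i, j = best_pos
    out_ref, out_read = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if ref[i - 1] == read[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_ref.append(ref[i - 1]); out_read.append(read[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_ref.append(ref[i - 1]); out_read.append("-"); i -= 1  # deletion in read
        else:
            out_ref.append("-"); out_read.append(read[j - 1]); j -= 1  # insertion in read
    return best, "".join(reversed(out_ref)), "".join(reversed(out_read))

# Toy read carrying a 2-base deletion relative to the reference.
reference = "ACGTACGTACTTGGATCCGGAT"
read = "ACGTACGTACGGATCCGGAT"
score, aligned_ref, aligned_read = smith_waterman(reference, read)
```

Because the matrix costs time proportional to the product of the two lengths, restricting it to reads that overlap a candidate position, as the abstract describes, keeps the exact step affordable.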

View details for PubMedID 24511426
RVD: a command-line program for ultrasensitive rare single nucleotide variant detection using targeted next-generation DNA resequencing. BMC research notes Cushing, A., Flaherty, P., Hopmans, E., Bell, J. M., Ji, H. P. 2013; 6: 206-?

Abstract

Rare single nucleotide variants play an important role in genetic diversity and heterogeneity of specific human disease. For example, an individual clinical sample can harbor rare mutations at minor frequencies. Genetic diversity within an individual clinical sample is oftentimes reflected in rare mutations. Therefore, detecting rare variants prior to treatment may prove to be a useful predictor for therapeutic response. Current rare variant detection algorithms using next generation DNA sequencing are limited by inherent sequencing error rate and platform availability. Here we describe an optimized implementation of a rare variant detection algorithm called RVD for use in targeted gene resequencing. RVD is available both as a command-line program and for use in MATLAB and estimates context-specific error using a beta-binomial model to call variants with minor allele frequency (MAF) as low as 0.1%. We show that RVD accepts standard BAM formatted sequence files. We tested RVD analysis on multiple Illumina sequencing platforms, among the most widely used DNA sequencing platforms. RVD meets a growing need for highly sensitive and specific tools for variant detection. To demonstrate the usefulness of RVD, we carried out a thorough analysis of the software's performance on synthetic and clinical virus samples sequenced on both an Illumina GAIIx and a MiSeq. We expect RVD can improve understanding of the genetics and treatment of common viral diseases including influenza. RVD is available at the following URL: http://dna-discovery.stanford.edu/software/rvd/.
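A beta-binomial error model can be illustrated by asking how surprising an observed count of non-reference reads is when the per-position error rate itself varies. The sketch below is not RVD and does not reproduce its estimation procedure: the function names, the Beta(1, 999) error prior and the read counts are hypothetical.

```python
import math

def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_pmf(k: int, n: int, alpha: float, beta: float) -> float:
    """P(X = k) for X ~ BetaBinomial(n, alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))

def tail_prob(k: int, n: int, alpha: float, beta: float) -> float:
    """P(X >= k): chance of at least k non-reference reads from error alone."""
    return sum(betabinom_pmf(x, n, alpha, beta) for x in range(k, n + 1))

# Hypothetical position: 10,000 reads, error rate drawn from Beta(1, 999)
# (mean 0.1%). More non-reference reads give a smaller tail probability.
p_10 = tail_prob(10, 10000, 1.0, 999.0)
p_50 = tail_prob(50, 10000, 1.0, 999.0)
```

The extra variance of the beta-binomial relative to a plain binomial makes the test more conservative at positions where the error rate is uncertain.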

View details for DOI 10.1186/1756-0500-6-206

View details for PubMedID 23701658

View details for PubMedCentralID PMC3695852
Systematic genomic identification of colorectal cancer genes delineating advanced from early clinical stage and metastasis. BMC medical genomics Lee, H., Flaherty, P., Ji, H. P. 2013; 6: 54-?

Abstract

Colorectal cancer is the third leading cause of cancer deaths in the United States. The initial assessment of colorectal cancer involves clinical staging that takes into account the extent of primary tumor invasion, determining the number of lymph nodes with metastatic cancer and the identification of metastatic sites in other organs. Advanced clinical stage indicates metastatic cancer, either in regional lymph nodes or in distant organs. While the genomic and genetic basis of colorectal cancer has been elucidated to some degree, less is known about the identity of specific cancer genes that are associated with advanced clinical stage and metastasis. We compiled multiple genomic data types (mutations, copy number alterations, gene expression and methylation status) as well as clinical meta-data from The Cancer Genome Atlas (TCGA). We used an elastic-net regularized regression method on the combined genomic data to identify genetic aberrations and their associated cancer genes that are indicators of clinical stage. We ranked candidate genes by their regression coefficient and level of support from multiple assay modalities. A fit of the elastic-net regularized regression to 197 samples and integrated analysis of four genomic platforms identified the set of top gene predictors of advanced clinical stage, including: WRN, SYK, DDX5 and ADRA2C. These genetic features were identified robustly in bootstrap resampling analysis. We conducted an analysis integrating multiple genomic features including mutations, copy number alterations, gene expression and methylation. This integrated approach in which one considers all of these genomic features performs better than any individual genomic assay. We identified multiple genes that robustly delineate advanced clinical stage, suggesting their possible role in colorectal cancer metastatic progression.

View details for DOI 10.1186/1755-8794-6-54

View details for PubMedID 24308539
DETECTING MUTATIONS IN MIXED SAMPLE SEQUENCING DATA USING EMPIRICAL BAYES ANNALS OF APPLIED STATISTICS Muralidharan, O., Natsoulis, G., Bell, J., Ji, H., Zhang, N. R. 2012; 6 (3): 1047-1067

View details for DOI 10.1214/12-AOAS538

View details for Web of Science ID 000314457400010
Identification of a novel deletion mutant strain in Saccharomyces cerevisiae that results in a microsatellite instability phenotype. BioDiscovery Ji, H. P., Morales, S., Welch, K., Yuen, C., Farnam, K., Ford, J. M. 2012

Abstract

The DNA mismatch repair (MMR) pathway corrects specific types of DNA replication errors that affect microsatellites and thus is critical for maintaining genomic integrity. The genes of the MMR pathway are highly conserved across different organisms. Likewise, defective MMR function universally results in microsatellite instability (MSI), which is a hallmark of certain types of cancer associated with the Mendelian disorder hereditary nonpolyposis colorectal cancer (Lynch syndrome). To identify previously unrecognized deleted genes or loci that can lead to MSI, we developed a functional genomics screen utilizing a plasmid containing a microsatellite sequence that is a hot spot for MSI mutations and the comprehensive homozygous diploid deletion mutant resource for Saccharomyces cerevisiae. This pool represents a collection of non-essential homozygous yeast diploid (2N) mutants in which there are deletions for over four thousand yeast open reading frames (ORFs). From our screen, we identified a deletion mutant strain of the PAU24 gene that leads to MSI. In a series of validation experiments, we determined that this PAU24 mutant strain had an increased MSI-specific mutation rate in comparison to the original background wildtype strain and other deletion mutants, and a rate comparable to that of an MMR mutant involving the MLH1 gene. Likewise, in yeast strains with a deletion of PAU24, we identified specific de novo indel mutations that occurred within the targeted microsatellite used for this screen.

View details for PubMedID 23667739
Improving bioinformatic pipelines for exome variant calling GENOME MEDICINE Ji, H. P. 2012; 4

Abstract

Exome sequencing analysis is a cost-effective approach for identifying variants in coding regions. However, recognizing the relevant single nucleotide variants, small insertions and deletions remains a challenge for many researchers, and diagnostic laboratories typically do not have access to the bioinformatic analysis pipelines necessary for clinical application. The Atlas2 suite, recently released by Baylor Genome Center, is designed to be widely accessible, runs on desktop computers but is scalable to computational clusters, and performs comparably with other popular variant callers. Atlas2 may be an accessible alternative for data processing when a rapid solution for variant calling is required. See research article http://www.biomedcentral.com/1471-2105/13/8.

View details for DOI 10.1186/gm306

View details for Web of Science ID 000314564600001

View details for PubMedID 22289516

View details for PubMedCentralID PMC3334555
The Human OligoGenome Resource: a database of oligonucleotide capture probes for resequencing target regions across the human genome. Nucleic acids research Newburger, D. E., Natsoulis, G., Grimes, S., Bell, J. M., Davis, R. W., Batzoglou, S., Ji, H. P. 2012; 40 (Database issue): D1137-43

Abstract

Recent exponential growth in the throughput of next-generation DNA sequencing platforms has dramatically spurred the use of accessible and scalable targeted resequencing approaches. This includes candidate region diagnostic resequencing and novel variant validation from whole genome or exome sequencing analysis. We have previously demonstrated that selective genomic circularization is a robust in-solution approach for capturing and resequencing thousands of target human genome loci such as exons and regulatory sequences. To facilitate the design and production of customized capture assays for any given region in the human genome, we developed the Human OligoGenome Resource (http://oligogenome.stanford.edu/). This online database contains over 21 million capture oligonucleotide sequences. It enables one to create customized and highly multiplexed resequencing assays of target regions across the human genome and is not restricted to coding regions. In total, this resource provides 92.1% in silico coverage of the human genome. The online server allows researchers to download a complete repository of oligonucleotide probes and design customized capture assays to target multiple regions throughout the human genome. The website has query tools for selecting and evaluating capture oligonucleotides from specified genomic regions.

View details for DOI 10.1093/nar/gkr973

View details for PubMedID 22102592

View details for PubMedCentralID PMC3245143
Performance comparison of whole-genome sequencing platforms NATURE BIOTECHNOLOGY Lam, H. Y., Clark, M. J., Chen, R., Chen, R., Natsoulis, G., O'Huallachain, M., Dewey, F. E., Habegger, L., Ashley, E. A., Gerstein, M. B., Butte, A. J., Ji, H. P., Snyder, M. 2012; 30 (1): 78-U118

Abstract

Whole-genome sequencing is becoming commonplace, but the accuracy and completeness of variant calling by the most widely used platforms from Illumina and Complete Genomics have not been reported. Here we sequenced the genome of an individual with both technologies to a high average coverage of ∼76×, and compared their performance with respect to sequence coverage and calling of single-nucleotide variants (SNVs), insertions and deletions (indels). Although 88.1% of the ∼3.7 million unique SNVs were concordant between platforms, there were tens of thousands of platform-specific calls located in genes and other genomic regions. In contrast, 26.5% of indels were concordant between platforms. Target enrichment validated 92.7% of the concordant SNVs, whereas validation by genotyping array revealed a sensitivity of 99.3%. The validation experiments also suggested that >60% of the platform-specific variants were indeed present in the genome. Our results have important implications for understanding the accuracy and completeness of the genome sequencing platforms.

View details for DOI 10.1038/nbt.2065

View details for Web of Science ID 000299110600023
The Human OligoGenome Resource: a database of oligonucleotide capture probes for resequencing target regions across the human genome NUCLEIC ACIDS RESEARCH Newburger, D. E., Natsoulis, G., Grimes, S., Bell, J. M., Davis, R. W., Batzoglou, S., Ji, H. P. 2012; 40 (D1): D1137-D1143

View details for DOI 10.1093/nar/gkr973

View details for Web of Science ID 000298601300170

View details for PubMedCentralID PMC3245143
A cross-sample statistical model for SNP detection in short-read sequencing data NUCLEIC ACIDS RESEARCH Muralidharan, O., Natsoulis, G., Bell, J., Newburger, D., Xu, H., Kela, I., Ji, H., Zhang, N. 2012; 40 (1)

Abstract

Highly multiplex DNA sequencers have greatly expanded our ability to survey human genomes for previously unknown single nucleotide polymorphisms (SNPs). However, sequencing and mapping errors, though rare, contribute substantially to the number of false discoveries in current SNP callers. We demonstrate that we can significantly reduce the number of false positive SNP calls by pooling information across samples. Although many studies prepare and sequence multiple samples with the same protocol, most existing SNP callers ignore cross-sample information. In contrast, we propose an empirical Bayes method that uses cross-sample information to learn the error properties of the data. This error information lets us call SNPs with a lower false discovery rate than existing methods.

View details for DOI 10.1093/nar/gkr851

View details for Web of Science ID 000298733500005

View details for PubMedID 22064853

View details for PubMedCentralID PMC3245949
Quantitative and Sensitive Detection of Cancer Genome Amplifications from Formalin Fixed Paraffin Embedded Tumors with Droplet Digital PCR. Translational medicine (Sunnyvale, Calif.) Nadauld, L., Regan, J. F., Miotke, L., Pai, R. K., Longacre, T. A., Kwok, S. S., Saxonov, S., Ford, J. M., Ji, H. P. 2012; 2 (2)

Abstract

For the analysis of cancer, there is great interest in rapid and accurate detection of cancer genome amplifications containing oncogenes that are potential therapeutic targets. The vast majority of cancer tissue samples are formalin fixed and paraffin embedded (FFPE), which enables histopathological examination and long term archiving. However, FFPE cancer genomic DNA is oftentimes degraded and generally a poor substrate for many molecular biology assays. To overcome the issues of poor DNA quality from FFPE samples and detect oncogenic copy number amplifications with high accuracy and sensitivity, we developed a novel approach. Our assay requires nanogram amounts of genomic DNA, thus facilitating study of small amounts of clinical samples. Using droplet digital PCR (ddPCR), we can determine the relative copy number of specific genomic loci even in the presence of intermingled normal tissue. We used a control dilution series to determine the limits of detection for the ddPCR assay and report its improved sensitivity on minimal amounts of DNA compared to standard real-time PCR. To develop this approach, we designed an assay for the fibroblast growth factor receptor 2 gene (FGFR2) that is amplified in gastric and breast cancers as well as others. We successfully utilized ddPCR to ascertain FGFR2 amplifications from FFPE-preserved gastrointestinal adenocarcinomas.
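
As a rough illustration of how relative copy number can be derived from droplet counts, the sketch below applies the standard Poisson correction used in digital PCR. The abstract does not describe the calculation, so everything here is an assumption: the function names are hypothetical, the target and reference are assumed to be counted in the same number of droplets, and the reference locus is assumed to be present at two copies per genome.

```python
import math


def copies_per_droplet(positive, total):
    """Mean template copies per droplet, estimated from the fraction of
    positive droplets under a Poisson loading assumption."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def relative_copy_number(target_positive, ref_positive, total, ref_copies=2):
    """Copy number of the target locus relative to a reference locus that is
    assumed to be present at ref_copies per genome."""
    target = copies_per_droplet(target_positive, total)
    reference = copies_per_droplet(ref_positive, total)
    return ref_copies * target / reference


# Hypothetical counts: 7500 target-positive and 5000 reference-positive
# droplets out of 10000 give a copy number of about 4.
cn = relative_copy_number(7500, 5000, 10000)
```

The logarithm corrects for droplets that contain more than one template molecule, which is why the estimate is not simply the ratio of positive droplet counts.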

View details for PubMedID 23682346
Ultrasensitive detection of rare mutations using next-generation targeted resequencing NUCLEIC ACIDS RESEARCH Flaherty, P., Natsoulis, G., Muralidharan, O., Winters, M., Buenrostro, J., Bell, J., Brown, S., Holodniy, M., Zhang, N., Ji, H. P. 2012; 40 (1)

Abstract

With next-generation DNA sequencing technologies, one can interrogate a specific genomic region of interest at very high depth of coverage and identify less prevalent, rare mutations in heterogeneous clinical samples. However, the mutation detection levels are limited by the error rate of the sequencing technology as well as by the availability of variant-calling algorithms with high statistical power and low false positive rates. We demonstrate that we can robustly detect mutations at 0.1% fractional representation. This represents accurate detection of one mutant per every 1000 wild-type alleles. To achieve this sensitive level of mutation detection, we integrate a high accuracy indexing strategy and reference replication for estimating sequencing error variance. We employ a statistical model to estimate the error rate at each position of the reference and to quantify the fraction of variant base in the sample. Our method is highly specific (99%) and sensitive (100%) when applied to a known 0.1% sample fraction admixture of two synthetic DNA samples to validate our method. As a clinical application of this method, we analyzed nine clinical samples of H1N1 influenza A and detected an oseltamivir (antiviral therapy) resistance mutation in the H1N1 neuraminidase gene at a sample fraction of 0.18%.

View details for DOI 10.1093/nar/gkr861

View details for Web of Science ID 000298733500002

View details for PubMedID 22013163

View details for PubMedCentralID PMC3245950
Targeted sequencing library preparation by genomic DNA circularization BMC BIOTECHNOLOGY Myllykangas, S., Natsoulis, G., Bell, J. M., Ji, H. P. 2011; 11

Abstract

For next generation DNA sequencing, we have developed a rapid and simple approach for preparing DNA libraries of targeted DNA content. Current protocols for preparing DNA for next-generation targeted sequencing are labor-intensive, require large amounts of starting material, and are prone to artifacts that result from necessary PCR amplification of sequencing libraries. Typically, sample preparation for targeted NGS is a two-step process where (1) the desired regions are selectively captured and (2) the ends of the DNA molecules are modified to render them compatible with any given NGS sequencing platform. In this proof-of-concept study, we present an integrated approach that combines these two separate steps into one. Our method involves circularization of a specific genomic DNA molecule that directly incorporates the necessary components for conducting sequencing in a single assay and requires only one PCR amplification step. We also show that specific regions of the genome can be targeted and sequenced without any PCR amplification. We anticipate that these rapid targeted libraries will be useful for validation of variants and may have diagnostic application.

View details for DOI 10.1186/1472-6750-11-122

View details for Web of Science ID 000300427900001

View details for PubMedID 22168766

View details for PubMedCentralID PMC3280942
Efficient targeted resequencing of human germline and cancer genomes by oligonucleotide-selective sequencing NATURE BIOTECHNOLOGY Myllykangas, S., Buenrostro, J. D., Natsoulis, G., Bell, J. M., Ji, H. P. 2011; 29 (11): 1024-U95

Abstract

We describe an approach for targeted genome resequencing, called oligonucleotide-selective sequencing (OS-Seq), in which we modify the immobilized lawn of oligonucleotide primers of a next-generation DNA sequencer to function as both a capture and sequencing substrate. We apply OS-Seq to resequence the exons of either 10 or 344 cancer genes from human DNA samples. In our assessment of capture performance, >87% of the captured sequence originated from the intended target region with sequencing coverage falling within a tenfold range for a majority of all targets. Single nucleotide variants (SNVs) called from OS-Seq data agreed with >95% of variants obtained from whole-genome sequencing of the same individual. We also demonstrate mutation discovery from a colorectal cancer tumor sample matched with normal tissue. Overall, we show the robust performance and utility of OS-Seq for the resequencing analysis of human germline and cancer genomes.

View details for DOI 10.1038/nbt.1996

View details for Web of Science ID 000296801300024

View details for PubMedID 22020387
A Flexible Approach for Highly Multiplexed Candidate Gene Targeted Resequencing PLOS ONE Natsoulis, G., Bell, J. M., Xu, H., Buenrostro, J. D., Ordonez, H., Grimes, S., Newburger, D., Jensen, M., Zahn, J. M., Zhang, N., Ji, H. P. 2011; 6 (6)

Abstract

We have developed an integrated strategy for targeted resequencing and analysis of gene subsets from the human exome for variants. Our capture technology is geared towards resequencing gene subsets substantially larger than can be done efficiently with simplex or multiplex PCR but smaller in scale than exome sequencing. We describe all the steps from the initial capture assay to single nucleotide variant (SNV) discovery. The capture methodology uses in-solution 80-mer oligonucleotides. To provide optimal flexibility in choosing human gene targets, we designed an in silico set of oligonucleotides, the Human OligoExome, that covers the gene exons annotated by the Consensus Coding Sequence Project (CCDS). This resource is openly available as an Internet accessible database where one can download capture oligonucleotide sequences for any CCDS gene and design custom capture assays. Using this resource, we demonstrated the flexibility of this assay by custom designing capture assays ranging from 10 to over 100 gene targets with total capture sizes from over 100 Kilobases to nearly one Megabase. We established a method to reduce capture variability and incorporated indexing schemes to increase sample throughput. Our approach has multiple applications that include but are not limited to population targeted resequencing studies of specific gene subsets, validation of variants discovered in whole genome sequencing surveys and possible diagnostic analysis of disease gene subsets. We also present a cost analysis demonstrating its cost-effectiveness for large population studies.

View details for DOI 10.1371/journal.pone.0021088

View details for Web of Science ID 000292291800008

View details for PubMedID 21738606

View details for PubMedCentralID PMC3127857
Genetic-based biomarkers and next-generation sequencing: the future of personalized care in colorectal cancer PERSONALIZED MEDICINE Kim, R. Y., Xu, H., Myllykangas, S., Ji, H. 2011; 8 (3): 331-345

Abstract

The past 5 years have witnessed extraordinary advances in the field of DNA sequencing technology. What once took years to accomplish with Sanger sequencing can now be accomplished in a matter of days with next-generation sequencing (NGS) technology. This has allowed researchers to sequence individual genomes and match combinations of mutations with specific diseases. As cancer is inherently a disease of the genome, it is not surprising to see NGS technology already being applied to cancer research with promises of greater understanding of carcinogenesis. While the task of deciphering the cancer genomic code remains ongoing, we are already beginning to see the application of genetic-based testing in the area of colorectal cancer. In this article we will provide an overview of current colorectal cancer genetic-based biomarkers, namely mutations and other genetic alterations in cancer genome DNA, discuss recent advances in NGS technology and speculate on future directions for the application of NGS technology to colorectal cancer diagnosis and treatment.

View details for DOI 10.2217/PME.11.16

View details for Web of Science ID 000291444800013

View details for PubMedCentralID PMC3646399
Genetic-based biomarkers and next-generation sequencing: the future of personalized care in colorectal cancer. Personalized medicine Kim, R. Y., Xu, H., Myllykangas, S., Ji, H. 2011; 8 (3): 331-345

View details for DOI 10.2217/pme.11.16

View details for PubMedID 23662107

View details for PubMedCentralID PMC3646399
Identification of Novel LNK Mutations In Patients with Chronic Myeloproliferative Neoplasms and Related Disorders 52nd Annual Meeting and Exposition of the American-Society-of-Hematology (ASH) Oh, S. T., Zahn, J. M., Jones, C. D., Zhang, B., Loh, M. L., Kantarjian, H., Simonds, E. F., Bruggner, R. V., Abidi, P., Natsoulis, G., Bell, J., Buenrostro, J., Nolan, G. P., Zehnder, J. L., Ji, H. P., Gotlib, J. AMER SOC HEMATOLOGY. 2010: 143–44

View details for Web of Science ID 000289662200316
Detecting simultaneous changepoints in multiple sequences BIOMETRIKA Zhang, N. R., Siegmund, D. O., Ji, H., Li, J. Z. 2010; 97 (3): 631-645

Abstract

We discuss the detection of local signals that occur at the same location in multiple one-dimensional noisy sequences, with particular attention to relatively weak signals that may occur in only a fraction of the sequences. We propose simple scan and segmentation algorithms based on the sum of the chi-squared statistics for each individual sample, which is equivalent to the generalized likelihood ratio for a model where the errors in each sample are independent. The simple geometry of the statistic allows us to derive accurate analytic approximations to the significance level of such scans. The formulation of the model is motivated by the biological problem of detecting recurrent DNA copy number variants in multiple samples. We show using replicates and parent-child comparisons that pooling data across samples results in more accurate detection of copy number variants. We also apply the multisample segmentation algorithm to the analysis of a cohort of tumour samples containing complex nested and overlapping copy number aberrations, for which our method gives a sparse and intuitive cross-sample summary.
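
The scan based on a sum of chi-squared statistics can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' algorithm: each sequence is assumed to have baseline mean 0 and known noise variance 1, the function name `scan_sum_chisq` and the `max_width` limit are hypothetical, and no significance approximation is computed.

```python
from itertools import accumulate


def scan_sum_chisq(seqs, max_width):
    """Scan every window (s, t] with t - s <= max_width and return
    (best_stat, s, t), where the statistic is the sum over sequences of
    Z_i(s, t)**2 and Z_i is the window sum of sequence i divided by
    sqrt(t - s). Assumes equal-length sequences with mean 0 and variance 1
    outside any signal."""
    length = len(seqs[0])
    # Prefix sums let each window sum be computed in constant time.
    prefix = [[0.0] + list(accumulate(seq)) for seq in seqs]
    best = (float("-inf"), 0, 0)
    for s in range(length):
        for t in range(s + 1, min(length, s + max_width) + 1):
            width = t - s
            stat = sum((p[t] - p[s]) ** 2 / width for p in prefix)
            if stat > best[0]:
                best = (stat, s, t)
    return best


# Toy data: a shift of +2 at positions 3..5 in two of three sequences.
carrier = [0, 0, 0, 2, 2, 2, 0, 0, 0, 0]
flat = [0] * 10
best_stat, start, end = scan_sum_chisq([carrier, carrier, flat], max_width=5)
```

Summing the squared per-sequence statistics means a window scores highly when several sequences shift together, even if each individual shift is modest, which matches the pooling idea described in the abstract.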

View details for DOI 10.1093/biomet/asq025

View details for Web of Science ID 000280904000008

View details for PubMedCentralID PMC3372242
Detecting simultaneous changepoints in multiple sequences. Biometrika Zhang, N. R., Siegmund, D. O., Ji, H., Li, J. Z. 2010; 97 (3): 631-645

View details for DOI 10.1093/biomet/asq025

View details for PubMedID 22822250

View details for PubMedCentralID PMC3372242
Oncogenic BRAF Mutation with CDKN2A Inactivation Is Characteristic of a Subset of Pediatric Malignant Astrocytomas CANCER RESEARCH Schiffman, J. D., Hodgson, J. G., VandenBerg, S. R., Flaherty, P., Polley, M. C., Yu, M., Fisher, P. G., Rowitch, D. H., Ford, J. M., Berger, M. S., Ji, H., Gutmann, D. H., James, C. D. 2010; 70 (2): 512-519

Abstract

Malignant astrocytomas are a deadly solid tumor in children. Limited understanding of their underlying genetic basis has contributed to modest progress in developing more effective therapies. In an effort to identify such alterations, we performed a genome-wide search for DNA copy number aberrations (CNA) in a panel of 33 tumors encompassing grade 1 through grade 4 tumors. Genomic amplifications of 10-fold or greater were restricted to grade 3 and 4 astrocytomas and included the MDM4 (1q32), PDGFRA (4q12), MET (7q21), CMYC (8q24), PVT1 (8q24), WNT5B (12p13), and IGF1R (15q26) genes. Homozygous deletions of CDKN2A (9p21), PTEN (10q26), and TP53 (17p13.1) were evident among grade 2 to 4 tumors. BRAF gene rearrangements that were indicated in three tumors prompted the discovery of KIAA1549-BRAF fusion transcripts expressed in 10 of 10 grade 1 astrocytomas and in none of the grade 2 to 4 tumors. In contrast, an oncogenic missense BRAF mutation (BRAF(V600E)) was detected in 7 of 31 grade 2 to 4 tumors but in none of the grade 1 tumors. BRAF(V600E) mutation seems to define a subset of malignant astrocytomas in children, in which there is frequent concomitant homozygous deletion of CDKN2A (five of seven cases). Taken together, these findings highlight BRAF as a frequent mutation target in pediatric astrocytomas, with distinct types of BRAF alteration occurring in grade 1 versus grade 2 to 4 tumors.

View details for DOI 10.1158/0008-5472.CAN-09-1851

View details for Web of Science ID 000278485500011

View details for PubMedID 20068183

View details for PubMedCentralID PMC2851233
Targeted deep resequencing of the human cancer genome using next-generation technologies BIOTECHNOLOGY AND GENETIC ENGINEERING REVIEWS, VOL 27 Myllykangas, S., Ji, H. P. 2010; 27: 135-158

Abstract

Next-generation sequencing technologies have revolutionized our ability to identify genetic variants, either germline or somatic point mutations, that occur in cancer. Parallelization and miniaturization of DNA sequencing enables massive data throughput and for the first time, large-scale, nucleotide resolution views of cancer genomes can be achieved. Systematic, large-scale sequencing surveys have revealed that the genetic spectrum of mutations in cancers appears to be highly complex with numerous low frequency bystander somatic variations, and a limited number of common, frequently mutated genes. Large sample sizes and deeper resequencing are much needed in resolving clinical and biological relevance of the mutations as well as in detecting somatic variants in heterogeneous samples and cancer cell sub-populations. However, even with the next-generation sequencing technologies, the overwhelming size of the human genome and need for very high fold coverage represents a major challenge for up-scaling cancer genome sequencing projects. Assays to target, capture, enrich or partition disease-specific regions of the genome offer immediate solutions for reducing the complexity of the sequencing libraries. Integration of targeted DNA capture assays and next-generation deep resequencing improves the ability to identify clinically and biologically relevant mutations.

View details for Web of Science ID 000286179900006

View details for PubMedID 21415896
Identification of a biomarker panel using a multiplex proximity ligation assay improves accuracy of pancreatic cancer diagnosis JOURNAL OF TRANSLATIONAL MEDICINE Chang, S. T., Zahn, J. M., Horecka, J., Kunz, P. L., Ford, J. M., Fisher, G. A., Le, Q. T., Chang, D. T., Ji, H., Koong, A. C. 2009; 7

Abstract

Pancreatic cancer continues to prove difficult to clinically diagnose. Multiple simultaneous measurements of plasma biomarkers can increase sensitivity and selectivity of diagnosis. Proximity ligation assay (PLA) is a highly sensitive technique for multiplex detection of biomarkers in plasma with little or no interfering background signal. We examined the plasma levels of 21 biomarkers in a clinically defined cohort of 52 locally advanced (Stage II/III) pancreatic ductal adenocarcinoma cases and 43 age-matched controls using a multiplex proximity ligation assay. The optimal biomarker panel for diagnosis was computed using a combination of the PAM algorithm and logistic regression modeling. Biomarkers that were significantly prognostic for survival in combination were determined using univariate and multivariate Cox survival models. Three markers, CA19-9, OPN and CHI3L1, measured in multiplex were found to have superior sensitivity for pancreatic cancer vs. CA19-9 alone (93% vs. 80%). In addition, we identified two markers, CEA and CA125, that when measured simultaneously have prognostic significance for survival for this clinical stage of pancreatic cancer (p < 0.003). A multiplex panel assaying CA19-9, OPN and CHI3L1 in plasma improves accuracy of pancreatic cancer diagnosis. A panel assaying CEA and CA125 in plasma can predict survival for this clinical cohort of pancreatic cancer patients.

View details for DOI 10.1186/1479-5876-7-105

View details for PubMedID 20003342
ASSOCIATION OF 7Q34 COPY NUMBER GAINS AND KIAA1549-BRAF GENE FUSIONS WITH JUVENILE PILOCYTIC ASTROCYTOMA Hodgson, J., VandenBerg, S. R., James, C., Perry, A., Gutmann, D., Fisher, P., Ford, J., Ji, H., Schiffman, J. OXFORD UNIV PRESS INC. 2009: 960

View details for Web of Science ID 000272974100393
Molecular inversion probes reveal patterns of 9p21 deletion and copy number aberrations in childhood leukemia CANCER GENETICS AND CYTOGENETICS Schiffman, J. D., Wang, Y., McPherson, L. A., Welch, K., Zhang, N., Davis, R., Lacayo, N. J., Dahl, G. V., Faham, M., Ford, J. M., Ji, H. P. 2009; 193 (1): 9-18

Abstract

Childhood leukemia, which accounts for >30% of newly diagnosed childhood malignancies, is one of the leading causes of death for children with cancer. Genome-wide studies using microarray chips to identify copy number changes in human cancer are becoming more common. In this pilot study, 45 pediatric leukemia samples were analyzed for gene copy aberrations using novel molecular inversion probe (MIP) technology. Acute leukemia subtypes included precursor B-cell acute lymphoblastic leukemia (ALL) (n=23), precursor T-cell ALL (n=6), and acute myeloid leukemia (n=14). The MIP analysis identified 69 regions of recurring copy number changes, of which 41 have not been identified with other DNA microarray platforms. Copy number gains and losses were validated in 98% of clinical karyotypes and 100% of fluorescence in situ hybridization studies available. We report unique patterns of copy number loss in samples with 9p21.3 (CDKN2A) deletion in the precursor B-cell ALL patients, compared with the precursor T-cell ALL patients. MIPs represent an attractive technology for identifying novel copy number aberrations, validating previously reported copy number changes, and translating molecular findings into clinically relevant targets for further investigation.

View details for DOI 10.1016/j.cancergencyto.2009.03.005

View details for Web of Science ID 000268922900002

View details for PubMedID 19602459

View details for PubMedCentralID PMC2776674
Paired phospho-proteomic and genomic analyses reveal functionally distinct subclones in refractory pediatric acute myeloid leukemia Simonds, E., Schiffman, J., Gramatges, M., Dahl, G., Ford, J., Lacayo, N., Ji, H., Nolan, G. AMER ASSOC CANCER RESEARCH. 2009

View details for Web of Science ID 000209701805194
Disperse-a software system for design of selector probes for exon resequencing applications BIOINFORMATICS Stenberg, J., Zhang, M., Ji, H. 2009; 25 (5): 666-667

Abstract

Selector probes enable the amplification of many selected regions of the genome in multiplex. Disperse is a software pipeline that automates the procedure of designing selector probes for exon resequencing applications. Software and documentation is available at http://bioinformatics.org/disperse

View details for DOI 10.1093/bioinformatics/btp001

View details for Web of Science ID 000263834600018

View details for PubMedID 19158162

View details for PubMedCentralID PMC2647824
Molecular inversion probe assay for allelic quantitation. Methods in molecular biology (Clifton, N.J.) Ji, H., Welch, K. 2009; 556: 67-87

Abstract

Molecular inversion probe (MIP) technology has been demonstrated to be a robust platform for large-scale dual genotyping and copy number analysis. Applications in human genomic and genetic studies include the possibility of running dual germline genotyping and combined copy number variation ascertainment. MIPs analyze large numbers of specific genetic target sequences in parallel, relying on interrogation of a barcode tag, rather than direct hybridization of genomic DNA to an array. The MIP approach does not replace, but is complementary to many of the copy number technologies being performed today. Some specific advantages of MIP technology include: less DNA required (37 ng vs. 250 ng), DNA quality less important, more dynamic range (amplifications detected up to copy number 60), allele-specific information "cleaner" (less SNP cross-talk/contamination), and quality of markers better (fewer individual MIPs versus SNPs needed to identify copy number changes). MIPs can be considered a candidate gene (targeted whole genome) approach and can find specific areas of interest that otherwise may be missed with other methods.

View details for DOI 10.1007/978-1-60327-192-9_6

View details for PubMedID 19488872

View details for PubMedCentralID PMC2988579
Next-generation DNA sequencing NATURE BIOTECHNOLOGY Shendure, J., Ji, H. 2008; 26 (10): 1135-1145

Abstract

DNA sequence represents a single format onto which a broad range of biological phenomena can be projected for high-throughput data collection. Over the past three years, massively parallel DNA sequencing platforms have become widely available, reducing the cost of DNA sequencing by over two orders of magnitude, and democratizing the field by putting the sequencing capacity of a major genome center in the hands of individual investigators. These new technologies are rapidly evolving, and near-term challenges include the development of robust protocols for generating sequencing libraries, building effective new approaches to data-analysis, and often a rethinking of experimental design. Next-generation DNA sequencing has the potential to dramatically accelerate biological and biomedical research, by enabling the comprehensive analysis of genomes, transcriptomes and interactomes to become inexpensive, routine and widespread, rather than requiring significant production-scale efforts.

View details for DOI 10.1038/nbt1486

View details for Web of Science ID 000259926000028

View details for PubMedID 18846087
FOXM1 overexpression and DNA amplification in pediatric astrocytomas Hodgson, G., Vandenberg, S., Fisher, P., Yu, M., James, C., Rowitch, D., Ford, J., Ji, H., Schiffman, J. OXFORD UNIV PRESS INC. 2008: 805–6

View details for Web of Science ID 000259854500180
Analysis of Genomic Instability in Colorectal Carcinoma Flaherty, P., Davis, R. W., Ji, H. FEDERATION AMER SOC EXP BIOL. 2008

View details for Web of Science ID 000208467801809
Gene-specific delineation of copy number aberrations in follicular lymphoma with molecular inversion probes 49th Annual Meeting of the American-Society-of-Hematology Ji, H. P., Welch, K. M., Wang, Y., Faham, M., Akasaka, T., Czerwinski, D., Davis, R. W., Levy, R. AMER SOC HEMATOLOGY. 2007: 766A–767A

View details for Web of Science ID 000251100803385
Molecular Inversion Probes (MIPs) identify novel areas of allelic imbalance in childhood leukemia Schiffman, J. D., Welch, K., Davis, R., Lacayo, N. J., Dahl, G. V., Wang, Y., Faham, M., Ford, J. M., Ji, H. P. AMER SOC HEMATOLOGY. 2007: 431A

View details for Web of Science ID 000251100801706
Adapting molecular inversion probe (MIP) technology for allele quantification in childhood leukemia Schiffman, J. D., Welch, K. M., Davis, R., Dahl, G. V., Lacayo, N. J., Faham, M., Ford, J. M., Ji, H. AMER SOC CLINICAL ONCOLOGY. 2007

View details for Web of Science ID 000455043702487
Multigene amplification and massively parallel sequencing for cancer mutation discovery PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Dahl, F., Stenberg, J., Fredriksson, S., Welch, K., Zhang, M., Nilsson, M., Bicknell, D., Bodmer, W. F., Davis, R. W., Ji, H. 2007; 104 (22): 9387-9392

Abstract

We have developed a procedure for massively parallel resequencing of multiple human genes by combining a highly multiplexed and target-specific amplification process with a high-throughput parallel sequencing technology. The amplification process is based on oligonucleotide constructs, called selectors, that guide the circularization of specific DNA target regions. Subsequently, the circularized target sequences are amplified in multiplex and analyzed by using a highly parallel sequencing-by-synthesis technology. As a proof-of-concept study, we demonstrate parallel resequencing of 10 cancer genes covering 177 exons with average sequence coverage per sample of 93%. Seven cancer cell lines and one normal genomic DNA sample were studied with multiple mutations and polymorphisms identified among the 10 genes. Mutations and polymorphisms in the TP53 gene were confirmed by traditional sequencing.

View details for DOI 10.1073/pnas.0702165104

View details for Web of Science ID 000246935700055

View details for PubMedID 17517648

View details for PubMedCentralID PMC1871563
Multiplex amplification of all coding sequences within 10 cancer genes by Gene-Collector NUCLEIC ACIDS RESEARCH Fredriksson, S., Baner, J., Dahl, F., Chu, A., Ji, H., Welch, K., Davis, R. W. 2007; 35 (7)

Abstract

Herein we present Gene-Collector, a method for multiplex amplification of nucleic acids. The procedure has been employed to successfully amplify the coding sequence of 10 human cancer genes in one assay with uniform abundance of the final products. Amplification is initiated by a multiplex PCR in this case with 170 primer pairs. Each PCR product is then specifically circularized by ligation on a Collector probe capable of juxtapositioning only the perfectly matched cognate primer pairs. Any amplification artifacts typically associated with multiplex PCR derived from the use of many primer pairs such as false amplicons, primer-dimers etc. are not circularized and degraded by exonuclease treatment. Circular DNA molecules are then further enriched by randomly primed rolling circle replication. Amplification was successful for 90% of the targeted amplicons as seen by hybridization to a custom resequencing DNA micro-array. Real-time quantitative PCR revealed that 96% of the amplification products were all within 4-fold of the average abundance. Gene-Collector has utility for numerous applications such as high throughput resequencing, SNP analyses, and pathogen detection.

View details for DOI 10.1093/nar/gkm078

View details for Web of Science ID 000246294700001

View details for PubMedID 17317684

View details for PubMedCentralID PMC1874629
Multiplexed protein detection by proximity ligation for cancer biomarker validation NATURE METHODS Fredriksson, S., Dixon, W., Ji, H., Koong, A. C., Mindrinos, M., Davis, R. W. 2007; 4 (4): 327-329

Abstract

We present a proximity ligation-based multiplexed protein detection procedure in which several selected proteins can be detected via unique nucleic-acid identifiers and subsequently quantified by real-time PCR. The assay requires a 1-µl sample, has low-femtomolar sensitivity as well as five-log linear range and allows for modular multiplexing without cross-reactivity. The procedure can use a single polyclonal antibody batch for each target protein, simplifying affinity-reagent creation for new biomarker candidates.

View details for DOI 10.1038/NMETH1020

View details for Web of Science ID 000245584900013

View details for PubMedID 17369836
Under-expression of Kalirin-7 increases iNOS activity in cultured cells and correlates to elevated iNOS activity in Alzheimer's disease hippocampus JOURNAL OF ALZHEIMERS DISEASE Youn, H., Ji, I., Ji, H. P., Markesbery, W. R., Ji, T. H. 2007; 12 (3): 271-281

Abstract

Recently, it has been reported that Kalirin gene transcripts are under-expressed in AD hippocampal specimens compared to the controls. The Kalirin gene generates a dozen Kalirin isoforms. Kalirin-7 is the predominant protein expressed in the adult brain and plays crucial roles in growth and maintenance of neurons. Yet its role in human diseases is unknown. We report that Kalirin-7 is significantly diminished both at the mRNA and protein levels in the hippocampus specimens from 19 AD patients compared to the specimens from 15 controls. Kalirin-7 associates with iNOS in the hippocampus, and therefore, Kalirin-7 is complexed with iNOS less in AD hippocampus extracts than in control hippocampus extracts. In cultured cells, Kalirin-7 associates with iNOS and down-regulates the enzyme activity. The down-regulation is attributed to the highly conserved 33 amino acid sequence, K617-H649, of the 1,663 amino acids long Kalirin-7. Remarkably, the iNOS activity is considerably higher in the hippocampus specimens from AD patients than the specimens from 15 controls. These observations suggest that the under-expression of Kalirin-7 in AD hippocampus correlates to the elevated iNOS activity.

View details for Web of Science ID 000252300000009

View details for PubMedID 18057561
Reproducibility Probability Score - incorporating measurement variability across laboratories for gene selection NATURE BIOTECHNOLOGY Lin, G., He, X., Ji, H., Shi, L., Davis, R. W., Zhong, S. 2006; 24 (12): 1476-1477

View details for Web of Science ID 000242795800015

View details for PubMedID 17160039
Data quality in genomics and microarrays NATURE BIOTECHNOLOGY Ji, H., Davis, R. W. 2006; 24 (9): 1112-1113

View details for DOI 10.1038/nbt0906-1108

View details for Web of Science ID 000240495200031

View details for PubMedID 16964224

View details for PubMedCentralID PMC2943412
The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements NATURE BIOTECHNOLOGY Shi, L., Reid, L. H., Jones, W. D., Shippy, R., Warrington, J. A., Baker, S. C., Collins, P. J., de Longueville, F., Kawasaki, E. S., Lee, K. Y., Luo, Y., Sun, Y. A., Willey, J. C., Setterquist, R. A., Fischer, G. M., Tong, W., Dragan, Y. P., Dix, D. J., Frueh, F. W., Goodsaid, F. M., Herman, D., Jensen, R. V., Johnson, C. D., Lobenhofer, E. K., Puri, R. K., Scherf, U., Thierry-Mieg, J., Wang, C., Wilson, M., Wolber, P. K., Zhang, L., Amur, S., Bao, W., Barbacioru, C. C., Lucas, A. B., Bertholet, V., Boysen, C., Bromley, B., Brown, D., Brunner, A., Canales, R., Cao, X. M., Cebula, T. A., Chen, J. J., Cheng, J., Chu, T., Chudin, E., Corson, J., Corton, J. C., Croner, L. J., Davies, C., Davison, T. S., Delenstarr, G., Deng, X., Dorris, D., Eklund, A. C., Fan, X., Fang, H., Fulmer-Smentek, S., Fuscoe, J. C., Gallagher, K., Ge, W., Guo, L., Guo, X., Hager, J., Haje, P. K., Han, J., Han, T., Harbottle, H. C., Harris, S. C., Hatchwell, E., Hauser, C. A., Hester, S., Hong, H., Hurban, P., Jackson, S. A., Ji, H., Knight, C. R., Kuo, W. P., LeClerc, J. E., Levy, S., Li, Q., Liu, C., Liu, Y., Lombardi, M. J., Ma, Y., Magnuson, S. R., Maqsodi, B., McDaniel, T., Mei, N., Myklebost, O., Ning, B., Novoradovskaya, N., Orr, M. S., Osborn, T. W., Papallo, A., Patterson, T. A., Perkins, R. G., Peters, E. H., Peterson, R., Philips, K. L., Pine, P. S., Pusztai, L., Qian, F., Ren, H., Rosen, M., Rosenzweig, B. A., Samaha, R. R., Schena, M., Schroth, G. P., Shchegrova, S., Smith, D. D., Staedtler, F., Su, Z., Sun, H., Szallasi, Z., Tezak, Z., Thierry-Mieg, D., Thompson, K. L., Tikhonova, I., Turpaz, Y., Vallanat, B., Van, C., Walker, S. J., Wang, S. J., Wang, Y., Wolfinger, R., Wong, A., Wu, J., Xiao, C., Xie, Q., Xu, J., Yang, W., Zhang, L., Zhong, S., Zong, Y., Slikker, W. 2006; 24 (9): 1151-1161

Abstract

Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues. Expression data on four titration pools from two distinct reference RNA samples were generated at multiple test sites using a variety of microarray-based and alternative technology platforms. Here we describe the experimental design and probe mapping efforts behind the MAQC project. We show intraplatform consistency across test sites as well as a high level of interplatform concordance in terms of genes identified as differentially expressed. This study provides a resource that represents an important first step toward establishing a framework for the use of microarrays in clinical and regulatory settings.

View details for DOI 10.1038/nbt1239

View details for Web of Science ID 000240495200036

View details for PubMedID 16964229

View details for PubMedCentralID PMC3272078
Molecular inversion probe analysis of gene copy alterations reveals distinct categories of colorectal carcinoma CANCER RESEARCH Ji, H., Kumm, J., Zhang, M., Farnam, K., Salari, K., Faham, M., Ford, J. M., Davis, R. W. 2006; 66 (16): 7910-7919

Abstract

Genomic instability is a major feature of neoplastic development in colorectal carcinoma and other cancers. Specific genomic instability events, such as deletions in chromosomes and other alterations in gene copy number, have potential utility as biologically relevant prognostic biomarkers. For example, genomic deletions on chromosome arm 18q are an indicator of colorectal carcinoma behavior and potentially useful as a prognostic indicator. Adapting a novel genomic technology called molecular inversion probes which can determine gene copy alterations, such as genomic deletions, we designed a set of probes to interrogate several hundred individual exons of >200 cancer genes with an overall distribution covering all chromosome arms. In addition, >100 probes were designed in close proximity of microsatellite markers on chromosome arm 18q. We analyzed a set of colorectal carcinoma cell lines and primary colorectal tumor samples for gene copy alterations and deletion mutations in exons. Based on clustering analysis, we distinguished the different categories of genomic instability among the colorectal cancer cell lines. Our analysis of primary tumors uncovered several distinct categories of colorectal carcinoma, each with specific patterns of 18q deletions and deletion mutations in specific genes. This finding has potential clinical ramifications given the application of 18q loss of heterozygosity events as a potential indicator for adjuvant treatment in stage II colorectal carcinoma.

View details for DOI 10.1158/0008-5472.CAN-06-0595

View details for PubMedID 16912164
Analysis of genomic DNA copy number alterations in chromosome arm 18q demonstrates distinct molecular categories of colorectal carcinoma. Ji, H., Zhang, M., Farnam, K., Salari, K., Davis, R., Ford, J. M. AMER SOC CLINICAL ONCOLOGY. 2006: 542S

View details for Web of Science ID 000239009403461
A functional assay for mutations in tumor suppressor genes caused by mismatch repair deficiency HUMAN MOLECULAR GENETICS Ji, H. P., King, M. C. 2001; 10 (24): 2737-2743

Abstract

The coding sequences of multiple human tumor suppressor genes include microsatellite sequences that are prone to mutations. Saccharomyces cerevisiae strains deficient in DNA mismatch repair (MMR) can be used to determine de novo mutation rates of these human tumor suppressor genes as well as any other gene sequence. Microsatellites in human TGFBR2, PTEN and APC genes were placed in yeast vectors and analyzed in isogenic yeast strains that were wild-type or deletion mutants for MSH2 or MLH1. In MMR-deficient strains, the vector containing the (A)10 microsatellite sequence of TGFBR2 had a mutation rate (mutations/cell division) of 1.4 x 10^-4, compared to a mutation rate of 1.7 x 10^-6 in the wild-type strain. In MMR-deficient strains, mutation rates in PTEN and APC were also elevated above background levels. PTEN mutation rates were higher in both msh2 (4.4 x 10^-5) and mlh1 strains (2.3 x 10^-5). APC mutation rates in the msh2 strain (2.4 x 10^-6) and the mlh1 strain (1.7 x 10^-6) were also significantly, but less dramatically, elevated over background. Mutations selected for in the yeast screen were identical to those previously observed in human tumor samples with microsatellite instability (MSI). This functional assay has applicability in providing quantitative data about microsatellite mutation rates caused by MMR deficiency in any human tumor suppressor gene sequence. It can also be applied as a genetic screen to identify new genes that are vulnerable to such microsatellite mutations and thus may be involved in the neoplastic development of tumors with MSI.

View details for Web of Science ID 000172867500001

View details for PubMedID 11734538
Spondyloepimetaphyseal dysplasia with joint laxity (SEMDJL): Presentation in two unrelated patients in the United States AMERICAN JOURNAL OF MEDICAL GENETICS Smith, W., Ji, H. L., Mouradian, W., Pagon, R. A. 1999; 86 (3): 245-252

Abstract

This is a report of two North American patients with spondyloepimetaphyseal dysplasia with joint laxity, an uncommon autosomal recessive skeletal dysplasia rarely reported outside of South Africa. Patients with SEMDJL have vertebral abnormalities and ligamentous laxity that results in spinal misalignment and progressive severe kyphoscoliosis, thoracic asymmetry, and respiratory compromise resulting in early death. Nonaxial skeletal involvement includes elbow deformities with radial head dislocation, dislocated hips, clubbed feet, and tapered fingers with spatulate distal phalanges. Many affected children have an oval face, flat midface, prominent eyes with blue sclerae, and a long philtrum. Palatal abnormalities and congenital heart disease are also observed. Diagnosis in infancy may be difficult because many of the typical findings are not apparent early and only evolve over time. We review the physical and radiographic findings in two unrelated patients with this disorder in order to increase the awareness of this disorder, particularly for clinicians outside of South Africa.

View details for Web of Science ID 000082714300010

View details for PubMedID 10482874
Molecular classification of the inherited hamartoma polyposis syndromes: Clearing the muddied waters AMERICAN JOURNAL OF HUMAN GENETICS Eng, C., Ji, H. L. 1998; 62 (5): 1020-1022

View details for Web of Science ID 000073487000004

View details for PubMedID 9545417
Inherited mutations in PTEN that are associated with breast cancer, Cowden disease, and juvenile polyposis AMERICAN JOURNAL OF HUMAN GENETICS Lynch, E. D., Ostermeyer, E. A., Lee, M. K., Arena, J. F., Ji, H. L., Dann, J., Swisshelm, K., Suchard, D., MacLeod, P. M., Kvinnsland, S., Gjertsen, B. T., Heimdal, K., Lubs, H., Moller, P., King, M. C. 1997; 61 (6): 1254-1260

Abstract

PTEN, a protein tyrosine phosphatase with homology to tensin, is a tumor-suppressor gene on chromosome 10q23. Somatic mutations in PTEN occur in multiple tumors, most markedly glioblastomas. Germ-line mutations in PTEN are responsible for Cowden disease (CD), a rare autosomal dominant multiple-hamartoma syndrome. PTEN was sequenced from constitutional DNA from 25 families. Germ-line PTEN mutations were detected in all of five families with both breast cancer and CD, in one family with juvenile polyposis syndrome, and in one of four families with breast and thyroid tumors. In this last case, signs of CD were subtle and were diagnosed only in the context of mutation analysis. PTEN mutations were not detected in 13 families at high risk of breast and/or ovarian cancer. No PTEN-coding-sequence polymorphisms were detected in 70 independent chromosomes. Seven PTEN germ-line mutations occurred, five nonsense and two missense mutations, in six of nine PTEN exons. The wild-type PTEN allele was lost from renal, uterine, breast, and thyroid tumors from a single patient. Loss of PTEN expression was an early event, reflected in loss of the wild-type allele in DNA from normal tissue adjacent to the breast and thyroid tumors. In RNA from normal tissues from three families, mutant transcripts appeared unstable. Germ-line PTEN mutations predispose to breast cancer in association with CD, although the signs of CD may be subtle.

View details for Web of Science ID 000071555900007

View details for PubMedID 9399897

View details for PubMedCentralID PMC1716102
Hotspots for unselected Ty1 transposition events on yeast chromosome III are near tRNA genes and LTR sequences CELL Ji, H., Moore, D. P., Blomberg, M. A., Braiterman, L. T., Voytas, D. F., Natsoulis, G., Boeke, J. D. 1993; 73 (5): 1007-1018

Abstract

A collection of yeast strains bearing single marked Ty1 insertions on chromosome III was generated. Over 100 such insertions were physically mapped by pulsed-field gel electrophoresis. These insertions are very nonrandomly distributed. Thirty-two such insertions were cloned by the inverted PCR technique, and the flanking DNA sequences were determined. The sequenced insertions all fell within a few very limited regions of chromosome III. Most of these regions contained tRNA coding regions and/or LTRs of preexisting transposable elements. Open reading frames were disrupted at a far lower frequency than expected for random transposition. The results suggest that the Ty1 integration machinery can detect regions of the genome that may represent "safe havens" for insertion. These regions of the genome do not contain any special DNA sequences, nor do they behave as particularly good targets for Ty1 integration in vitro, suggesting that the targeted regions have special properties allowing specific recognition in vivo.

View details for Web of Science ID A1993LF06100016

View details for PubMedID 8388781
