co-Section Chief, Information Sciences in Imaging at Stanford, Radiology (2009 - Present)
M.S., Stanford University, Health Services Research (1996)
PhD, Stanford University, Electrical Engineering (1992)
B.E., The Cooper Union, Electrical Engineering (1985)
Current Research and Scholarly Interests
My research program focuses on computational modeling of cancer biology and cancer outcomes. My laboratory develops stochastic models of the natural history of cancer based on clinical research data. We estimate population-level outcomes under differing screening and treatment interventions. We also analyze genomic and proteomic cancer data in order to identify molecular networks that are perturbed in cancer initiation and progression and relate these perturbations to patient outcomes.
- Principles of Cancer Systems Biology
CBIO 243 (Spr)
Independent Studies (15)
- Bioengineering Problems and Experimental Investigation
BIOE 191 (Aut, Win, Spr, Sum)
- Biomedical Informatics Teaching Methods
BIOMEDIN 290 (Aut, Win, Spr, Sum)
- Directed Investigation
BIOE 392 (Aut, Win, Spr, Sum)
- Directed Reading and Research
BIOMEDIN 299 (Aut, Win, Spr, Sum)
- Directed Reading in Cancer Biology
CBIO 299 (Aut, Win, Spr, Sum)
- Directed Reading in Radiology
RAD 299 (Aut, Win, Spr, Sum)
- Directed Study
BIOE 391 (Aut, Win, Spr, Sum)
- Early Clinical Experience in Radiology
RAD 280 (Aut, Win, Spr, Sum)
- Graduate Research
CBIO 399 (Aut, Win, Spr, Sum)
- Graduate Research
RAD 399 (Aut, Win, Spr, Sum)
- Medical Scholars Research
BIOMEDIN 370 (Aut, Win, Spr, Sum)
- Medical Scholars Research
RAD 370 (Aut, Win, Spr, Sum)
- Ph.D. Research
CME 400 (Spr)
- Readings in Radiology Research
RAD 101 (Aut, Win, Spr, Sum)
- Undergraduate Research
RAD 199 (Aut, Win, Spr, Sum)
Prior Year Courses
- Principles of Cancer Systems Biology
CBIO 243 (Spr)
Graduate and Fellowship Programs
Biomedical Informatics (PhD Programs)
Improvements in observed and relative survival in follicular grade 1-2 lymphoma during 4 decades: the Stanford University experience.
2013; 122 (6): 981-987
Recent studies report an improvement in overall survival (OS) of patients with follicular lymphoma (FL). Previously untreated patients with grade 1-2 FL referred from 1960-2003 and treated at Stanford were identified. Four eras were considered: era 1, pre-anthracycline (1960-1975, n=180); era 2, anthracycline (1976-1986, n=426); era 3, aggressive chemotherapy/purine analogs (1987-1996, n=471); and era 4, rituximab (1997-2003, n=257). Clinical characteristics, patterns of care and survival outcomes were assessed. Observed OS was compared with the expected OS calculated from Berkeley Mortality Database life tables derived from a population matched by gender and age at time of diagnosis. The median OS was 13.6 years. Age, gender and stage did not differ across the eras. Although primary treatment varied, event-free survival after the first treatment did not differ between eras (p=0.17). Median OS improved from approximately 11 years in eras 1 and 2 to 18.4 years in era 3 and has not yet been reached for era 4 (p<0.001) with no suggestion of a plateau in any era. These improvements in OS exceeded improvements in survival in the general population during the same time period. Several factors, including better supportive care and effective therapies for relapsed disease, are likely responsible for this improvement.
View details for DOI 10.1182/blood-2013-03-491514
View details for PubMedID 23777769
- Identification of ovarian cancer driver genes by using module network integration of multi-omics data INTERFACE FOCUS 2013; 3 (4)
Feasibility evaluation of an online tool to guide decisions for BRCA1/2 mutation carriers
2013; 12 (1): 65-73
Women with BRCA1 or BRCA2 (BRCA1/2) mutations face difficult decisions about managing their high risks of breast and ovarian cancer. We developed an online tool to guide decisions about cancer risk reduction (available at: http://brcatool.stanford.edu ), and recruited patients and clinicians to test its feasibility. We developed questionnaires for women with BRCA1/2 mutations and clinicians involved in their care, incorporating the System Usability Scale (SUS) and the Center for Healthcare Evaluation Provider Satisfaction Questionnaire (CHCE-PSQ). We enrolled BRCA1/2 mutation carriers who were seen by local physicians or participating in a national advocacy organization, and we enrolled clinicians practicing at Stanford University and in the surrounding community. Forty BRCA1/2 mutation carriers and 16 clinicians participated. Both groups found the tool easy to use, with SUS scores of 82.5-85 on a scale of 1-100; we did not observe differences according to patient age or gene mutation. General satisfaction was high, with a mean score of 4.28 (standard deviation (SD) 0.96) for patients, and 4.38 (SD 0.89) for clinicians, on a scale of 1-5. Most patients (77.5 %) were comfortable using the tool at home. Both patients and clinicians agreed that the decision tool could improve patient-doctor encounters (mean scores 4.50 and 4.69, on a 1-5 scale). Patients and health care providers rated the decision tool highly on measures of usability and clinical relevance. These results will guide a larger study of the tool's impact on clinical decisions.
View details for DOI 10.1007/s10689-012-9577-8
View details for Web of Science ID 000314408700008
View details for PubMedID 23086584
Hierarchy in somatic mutations arising during genomic evolution and progression of follicular lymphoma.
2013; 121 (9): 1604-1611
Follicular lymphoma (FL) is currently incurable using conventional chemotherapy or immunotherapy regimes, compelling new strategies. Advances in high-throughput sequencing technologies that can reveal oncogenic pathways have stimulated interest in tailoring therapies toward actionable somatic mutations. However, for mutation-directed therapies to be most effective, the mutations must be uniformly present in evolved tumor cells as well as in the self-renewing tumor-cell precursors. Here, we show striking intratumoral clonal diversity within FL tumors in the representation of mutations in the majority of genes as revealed by whole exome sequencing of subpopulations. This diversity captures a clonal hierarchy, resolved using immunoglobulin somatic mutations and IGH-BCL2 translocations as a frame of reference and by comparing diagnosis and relapse tumor pairs, allowing us to distinguish early versus late genetic events during lymphomagenesis. We provide evidence that IGH-BCL2 translocations and CREBBP mutations are early events, whereas MLL2 and TNFRSF14 mutations probably represent late events during disease evolution. These observations provide insight into which of the genetic lesions represent suitable candidates for targeted therapies.
View details for DOI 10.1182/blood-2012-09-457283
View details for PubMedID 23297126
Identifying master regulators of cancer and their downstream targets by integrating genomic and epigenomic features.
Pacific Symposium on Biocomputing
Vast amounts of molecular data characterizing the genome, epigenome and transcriptome are becoming available for a variety of cancers. The current challenge is to integrate these diverse layers of molecular biology information to create a more comprehensive view of key biological processes underlying cancer. We developed a biocomputational algorithm that integrates copy number, DNA methylation, and gene expression data to study master regulators of cancer and identify their targets. Our algorithm starts by generating a list of candidate driver genes based on the rationale that genes that are driven by multiple genomic events in a subset of samples are unlikely to be randomly deregulated. We then select the master regulators from the candidate drivers and identify their targets by inferring the underlying regulatory network of gene expression. We applied our biocomputational algorithm to identify master regulators and their targets in glioblastoma multiforme (GBM) and serous ovarian cancer. Our results suggest that the expression of candidate drivers is more likely to be influenced by copy number variations than DNA methylation. Next, we selected the master regulators and identified their downstream targets using module networks analysis. As a proof-of-concept, we show that the GBM and ovarian cancer module networks recapitulate known processes in these cancers. In addition, we identify master regulators that have not been previously reported and suggest their likely role. In summary, focusing on genes whose expression can be explained by their genomic and epigenomic aberrations is a promising strategy to identify master regulators of cancer.
View details for PubMedID 23424118
TreeVis: A MATLAB-based tool for tree visualization
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE
2013; 109 (1): 74-76
Network-based analyses of high-dimensional biological data often produce results in the form of tree structures. Generating easily interpretable layouts to visualize these tree structures is a non-trivial task. We present a new visualization algorithm to generate two-dimensional layouts for complex tree structures. Implementations in both MATLAB and R are provided.
View details for DOI 10.1016/j.cmpb.2012.08.008
View details for Web of Science ID 000312473300007
View details for PubMedID 23036855
Cross-Species Functional Analysis of Cancer-Associated Fibroblasts Identifies a Critical Role for CLCF1 and IL-6 in Non-Small Cell Lung Cancer In Vivo
2012; 72 (22): 5744-5756
Cancer-associated fibroblasts (CAF) have been reported to support tumor progression by a variety of mechanisms. However, their role in the progression of non-small cell lung cancer (NSCLC) remains poorly defined. In addition, the extent to which specific proteins secreted by CAFs contribute directly to tumor growth is unclear. To study the role of CAFs in NSCLCs, a cross-species functional characterization of mouse and human lung CAFs was conducted. CAFs supported the growth of lung cancer cells in vivo by secretion of soluble factors that directly stimulate the growth of tumor cells. Gene expression analysis comparing normal mouse lung fibroblasts and mouse lung CAFs identified multiple genes that correlate with the CAF phenotype. A gene signature of secreted genes upregulated in CAFs was an independent marker of poor survival in patients with NSCLC. This secreted gene signature was upregulated in normal lung fibroblasts after long-term exposure to tumor cells, showing that lung fibroblasts are "educated" by tumor cells to acquire a CAF-like phenotype. Functional studies identified important roles for CLCF1-CNTFR and interleukin (IL)-6-IL-6R signaling in promoting growth of NSCLCs. This study identifies novel soluble factors contributing to the CAF protumorigenic phenotype in NSCLCs and suggests new avenues for the development of therapeutic strategies.
View details for DOI 10.1158/0008-5472.CAN-12-1097
View details for Web of Science ID 000311141300012
View details for PubMedID 22962265
CytoSPADE: high-performance analysis and visualization of high-dimensional cytometry data
2012; 28 (18): 2400-2401
MOTIVATION: Recent advances in flow cytometry enable simultaneous single-cell measurement of 30+ surface and intracellular proteins. CytoSPADE is a high-performance implementation of an interface for the Spanning-tree Progression Analysis of Density-normalized Events algorithm for tree-based analysis and visualization of this high-dimensional cytometry data. AVAILABILITY: Source code and binaries are freely available at http://cytospade.org and via Bioconductor version 2.10 onwards for Linux, OSX and Windows. CytoSPADE is implemented in R, C++ and Java. CONTACT: firstname.lastname@example.org SUPPLEMENTARY INFORMATION: Additional documentation available at http://cytospade.org.
View details for DOI 10.1093/bioinformatics/bts425
View details for Web of Science ID 000308532300067
View details for PubMedID 22782546
Prognostic PET F-18-FDG Uptake Imaging Features Are Associated with Major Oncogenomic Alterations in Patients with Resected Non-Small Cell Lung Cancer
2012; 72 (15): 3725-3734
Although 2[18F]fluoro-2-deoxy-d-glucose (FDG) uptake during positron emission tomography (PET) predicts post-surgical outcome in patients with non-small cell lung cancer (NSCLC), the biologic basis for this observation is not fully understood. Here, we analyzed 25 tumors from patients with NSCLCs to identify tumor PET-FDG uptake features associated with gene expression signatures and survival. Fourteen quantitative PET imaging features describing FDG uptake were correlated with gene expression for single genes and coexpressed gene clusters (metagenes). For each FDG uptake feature, an associated metagene signature was derived, and a prognostic model was identified in an external cohort and then tested in a validation cohort of patients with NSCLC. Four of eight single genes associated with FDG uptake (LY6E, RNF149, MCM6, and FAP) were also associated with survival. The most prognostic metagene signature was associated with a multivariate FDG uptake feature [maximum standard uptake value (SUV(max)), SUV(variance), and SUV(PCA2)], each highly associated with survival in the external [HR, 5.87; confidence interval (CI), 2.49-13.8] and validation (HR, 6.12; CI, 1.08-34.8) cohorts, respectively. Cell-cycle, proliferation, death, and self-recognition pathways were altered in this radiogenomic profile. Together, our findings suggest that leveraging tumor genomics with an expanded collection of PET-FDG imaging features may enhance our understanding of FDG uptake as an imaging biomarker beyond its association with glycolysis.
View details for DOI 10.1158/0008-5472.CAN-11-3943
View details for Web of Science ID 000307354100004
View details for PubMedID 22710433
Non-Small Cell Lung Cancer: Identifying Prognostic Imaging Biomarkers by Leveraging Public Gene Expression Microarray Data-Methods and Preliminary Results
2012; 264 (2): 387-396
To identify prognostic imaging biomarkers in non-small cell lung cancer (NSCLC) by means of a radiogenomics strategy that integrates gene expression and medical images in patients for whom survival outcomes are not available by leveraging survival data in public gene expression data sets. A radiogenomics strategy for associating image features with clusters of coexpressed genes (metagenes) was defined. First, a radiogenomics correlation map is created for a pairwise association between image features and metagenes. Next, predictive models of metagenes are built in terms of image features by using sparse linear regression. Similarly, predictive models of image features are built in terms of metagenes. Finally, the prognostic significance of the predicted image features is evaluated in a public gene expression data set with survival outcomes. This radiogenomics strategy was applied to a cohort of 26 patients with NSCLC for whom gene expression and 180 image features from computed tomography (CT) and positron emission tomography (PET)/CT were available. There were 243 statistically significant pairwise correlations between image features and metagenes of NSCLC. Metagenes were predicted in terms of image features with an accuracy of 59%-83%. One hundred fourteen of 180 CT image features and the PET standardized uptake value were predicted in terms of metagenes with an accuracy of 65%-86%. When the predicted image features were mapped to a public gene expression data set with survival outcomes, tumor size, edge shape, and sharpness ranked highest for prognostic significance. This radiogenomics strategy for identifying imaging biomarkers may enable a more rapid evaluation of novel imaging modalities, thereby accelerating their translation to personalized medicine.
View details for DOI 10.1148/radiol.12111607
View details for Web of Science ID 000306660000010
View details for PubMedID 22723499
A Simulation Model to Predict the Impact of Prophylactic Surgery and Screening on the Life Expectancy of BRCA1 and BRCA2 Mutation Carriers
CANCER EPIDEMIOLOGY BIOMARKERS & PREVENTION
2012; 21 (7): 1066-1077
Women with inherited mutations in the BRCA1 or BRCA2 (BRCA1/2) genes are recommended to undergo a number of intensive cancer risk-reducing strategies, including prophylactic mastectomy, prophylactic oophorectomy, and screening. We estimate the impact of different risk-reducing options at various ages on life expectancy. We apply our previously developed Monte Carlo simulation model of screening and prophylactic surgery in BRCA1/2 mutation carriers. Here, we present the mathematical formulation to compute age-specific breast cancer incidence in the absence of prophylactic oophorectomy, which is an input to the simulation model, and provide sensitivity analysis on related model parameters. The greatest gains in life expectancy result from conducting prophylactic mastectomy and prophylactic oophorectomy immediately after BRCA1/2 mutation testing; these gains vary with age at testing, from 6.8 to 10.3 years for BRCA1 and 3.4 to 4.4 years for BRCA2 mutation carriers. Life expectancy gains from delaying prophylactic surgery by 5 to 10 years range from 1 to 9.9 years for BRCA1 and 0.5 to 4.2 years for BRCA2 mutation carriers. Adding annual breast screening provides gains of 2.0 to 9.9 years for BRCA1 and 1.5 to 4.3 years for BRCA2. Results were most sensitive to variations in our assumptions about the magnitude and duration of breast cancer risk reduction due to prophylactic oophorectomy. Life expectancy gains depend on the type of BRCA mutation and age at interventions. Sensitivity analysis identifies the degree of breast cancer risk reduction due to prophylactic oophorectomy as a key determinant of life expectancy gain. Further study of the impact of prophylactic oophorectomy on breast cancer risk in BRCA1/2 mutation carriers is warranted.
View details for DOI 10.1158/1055-9965.EPI-12-0149
View details for Web of Science ID 000306210100009
View details for PubMedID 22556274
Quantitative Proteomic Profiling Identifies Protein Correlates to EGFR Kinase Inhibition
MOLECULAR CANCER THERAPEUTICS
2012; 11 (5): 1071-1081
Clinical oncology is hampered by lack of tools to accurately assess a patient's response to pathway-targeted therapies. Serum and tumor cell surface proteins whose abundance, or change in abundance in response to therapy, differentiates patients responding to a therapy from patients not responding to a therapy could be usefully incorporated into tools for monitoring response. Here, we posit and then verify that proteomic discovery in in vitro tissue culture models can identify proteins with concordant in vivo behavior and further, can be a valuable approach for identifying tumor-derived serum proteins. In this study, we use stable isotope labeling of amino acids in culture (SILAC) with proteomic technologies to quantitatively analyze the gefitinib-related protein changes in a model system for sensitivity to EGF receptor (EGFR)-targeted tyrosine kinase inhibitors. We identified 3,707 intracellular proteins, 1,276 cell surface proteins, and 879 shed proteins. More than 75% of the proteins identified had quantitative information, and a subset consisting of 400 proteins showed a statistically significant change in abundance following gefitinib treatment. We validated the change in expression profile in vitro and screened our panel of response markers in an in vivo isogenic resistant model and showed that these were markers of gefitinib response and not simply markers of phospho-EGFR downregulation. In doing so, we also were able to identify which proteins might be useful as markers for monitoring response and which proteins might be useful as markers for a priori prediction of response.
View details for DOI 10.1158/1535-7163.MCT-11-0852
View details for Web of Science ID 000307984800003
View details for PubMedID 22411897
Online Tool to Guide Decisions for BRCA1/2 Mutation Carriers
JOURNAL OF CLINICAL ONCOLOGY
2012; 30 (5): 497-506
Women with BRCA1 or BRCA2 (BRCA1/2) mutations must choose between prophylactic surgeries and screening to manage their high risks of breast and ovarian cancer, comparing options in terms of cancer incidence, survival, and quality of life. A clinical decision tool could guide these complex choices. We built a Monte Carlo model for BRCA1/2 mutation carriers, simulating breast screening with annual mammography plus magnetic resonance imaging (MRI) from ages 25 to 69 years and prophylactic mastectomy (PM) and/or prophylactic oophorectomy (PO) at various ages. Modeled outcomes were cancer incidence, tumor features that shape treatment recommendations, overall survival, and cause-specific mortality. We adapted the model into an online tool to support shared decision making. We compared strategies on cancer incidence and survival to age 70 years; for example, PO plus PM at age 25 years optimizes both outcomes (incidence, 4% to 11%; survival, 80% to 83%), whereas PO at age 40 years plus MRI screening offers less effective prevention, yet similar survival (incidence, 36% to 57%; survival, 74% to 80%). To characterize patients' treatment and survivorship experiences, we reported the tumor features and treatments associated with risk-reducing interventions; for example, in most BRCA2 mutation carriers (81%), MRI screening diagnoses stage I, hormone receptor-positive breast cancers, which may not require chemotherapy. Cancer risk-reducing options for BRCA1/2 mutation carriers vary in their impact on cancer incidence, recommended treatments, quality of life, and survival. To guide decisions informed by multiple health outcomes, we provide an online tool for joint use by patients with their physicians (http://brcatool.stanford.edu).
View details for DOI 10.1200/JCO.2011.38.6060
View details for Web of Science ID 000302622900014
View details for PubMedID 22231042
Comparing the benefits of screening for breast cancer and lung cancer using a novel natural history model
CANCER CAUSES & CONTROL
2012; 23 (1): 175-185
To estimate the impact of early detection of cancer, knowledge of how quickly primary tumors grow and at what size they shed lethal metastases is critical. We developed a natural history model of cancer to estimate the probability of disease-specific cure as a function of tumor size, the tumor volume doubling time (TVDT), and disease-specific mortality reduction achievable by screening. The model was applied to non-small-cell lung carcinoma (NSCLC) and invasive ductal carcinoma (IDC), separately. Model parameter estimates were based on Surveillance Epidemiology and End Results (SEER) cancer registry datasets and validated on screening trials. Compared to IDC, NSCLC is estimated to have a lower probability of disease-specific cure at the same detected tumor size, shed lethal metastases at smaller sizes (median: 19 mm for IDC versus 8 mm for NSCLC), and have a TVDT that is almost half as long (median: 252 days for IDC versus 134 days for NSCLC). Consequently, NSCLC is associated with a lower mortality reduction from screening at the same screen detection threshold and screening interval. In summary, using a similar natural history model of cancer, we quantify the disease-specific curability attributable to screening for breast cancer, and separately lung cancer, in terms of the TVDT and onset of lethal metastases.
View details for DOI 10.1007/s10552-011-9866-9
View details for Web of Science ID 000297757400017
View details for PubMedID 22116537
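As a rough illustration of what a tumor volume doubling time means in the abstract above, the sketch below assumes a spherical tumor whose volume grows exponentially at a constant rate. This is a simplification made here for illustration only, not the natural history model described in the paper; the function names are hypothetical, and the only values taken from the abstract are the median TVDTs (252 days for IDC, 134 days for NSCLC).

```python
import math


def volume_after(v0, days, tvdt_days):
    """Volume after `days` of exponential growth with a fixed doubling time."""
    return v0 * 2 ** (days / tvdt_days)


def days_to_grow(d_start_mm, d_end_mm, tvdt_days):
    """Days for a spherical tumor to grow from one diameter to another.

    Assumes volume is proportional to diameter cubed, so the number of
    volume doublings is 3 * log2(d_end / d_start).
    """
    doublings = 3 * math.log2(d_end_mm / d_start_mm)
    return doublings * tvdt_days


if __name__ == "__main__":
    # Median TVDTs quoted in the abstract; the 10 mm -> 20 mm range is illustrative.
    for label, tvdt in (("IDC", 252), ("NSCLC", 134)):
        days = days_to_grow(10, 20, tvdt)
        print(f"{label}: 10 mm -> 20 mm takes about {days:.0f} days")
```

Under these assumptions, doubling the diameter takes three volume doublings, so a shorter TVDT shortens the time available between screens in direct proportion.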
Reconstructing Directed Signed Gene Regulatory Network From Microarray Data
IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING
2011; 58 (12): 3518-3521
Great efforts have been made to develop both algorithms that reconstruct gene regulatory networks and systems that simulate gene networks and expression data, for the purpose of benchmarking network reconstruction algorithms. An interesting observation is that although many simulation systems chose to use Hill kinetics to generate data, none of the reconstruction algorithms were developed based on the Hill kinetics. One possible explanation is that, in Hill kinetics, activation and inhibition interactions take different mathematical forms, which brings additional combinatorial complexity into the reconstruction problem. We propose a new model that behaves qualitatively similarly to the Hill kinetics, but has the same mathematical form for both activation and inhibition. We developed an algorithm to reconstruct gene networks based on this new model. Simulation results suggested a novel biological hypothesis that in gene knockout experiments, repressing protein synthesis to a certain extent may lead to better expression data and higher network reconstruction accuracy.
View details for DOI 10.1109/TBME.2011.2163188
View details for Web of Science ID 000297341500021
View details for PubMedID 21803675
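The remark in the abstract above that activation and inhibition take different mathematical forms under Hill kinetics can be seen from the textbook Hill terms. The sketch below shows only those standard forms; it does not reproduce the unified model proposed in the paper, and the function names and parameter values (`k`, `n`) are illustrative assumptions.

```python
def hill_activation(x, k, n):
    """Standard Hill activation term: rises from 0 toward 1 as regulator level x grows."""
    return x**n / (k**n + x**n)


def hill_repression(x, k, n):
    """Standard Hill repression term: falls from 1 toward 0 as regulator level x grows."""
    return k**n / (k**n + x**n)


if __name__ == "__main__":
    k, n = 1.0, 2.0  # illustrative half-saturation constant and Hill coefficient
    for x in (0.0, 0.5, 1.0, 2.0, 10.0):
        act = hill_activation(x, k, n)
        rep = hill_repression(x, k, n)
        print(f"x={x:5.1f}  activation={act:.3f}  repression={rep:.3f}")
```

Because the numerator differs between the two terms, a reconstruction method built directly on them must decide the sign of each candidate edge before fitting, which is the combinatorial burden the abstract refers to.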
Lymphomas that recur after MYC suppression continue to exhibit oncogene addiction
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2011; 108 (42): 17432-17437
The suppression of oncogenic levels of MYC is sufficient to induce sustained tumor regression associated with proliferative arrest, differentiation, cellular senescence, and/or apoptosis, a phenomenon known as oncogene addiction. However, after prolonged inactivation of MYC in a conditional transgenic mouse model of Eμ-tTA/tetO-MYC T-cell acute lymphoblastic leukemia, some of the tumors recur, recapitulating what is frequently observed in human tumors in response to targeted therapies. Here we report that these recurring lymphomas express either transgenic or endogenous Myc, albeit in many cases at levels below those in the original tumor, suggesting that tumors continue to be addicted to MYC. Many of the recurring lymphomas (76%) harbored mutations in the tetracycline transactivator, resulting in expression of the MYC transgene even in the presence of doxycycline. Some of the remaining recurring tumors expressed high levels of endogenous Myc, which was associated with a genomic rearrangement of the endogenous Myc locus or activation of Notch1. By gene expression profiling, we confirmed that the primary and recurring tumors have highly similar transcriptomes. Importantly, shRNA-mediated suppression of the high levels of MYC in recurring tumors elicited both suppression of proliferation and increased apoptosis, confirming that these tumors remain oncogene addicted. These results suggest that tumors induced by MYC remain addicted to overexpression of this oncogene.
View details for DOI 10.1073/pnas.1107303108
View details for Web of Science ID 000295975300044
View details for PubMedID 21969595
Modeling the impact of population screening on breast cancer mortality in the United States
2011; 20: S75-S81
Optimal US screening strategies remain controversial. We use six simulation models to evaluate screening outcomes under varying strategies. The models incorporate common data on incidence, mammography characteristics, and treatment effects. We evaluate varying initiation and cessation ages applied annually or biennially and calculate mammograms, mortality reduction (vs. no screening), false-positives, unnecessary biopsies and over-diagnosis. The lifetime risk of breast cancer death starting at age 40 is 3% and is reduced by screening. Screening biennially maintains 81% (range 67% to 99%) of annual screening benefits with fewer false-positives. Biennial screening from 50-74 reduces the probability of breast cancer death from 3% to 2.3%. Screening annually from 40 to 84 only lowers mortality an additional one-half of one percent to 1.8% but requires substantially more mammograms and yields more false-positives and over-diagnosed cases. Decisions about screening strategy depend on preferences for benefits vs. potential harms and resource considerations.
View details for Web of Science ID 000311077400013
View details for PubMedID 22015298
Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE
2011; 29 (10): 886-U181
The ability to analyze multiple single-cell parameters is critical for understanding cellular heterogeneity. Despite recent advances in measurement technology, methods for analyzing high-dimensional single-cell data are often subjective, labor intensive and require prior knowledge of the biological system. To objectively uncover cellular heterogeneity from single-cell measurements, we present a versatile computational approach, spanning-tree progression analysis of density-normalized events (SPADE). We applied SPADE to flow cytometry data of mouse bone marrow and to mass cytometry data of human bone marrow. In both cases, SPADE organized cells in a hierarchy of related phenotypes that partially recapitulated well-described patterns of hematopoiesis. We demonstrate that SPADE is robust to measurement noise and to the choice of cellular markers. SPADE facilitates the analysis of cellular heterogeneity, the identification of cell types and comparison of functional markers in response to perturbations.
View details for DOI 10.1038/nbt.1991
View details for Web of Science ID 000296273000015
View details for PubMedID 21964415
Prediction of survival in diffuse large B-cell lymphoma based on the expression of 2 genes reflecting tumor and microenvironment
2011; 118 (5): 1350-1358
Several gene-expression signatures predict survival in diffuse large B-cell lymphoma (DLBCL), but the lack of practical methods for genome-scale analysis has limited translation to clinical practice. We built and validated a simple model using one gene expressed by tumor cells and another expressed by host immune cells, assessing added prognostic value to the clinical International Prognostic Index (IPI). LIM domain only 2 (LMO2) was validated as an independent predictor of survival and the "germinal center B cell-like" subtype. Expression of tumor necrosis factor receptor superfamily member 9 (TNFRSF9) from the DLBCL microenvironment was the best gene in bivariate combination with LMO2. Study of TNFRSF9 tissue expression in 95 patients with DLBCL showed expression limited to infiltrating T cells. A model integrating these 2 genes was independent of "cell-of-origin" classification, "stromal signatures," IPI, and added to the predictive power of the IPI. A composite score integrating these genes with IPI performed well in 3 independent cohorts of 545 DLBCL patients, as well as in a simple assay of routine formalin-fixed specimens from a new validation cohort of 147 patients with DLBCL. We conclude that the measurement of a single gene expressed by tumor cells (LMO2) and a single gene expressed by the immune microenvironment (TNFRSF9) powerfully predicts overall survival in patients with DLBCL.
View details for DOI 10.1182/blood-2011-03-345272
View details for Web of Science ID 000293510000028
View details for PubMedID 21670469
Single-Cell Mass Cytometry of Differential Immune and Drug Responses Across a Human Hematopoietic Continuum
2011; 332 (6030): 687-696
Flow cytometry is an essential tool for dissecting the functional complexity of hematopoiesis. We used single-cell "mass cytometry" to examine healthy human bone marrow, measuring 34 parameters simultaneously in single cells (binding of 31 antibodies, viability, DNA content, and relative cell size). The signaling behavior of cell subsets spanning a defined hematopoietic hierarchy was monitored with 18 simultaneous markers of functional signaling states perturbed by a set of ex vivo stimuli and inhibitors. The data set allowed for an algorithmically driven assembly of related cell types defined by surface antigen expression, providing a superimposable map of cell signaling responses in combination with drug inhibition. Visualized in this manner, the analysis revealed previously unappreciated instances of both precise signaling responses that were bounded within conventionally defined cell subsets and more continuous phosphorylation responses that crossed cell population boundaries in unexpected manners yet tracked closely with cellular phenotype. Collectively, such single-cell analyses provide system-wide views of immune signaling in healthy human hematopoiesis, against which drug action and disease can be compared for mechanistic studies and pharmacologic intervention.
View details for DOI 10.1126/science.1198704
View details for Web of Science ID 000290265800035
View details for PubMedID 21551058
Discovering Biological Progression Underlying Microarray Samples
PLOS COMPUTATIONAL BIOLOGY
2011; 7 (4)
In biological systems that undergo processes such as differentiation, a clear concept of progression exists. We present a novel computational approach, called Sample Progression Discovery (SPD), to discover patterns of biological progression underlying microarray gene expression data. SPD assumes that individual samples of a microarray dataset are related by an unknown biological process (e.g., differentiation, development, cell cycle, disease progression), and that each sample represents one unknown point along the progression of that process. SPD aims to organize the samples in a manner that reveals the underlying progression and to simultaneously identify subsets of genes that are responsible for that progression. We demonstrate the performance of SPD on a variety of microarray datasets that were generated by sampling a biological process at different points along its progression, without providing SPD any information about the underlying process. When applied to a cell cycle time series microarray dataset, SPD was not provided any prior knowledge of the samples' time order or of which genes are cell-cycle regulated, yet SPD recovered the correct time order and identified many genes that have been associated with the cell cycle. When applied to B-cell differentiation data, SPD recovered the correct order of stages of normal B-cell differentiation and the linkage between preB-ALL tumor cells and their cell of origin, preB. When applied to mouse embryonic stem cell differentiation data, SPD uncovered a landscape of ESC differentiation into various lineages and genes that represent both generic and lineage-specific processes. When applied to a prostate cancer microarray dataset, SPD identified gene modules that reflect a progression consistent with disease stages.
SPD may be best viewed as a novel tool for synthesizing biological hypotheses because it provides a likely biological progression underlying a microarray dataset and, perhaps more importantly, the candidate genes that regulate that progression.
View details for DOI 10.1371/journal.pcbi.1001123
View details for Web of Science ID 000289973600007
View details for PubMedID 21533210
Bayesian gene set analysis for identifying significant biological pathways
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS
2011; 60: 541-557
Association of a Leukemic Stem Cell Gene Expression Signature With Clinical Outcomes in Acute Myeloid Leukemia
JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION
2010; 304 (24): 2706-2715
In many cancers, specific subpopulations of cells appear to be uniquely capable of initiating and maintaining tumors. The strongest support for this cancer stem cell model comes from transplantation assays in immunodeficient mice, which indicate that human acute myeloid leukemia (AML) is driven by self-renewing leukemic stem cells (LSCs). This model has significant implications for the development of novel therapies, but its clinical relevance has yet to be determined. To identify an LSC gene expression signature and test its association with clinical outcomes in AML. Retrospective study of global gene expression (microarray) profiles of LSC-enriched subpopulations from primary AML and normal patient samples, which were obtained at a US medical center between April 2005 and July 2007, and validation data sets of global transcriptional profiles of AML tumors from 4 independent cohorts (n = 1047). Identification of genes discriminating LSC-enriched populations from other subpopulations in AML tumors; and association of LSC-specific genes with overall, event-free, and relapse-free survival and with therapeutic response. Expression levels of 52 genes distinguished LSC-enriched populations from other subpopulations in cell-sorted AML samples. An LSC score summarizing expression of these genes in bulk primary AML tumor samples was associated with clinical outcomes in the 4 independent patient cohorts. High LSC scores were associated with worse overall, event-free, and relapse-free survival among patients with either normal karyotypes or chromosomal abnormalities. For the largest cohort of patients with normal karyotypes (n = 163), the LSC score was significantly associated with overall survival as a continuous variable (hazard ratio [HR], 1.15; 95% confidence interval [CI], 1.08-1.22; log-likelihood P < .001).
The absolute risk of death by 3 years was 57% (95% CI, 43%-67%) for the low LSC score group compared with 78% (95% CI, 66%-86%) for the high LSC score group (HR, 1.9 [95% CI, 1.3-2.7]; log-rank P = .002). In another cohort with available data on event-free survival for 70 patients with normal karyotypes, the risk of an event by 3 years was 48% (95% CI, 27%-63%) in the low LSC score group vs 81% (95% CI, 60%-91%) in the high LSC score group (HR, 2.4 [95% CI, 1.3-4.5]; log-rank P = .006). In multivariate Cox regression including age, mutations in FLT3 and NPM1, and cytogenetic abnormalities, the HRs for LSC score in the 3 cohorts with data on all variables were 1.07 (95% CI, 1.01-1.13; P = .02), 1.10 (95% CI, 1.03-1.17; P = .005), and 1.17 (95% CI, 1.05-1.30; P = .005). High expression of an LSC gene signature is independently associated with adverse outcomes in patients with AML.
View details for Web of Science ID 000285518000015
View details for PubMedID 21177505
A Simulation Model Investigating the Impact of Tumor Volume Doubling Time and Mammographic Tumor Detectability on Screening Outcomes in Women Aged 40-49 Years
JOURNAL OF THE NATIONAL CANCER INSTITUTE
2010; 102 (16): 1263-1271
Compared with women aged 50-69 years, the lower sensitivity of mammographic screening in women aged 40-49 years is largely attributed to the lower mammographic tumor detectability and faster tumor growth in the younger women. We used a Monte Carlo simulation model of breast cancer screening by age to estimate the median tumor size detectable on a mammogram and the mean tumor volume doubling time. The estimates were calculated by calibrating the predicted breast cancer incidence rates to the actual rates from the Surveillance, Epidemiology, and End Results (SEER) database and the predicted distributions of screen-detected tumor sizes to the actual distributions obtained from the Breast Cancer Surveillance Consortium (BCSC). The calibrated parameters were used to estimate the relative impact of lower mammographic tumor detectability vs faster tumor volume doubling time on the poorer screening outcomes in younger women compared with older women. Mammography screening outcomes included sensitivity, mean tumor size at detection, lifetime gained, and breast cancer mortality. In addition, the relationship between screening sensitivity and breast cancer mortality was investigated as a function of tumor volume doubling time, mammographic tumor detectability, and screening interval. Lowered mammographic tumor detectability accounted for 79% and faster tumor volume doubling time accounted for 21% of the poorer sensitivity of mammography screening in younger women compared with older women. The relative contributions were similar when the impact of screening was evaluated in terms of mean tumor size at detection, lifetime gained, and breast cancer mortality. Screening sensitivity and breast cancer mortality reduction attributable to screening were almost linearly related when comparing annual or biennial screening with no screening.
However, when comparing annual with biennial screening, the greatest reduction in breast cancer mortality attributable to screening did not correspond to the greatest gain in screening sensitivity and was more strongly affected by the mammographic tumor detectability than tumor volume doubling time. The age-specific differences in mammographic tumor detection contribute more than age-specific differences in tumor growth rates to the lowered performance of mammography screening in younger women.
View details for DOI 10.1093/jnci/djq271
View details for Web of Science ID 000281182500010
View details for PubMedID 20664027
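To make the notion of tumor volume doubling time in the abstract above concrete, here is a minimal sketch. It assumes a spherical tumor and purely exponential volume growth; neither assumption, nor the function names, is taken from the paper, and the numbers in the usage note are hypothetical.

```python
import math


def volume_from_diameter(diameter_cm):
    """Volume of a sphere with the given diameter (assumed tumor shape)."""
    return math.pi / 6.0 * diameter_cm ** 3


def time_to_reach(d_start_cm, d_end_cm, doubling_time_days):
    """Days for an exponentially growing tumor to grow between two diameters.

    Under exponential growth, V(t) = V0 * 2 ** (t / doubling_time), so the
    elapsed time is doubling_time * log2(V_end / V_start).
    """
    ratio = volume_from_diameter(d_end_cm) / volume_from_diameter(d_start_cm)
    return doubling_time_days * math.log2(ratio)
```

For example, doubling the diameter multiplies the volume by eight, so with a hypothetical doubling time of 100 days, `time_to_reach(1.0, 2.0, 100.0)` is about 300 days. A shorter doubling time shrinks the window in which a tumor is detectable by screening but not yet symptomatic.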
Incidental Extracardiac Findings at Coronary CT: Clinical and Economic Impact
AMERICAN JOURNAL OF ROENTGENOLOGY
2010; 194 (6): 1531-1538
The purpose of this study was to evaluate the prevalence of incidental extracardiac findings on coronary CT, to determine the associated downstream resource utilization, and to estimate additional costs per patient related to the associated diagnostic workup. This retrospective study examined incidental extracardiac findings in 151 consecutive adults (69.5% men and 30.5% women; mean age, 54 years) undergoing coronary CT during a 7-year period. Incidental findings were recorded, and medical records were reviewed for downstream diagnostic examinations for a follow-up period of 1 year (minimum) to 7 years (maximum). Costs of further workup were estimated using 2009 Medicare average reimbursement figures. There were 102 incidental extracardiac findings in 43% (65/151) of patients. Fifty-two percent (53/102) of findings were potentially clinically significant, and 81% (43/53) of these findings were newly discovered. The radiology reports made specific follow-up recommendations for 36% (19/53) of new significant findings. Only 4% (6/151) of patients actually underwent follow-up imaging or intervention for incidental findings. One patient was found to have a malignancy that was subsequently treated. The average direct costs of additional diagnostic workup were $17.42 per patient screened (95% CI, $2.84-$32.00) and $438.39 per patient with imaging follow-up (95% CI, $301.47-$575.31). Coronary CT frequently reveals potentially significant incidental extracardiac abnormalities, yet radiologists recommend further evaluation in only one-third of cases. An even smaller fraction of cases receive further workup. The failure to follow up abnormal incidental findings may result in missed opportunities to detect early disease, but also limits the short-term attributable costs.
View details for DOI 10.2214/AJR.09.3587
View details for Web of Science ID 000277948400016
View details for PubMedID 20489093
MiDReG: A method of mining developmentally regulated genes using Boolean implications
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2010; 107 (13): 5732-5737
We present a method termed mining developmentally regulated genes (MiDReG) to predict genes whose expression is either activated or repressed as precursor cells differentiate. MiDReG does not require gene expression data from intermediate stages of development. MiDReG is based on the gene expression patterns between the initial and terminal stages of the differentiation pathway, coupled with "if-then" rules (Boolean implications) mined from large-scale microarray databases. MiDReG uses two gene expression-based seed conditions that mark the initial and the terminal stages of a given differentiation pathway and combines the statistically inferred Boolean implications from these seed conditions to identify the relevant genes. The method was validated by applying it to B-cell development. The algorithm predicted 62 genes that are expressed after the KIT+ progenitor cell stage and remain expressed through CD19+ and AICDA+ germinal center B cells. qRT-PCR of 14 of these genes on sorted B-cell progenitors confirmed that the expression of 10 genes is indeed stably established during B-cell differentiation. Review of the published literature of knockout mice revealed that of the predicted genes, 63.4% have defects in B-cell differentiation and function and 22% have a role in the B cell according to other experiments, and the remaining 14.6% are not characterized. Therefore, our method identified novel gene candidates for future examination of their role in B-cell development. These data demonstrate the power of MiDReG in predicting functionally important intermediate genes in a given developmental pathway that is defined by a mutually exclusive gene expression pattern.
View details for DOI 10.1073/pnas.0913635107
View details for Web of Science ID 000276159500010
View details for PubMedID 20231483
Reducing the Computational Complexity of Information Theoretic Approaches for Reconstructing Gene Regulatory Networks
JOURNAL OF COMPUTATIONAL BIOLOGY
2010; 17 (2): 169-176
Information theoretic approaches are increasingly being used for reconstructing regulatory networks from microarray data. These approaches start by computing the pairwise mutual information (MI) between all gene pairs. The resulting MI matrix is then manipulated to identify regulatory relationships. A barrier to these approaches is the time-consuming step of computing the MI matrix. We present a method to reduce this computation time. We apply spectral analysis to re-order the genes, so that genes that share regulatory relationships are more likely to be placed close to each other. Then, using a "sliding window" approach with appropriate window size and step size, we compute the MI for the genes within the sliding window, and the remainder is assumed to be zero. Using both simulated data and microarray data, we demonstrate that our method does not incur performance loss in regions of high-precision and low-recall, while the computational time is significantly lowered. The proposed method can be used with any method that relies on the mutual information to reconstruct networks.
View details for DOI 10.1089/cmb.2009.0052
View details for Web of Science ID 000279271200005
View details for PubMedID 20078227
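The sliding-window step described in the abstract above can be sketched as follows. This is an illustration only: the gene ordering is taken as an input (the spectral re-ordering is not implemented here), mutual information is estimated with a simple equal-width histogram rather than whatever estimator the authors used, and the names `discretize`, `mutual_information`, and `windowed_mi` are hypothetical.

```python
import math
from collections import Counter


def discretize(values, bins=3):
    """Map values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=3):
    """Histogram estimate of mutual information (in nats) between two profiles."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(dx)
    px, py, pxy = Counter(dx), Counter(dy), Counter(zip(dx, dy))
    return sum((c / n) * math.log((c * n) / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def windowed_mi(expr, order, window, step, bins=3):
    """Compute MI only for gene pairs that share a sliding window.

    expr maps gene name -> list of expression values; order is the list of
    gene names after re-ordering. Pairs absent from the result are treated
    as having zero MI.
    """
    last_start = max(len(order) - window, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail of the ordering is covered
    mi = {}
    for s in starts:
        block = order[s:s + window]
        for i, g in enumerate(block):
            for h in block[i + 1:]:
                if (g, h) not in mi:
                    mi[(g, h)] = mutual_information(expr[g], expr[h], bins)
    return mi
```

With `n` genes and window size `w`, the number of MI evaluations grows roughly as `n * w` instead of `n * n`, which is where the time saving comes from; the price is that any true relationship between genes placed far apart in the ordering is missed.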
Survival Analysis of Cancer Risk Reduction Strategies for BRCA1/2 Mutation Carriers
JOURNAL OF CLINICAL ONCOLOGY
2010; 28 (2): 222-231
Women with BRCA1/2 mutations inherit high risks of breast and ovarian cancer; options to reduce cancer mortality include prophylactic surgery or breast screening, but their efficacy has never been empirically compared. We used decision analysis to simulate risk-reducing strategies in BRCA1/2 mutation carriers and to compare resulting survival probability and causes of death. We developed a Monte Carlo model of breast screening with annual mammography plus magnetic resonance imaging (MRI) from ages 25 to 69 years, prophylactic mastectomy (PM) at various ages, and/or prophylactic oophorectomy (PO) at ages 40 or 50 years in 25-year-old BRCA1/2 mutation carriers. With no intervention, survival probability by age 70 is 53% for BRCA1 and 71% for BRCA2 mutation carriers. The most effective single intervention for BRCA1 mutation carriers is PO at age 40, yielding a 15% absolute survival gain; for BRCA2 mutation carriers, the most effective single intervention is PM, yielding a 7% survival gain if performed at age 40 years. The combination of PM and PO at age 40 improves survival more than any single intervention, yielding 24% survival gain for BRCA1 and 11% for BRCA2 mutation carriers. PM at age 25 instead of age 40 offers minimal incremental benefit (1% to 2%); substituting screening for PM yields a similarly minimal decrement in survival (2% to 3%). Although PM at age 25 plus PO at age 40 years maximizes survival probability, substituting mammography plus MRI screening for PM seems to offer comparable survival. These results may guide women with BRCA1/2 mutations in their choices between prophylactic surgery and breast screening.
View details for DOI 10.1200/JCO.2009.22.7991
View details for Web of Science ID 000273418000010
View details for PubMedID 19996031
Effects of Mammography Screening Under Different Screening Schedules: Model Estimates of Potential Benefits and Harms
ANNALS OF INTERNAL MEDICINE
2009; 151 (10): 738-W247
Despite trials of mammography and widespread use, optimal screening policy is controversial. To evaluate U.S. breast cancer screening strategies. 6 models using common data elements. National data on age-specific incidence, competing mortality, mammography characteristics, and treatment effects. A contemporary population cohort. Lifetime. Societal. 20 screening strategies with varying initiation and cessation ages applied annually or biennially. Number of mammograms, reduction in deaths from breast cancer or life-years gained (vs. no screening), false-positive results, unnecessary biopsies, and overdiagnosis. The 6 models produced consistent rankings of screening strategies. Screening biennially maintained an average of 81% (range across strategies and models, 67% to 99%) of the benefit of annual screening with almost half the number of false-positive results. Screening biennially from ages 50 to 69 years achieved a median 16.5% (range, 15% to 23%) reduction in breast cancer deaths versus no screening. Initiating biennial screening at age 40 years (vs. 50 years) reduced mortality by an additional 3% (range, 1% to 6%), consumed more resources, and yielded more false-positive results. Biennial screening after age 69 years yielded some additional mortality reduction in all models, but overdiagnosis increased most substantially at older ages. Varying test sensitivity or treatment patterns did not change conclusions. Results do not include morbidity from false-positive results, patient knowledge of earlier diagnosis, or unnecessary treatment. Biennial screening achieves most of the benefit of annual screening with less harm. Decisions about the best strategy depend on program and individual objectives and the weight placed on benefits, harms, and resource considerations. Primary Funding Source: National Cancer Institute.
View details for Web of Science ID 000272145100007
View details for PubMedID 19920274
Modeling the transition of lung cancer from early to advanced stage
CANCER CAUSES & CONTROL
2009; 20 (9): 1559-1569
We present a stochastic parametric model of the natural history of lung cancer that predicts the primary tumor volume at the moment the disease transits from early to advanced stage. Our model also produces estimates for the probability of symptomatic detection as a function of tumor volume and clinical stage. We estimate model parameters by likelihood maximization using data from the Mayo Lung Project (MLP), which was a clinical trial that evaluated screening for lung cancer in the 1970s. Mayo Lung Project cancer cases reported in Stage III or greater, according to the 1979 AJCC staging for lung cancer, were considered advanced stage. Our estimator distinguishes between the cases detected because of clinical symptoms and cases detected by screening. For nonsmall cell lung cancer cases detected in MLP, we estimate that the median primary tumor diameter at the onset of advanced stage disease was 4.1 cm. In addition, we estimate that the rate of patients symptomatically detected with their disease increases as their primary tumor increases in size, and for patients with a primary tumor of a given size, the rate of symptomatic detection is 12.8 times greater among patients with advanced stage disease compared to patients with early stage disease.
View details for DOI 10.1007/s10552-009-9401-4
View details for Web of Science ID 000271198400003
View details for PubMedID 19629730
Ly6d marks the earliest stage of B-cell specification and identifies the branchpoint between B-cell and T-cell development
GENES & DEVELOPMENT
2009; 23 (20): 2376-2381
Common lymphoid progenitors (CLPs) clonally produce both B- and T-cell lineages, but have little myeloid potential in vivo. However, some studies claim that the upstream lymphoid-primed multipotent progenitor (LMPP) is the thymic seeding population, and suggest that CLPs are primarily B-cell-restricted. To identify surface proteins that distinguish functional CLPs from B-cell progenitors, we used a new computational method of Mining Developmentally Regulated Genes (MiDReG). We identified Ly6d, which divides CLPs into two distinct populations: one that retains full in vivo lymphoid potential and produces more thymocytes at early timepoints than LMPP, and another that behaves essentially as a B-cell progenitor.
View details for DOI 10.1101/gad.1836009
View details for Web of Science ID 000270849700004
View details for PubMedID 19833765
Simultaneous Class Discovery and Classification of Microarray Data Using Spectral Analysis
JOURNAL OF COMPUTATIONAL BIOLOGY
2009; 16 (7): 935-944
Classification methods are commonly divided into two categories: unsupervised and supervised. Unsupervised methods have the ability to discover new classes by grouping data into clusters or tree structures without using the class labels, but they carry the risk of producing noninterpretable results. On the other hand, supervised methods always find decision rules that discriminate samples with different class labels. However, the class label information plays such an important role that it confines supervised methods by defining the possible classes. Consequently, supervised methods do not have the ability to discover new classes. To overcome the limitations of unsupervised and supervised methods, we propose a new method that gives the class labels a less dominant role, so as to perform class discovery and classification simultaneously. The proposed method is called SPACC (SPectral Analysis for Class discovery and Classification). In SPACC, the training samples are nodes of an undirected weighted network. Using spectral analysis, SPACC iteratively partitions the network into a top-down binary tree. Each partitioning step is unsupervised, and the class labels are only used to define the stopping criterion. When the partitioning ends, the training samples have been divided into several subsets, each corresponding to one class label. Because multiple subsets can correspond to the same class label, SPACC may identify biologically meaningful subclasses, and minimize the impact of outliers and mislabeled data. We demonstrate the effectiveness of SPACC for class discovery and classification on microarray data of lymphomas and leukemias. SPACC software is available at http://icbp.stanford.edu/software/SPACC/.
View details for DOI 10.1089/cmb.2008.0227
View details for Web of Science ID 000268172700005
View details for PubMedID 19580522
Fast calculation of pairwise mutual information for gene regulatory network reconstruction
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE
2009; 94 (2): 177-180
We present a new software implementation to more efficiently compute the mutual information for all pairs of genes from gene expression microarrays. Computation of the mutual information is a necessary first step in various information theoretic approaches for reconstructing gene regulatory networks from microarray data. When the mutual information is estimated by kernel methods, computing the pairwise mutual information is quite time-consuming. Our implementation significantly reduces the computation time. For an example data set of 336 samples consisting of normal and malignant B-cells, with 9563 genes measured per sample, the currently available software for ARACNE requires 142 hours to compute the mutual information for all gene pairs, whereas our algorithm requires 1.6 hours. The increased efficiency of our algorithm improves the feasibility of applying mutual information based approaches for reconstructing large regulatory networks.
View details for DOI 10.1016/j.cmpb.2008.11.003
View details for Web of Science ID 000264951300007
View details for PubMedID 19167129
A Bayesian nonparametric method for model evaluation: application to genetic studies
JOURNAL OF NONPARAMETRIC STATISTICS
2009; 21 (3): 379-396
Genomic and proteomic analysis reveals a threshold level of MYC required for tumor maintenance
2008; 68 (13): 5132-5142
MYC overexpression has been implicated in the pathogenesis of most types of human cancers. MYC is likely to contribute to tumorigenesis by its effects on global gene expression. Previously, we have shown that the loss of MYC overexpression is sufficient to reverse tumorigenesis. Here, we show that there is a precise threshold level of MYC expression required for maintaining the tumor phenotype, whereupon there is a switch from a gene expression program of proliferation to a state of proliferative arrest and apoptosis. Oligonucleotide microarray analysis and quantitative PCR were used to identify changes in expression in 3,921 genes, of which 2,348 were down-regulated and 1,573 were up-regulated. Critical changes in gene expression occurred at or near the MYC threshold, including genes implicated in the regulation of the G(1)-S and G(2)-M cell cycle checkpoints and death receptor/apoptosis signaling. Using two-dimensional protein analysis followed by mass spectrometry, phospho-flow fluorescence-activated cell sorting, and antibody arrays, we also identified changes at the protein level that contributed to MYC-dependent tumor regression. Proteins involved in mRNA translation decreased below threshold levels of MYC. Thus, at the MYC threshold, there is a loss of its ability to maintain tumorigenesis, with associated shifts in gene and protein expression that reestablish cell cycle checkpoints, halt protein translation, and promote apoptosis.
View details for DOI 10.1158/0008-5472.CAN-07-6192
View details for Web of Science ID 000257415300024
View details for PubMedID 18593912
Boolean implication networks derived from large scale, whole genome microarray datasets
2008; 9 (10)
We describe a method for extracting Boolean implications (if-then relationships) in very large amounts of gene expression microarray data. A meta-analysis of data from thousands of microarrays for humans, mice, and fruit flies finds millions of implication relationships between genes that would be missed by other methods. These relationships capture gender differences, tissue differences, development, and differentiation. New relationships are discovered that are preserved across all three species.
View details for Web of Science ID 000260587300020
View details for PubMedID 18973690
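The idea of a Boolean implication between two genes, as described in the abstract above, can be illustrated with a small sketch. It assumes each gene already has a threshold separating low from high expression and declares a quadrant sparse when it holds at most a fixed fraction of samples; the thresholds, the `max_fraction` cutoff, and the function name are all hypothetical, and the published method relies on statistical tests rather than this fixed cutoff.

```python
def boolean_implications(a, b, thr_a, thr_b, max_fraction=0.05):
    """List the if-then rules between genes A and B supported by the data.

    a and b are expression values of the two genes across the same samples.
    Each sample falls into one of four quadrants (A low/high x B low/high).
    A nearly empty quadrant means its combination almost never occurs,
    which is equivalent to an implication.
    """
    counts = {(ha, hb): 0 for ha in (False, True) for hb in (False, True)}
    for x, y in zip(a, b):
        counts[(x > thr_a, y > thr_b)] += 1

    # Key: the quadrant (A high?, B high?) that must be sparse for the rule.
    rules = {
        (True, False): "A high => B high",
        (True, True): "A high => B low",
        (False, False): "A low => B high",
        (False, True): "A low => B low",
    }
    limit = max_fraction * len(a)
    return [rule for quadrant, rule in rules.items() if counts[quadrant] <= limit]
```

Note that "A high => B high" is asymmetric: it tolerates many samples with A low and B high, which is why such relationships can be invisible to correlation-based measures.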
Extracting binary signals from microarray time-course data
NUCLEIC ACIDS RESEARCH
2007; 35 (11): 3705-3712
This article presents a new method for analyzing microarray time courses by identifying genes that undergo abrupt transitions in expression level, and the time at which the transitions occur. The algorithm matches the sequence of expression levels for each gene against temporal patterns having one or two transitions between two expression levels. The algorithm reports a P-value for the matching pattern of each gene, and a global false discovery rate can also be computed. After matching, genes can be sorted by the direction and time of transitions. Genes can be partitioned into sets based on the direction and time of change for further analysis, such as comparison with Gene Ontology annotations or binding site motifs. The method is evaluated on simulated and actual time-course data. On microarray data for budding yeast, it is shown that the groups of genes that change in similar ways and at similar times have significant and relevant Gene Ontology annotations.
View details for DOI 10.1093/nar/gkm284
View details for Web of Science ID 000247817500018
View details for PubMedID 17517782
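As a rough illustration of matching a time course against a one-transition pattern, the sketch below fits a single step to a series by least squares. It is a simplification under stated assumptions: it handles only one transition, it does not compute the P-value or false discovery rate mentioned in the abstract, and the function names are hypothetical.

```python
def _sse(segment):
    """Sum of squared deviations of a segment from its own mean."""
    mean = sum(segment) / len(segment)
    return sum((v - mean) ** 2 for v in segment)


def best_single_step(series):
    """Find the split index k that best fits a two-level step function.

    The series (length >= 2) is modeled as one constant level on series[:k]
    and another on series[k:]. Returns (k, residual sum of squares).
    """
    best = None
    for k in range(1, len(series)):
        err = _sse(series[:k]) + _sse(series[k:])
        if best is None or err < best[1]:
            best = (k, err)
    return best


def step_direction(series, k):
    """Return 'up' if the mean after index k exceeds the mean before it, else 'down'."""
    before = sum(series[:k]) / k
    after = sum(series[k:]) / (len(series) - k)
    return "up" if after > before else "down"
```

Genes could then be grouped by the pair `(k, direction)`, which mirrors the partitioning by direction and time of change described above.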
Ductal pattern enhancement on magnetic resonance imaging of the breast due to ductal lavage
2007; 13 (3): 281-286
Our purpose is to describe the appearance of breast ductal enhancement found on magnetic resonance imaging (MRI) after breast ductal lavage (DL). We describe a novel etiology of enhancement in a ductal pattern on postcontrast MRI of the breast. Knowledge of the potential for breast MRI enhancement subsequent to DL, which can mimic the appearance of a pathologic lesion, is critical to the care of patients who undergo breast MRI and DL or other intraductal cannulation procedures.
View details for Web of Science ID 000245992200010
View details for PubMedID 17461903
A natural history model of stage progression applied to breast cancer
STATISTICS IN MEDICINE
2007; 26 (3): 581-595
Invasive breast cancer is commonly staged as local, regional or distant disease. We present a stochastic model of the natural history of invasive breast cancer that quantifies (1) the relative rate that the disease transitions from the local to regional to distant stages, (2) the tumour volume at the stage transitions and (3) the impact of symptom-prompted detection on the tumour size and stage of invasive breast cancer in a population not screened by mammography. By symptom-prompted detection, we refer to tumour detection that results when symptoms appear that prompt the patient to seek clinical care. The model assumes exponential tumour growth and volume-dependent hazard functions for the times to symptomatic detection and stage transitions. Maximum likelihood parameter estimates are obtained based on SEER data on the tumour size and stage of invasive breast cancer from patients who were symptomatically detected in the absence of screening mammography. Our results indicate that the rate of symptom-prompted detection is similar to the rate of transition from the local to regional stage and an order of magnitude larger than the rate of transition from the regional to distant stage. We demonstrate that, even in the absence of screening mammography, symptom-prompted detection has a large effect on reducing the occurrence of distant staged disease at initial diagnosis.
View details for DOI 10.1002/sim.2550
View details for Web of Science ID 000243511400009
View details for PubMedID 16598706
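A model with exponential tumour growth and a volume-dependent hazard, as described in the abstract above, can be sampled by inverse transform. The sketch below assumes the hazard of symptomatic detection is directly proportional to tumour volume, h(t) = gamma * V(t) with V(t) = v0 * exp(growth_rate * t); that proportional form, the parameter names, and the function names are illustrative assumptions, not the paper's specification.

```python
import math
import random


def cumulative_hazard(t, v0, growth_rate, gamma):
    """Integral of gamma * v0 * exp(growth_rate * s) for s from 0 to t."""
    return gamma * v0 * (math.exp(growth_rate * t) - 1.0) / growth_rate


def detection_time_from_exposure(e, v0, growth_rate, gamma):
    """Solve cumulative_hazard(t) = e for t (closed form for this hazard)."""
    return math.log(1.0 + growth_rate * e / (gamma * v0)) / growth_rate


def sample_detection(v0, growth_rate, gamma, rng=random):
    """Draw one time to symptomatic detection and the tumour volume at that time.

    Uses the fact that the cumulative hazard evaluated at the event time
    follows a unit-rate exponential distribution.
    """
    t = detection_time_from_exposure(rng.expovariate(1.0), v0, growth_rate, gamma)
    return t, v0 * math.exp(growth_rate * t)
```

Stage transitions could be sampled the same way with their own `gamma`, and comparing the sampled transition and detection times decides the stage at diagnosis for one simulated patient.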
Cost-effectiveness of screening BRCA1/2 mutation carriers with breast magnetic resonance imaging
JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION
2006; 295 (20): 2374-2384
Women with inherited BRCA1/2 mutations are at high risk for breast cancer, which mammography often misses. Screening with contrast-enhanced breast magnetic resonance imaging (MRI) detects cancer earlier but increases costs and results in more false-positive scans. To evaluate the cost-effectiveness of screening BRCA1/2 mutation carriers with mammography plus breast MRI compared with mammography alone. A computer model that simulates the life histories of individual BRCA1/2 mutation carriers, incorporating the effects of mammographic and MRI screening, was used. The accuracy of mammography and breast MRI was estimated from published data in high-risk women. Breast cancer survival in the absence of screening was based on the Surveillance, Epidemiology and End Results database of breast cancer patients diagnosed in the prescreening period (1975-1981), adjusted for the current use of adjuvant therapy. Utilization rates and costs of diagnostic and treatment interventions were based on a combination of published literature and Medicare payments for 2005. The survival benefit, incremental costs, and cost-effectiveness of MRI screening strategies, which varied by ages of starting and stopping MRI screening, were computed separately for BRCA1 and BRCA2 mutation carriers. Screening strategies that incorporate annual MRI as well as annual mammography have a cost per quality-adjusted life-year (QALY) gained ranging from less than 45,000 dollars to more than 700,000 dollars, depending on the ages selected for MRI screening and the specific BRCA mutation. Relative to screening with mammography alone, the cost per QALY gained by adding MRI from ages 35 to 54 years is 55,420 dollars for BRCA1 mutation carriers, 130,695 dollars for BRCA2 mutation carriers, and 98,454 dollars for BRCA2 mutation carriers who have mammographically dense breasts. Breast MRI screening is more cost-effective for BRCA1 than BRCA2 mutation carriers.
The cost-effectiveness of adding MRI to mammography varies greatly by age.
View details for Web of Science ID 000237734400023
View details for PubMedID 16720823
A stochastic simulation model of U.S. breast cancer mortality trends from 1975 to 2000.
Journal of the National Cancer Institute. Monographs
We present a simulation model that predicts U.S. breast cancer mortality trends from 1975 to 2000 and quantifies the impact of screening mammography and adjuvant therapy on these trends. This model was developed within the Cancer Intervention and Surveillance Network (CISNET) consortium. A Monte Carlo simulation is developed to generate the life history of individual breast cancer patients by using CISNET base case inputs that describe the secular trend in breast cancer risk, dissemination patterns for screening mammography and adjuvant treatment, and death from causes other than breast cancer. The model generates the patient's age, tumor size and stage at detection, mode of detection, age at death, and cause of death (breast cancer versus other) based in part on assumptions on the natural history of breast cancer. Outcomes from multiple birth cohorts are summarized in terms of breast cancer mortality rates by calendar year. Predicted breast cancer mortality rates follow the general shape of U.S. breast cancer mortality rates from 1975 to 1995 but level off after 1995 as opposed to following an observed decline. Sensitivity analysis revealed that the impact of adjuvant treatment may be underestimated given the lack of data on temporal variation in treatment efficacy. We developed a simulation model that uses CISNET base case inputs and closely, but not exactly, reproduces U.S. breast cancer mortality rates. Screening mammography and adjuvant therapy are shown to have both contributed to a decline in U.S. breast cancer mortality.
View details for PubMedID 17032898
Effect of screening and adjuvant therapy on mortality from breast cancer
NEW ENGLAND JOURNAL OF MEDICINE
2005; 353 (17): 1784-1792
We used modeling techniques to assess the relative and absolute contributions of screening mammography and adjuvant treatment to the reduction in breast-cancer mortality in the United States from 1975 to 2000. A consortium of investigators developed seven independent statistical models of breast-cancer incidence and mortality. All seven groups used the same sources to obtain data on the use of screening mammography, adjuvant treatment, and benefits of treatment with respect to the rate of death from breast cancer. The proportion of the total reduction in the rate of death from breast cancer attributed to screening varied in the seven models from 28 to 65 percent (median, 46 percent), with adjuvant treatment contributing the rest. The variability across models in the absolute contribution of screening was larger than it was for treatment, reflecting the greater uncertainty associated with estimating the benefit of screening. Seven statistical models showed that both screening mammography and treatment have helped reduce the rate of death from breast cancer in the United States.
View details for Web of Science ID 000232813000006
View details for PubMedID 16251534
- Decision analysis and simulation modeling for evaluating diagnostic tests on the basis of patient outcomes. AMERICAN JOURNAL OF ROENTGENOLOGY 2005; 185 (3): 581-590
The effect of age, race, tumor size, tumor grade, and disease stage on invasive ductal breast cancer survival in the US SEER database
BREAST CANCER RESEARCH AND TREATMENT
2005; 89 (1): 47-54
To examine the effect of patient and tumor characteristics on breast cancer survival as recorded in the U.S. National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) database from 1973 to 1998. A sample of 72,367 female cases from 1973 to 1998 aged 21-90 years with invasive ductal breast cancer was examined with Cox proportional hazards regression to determine the effect of age at diagnosis, race, tumor size, tumor grade, disease stage, and year of diagnosis on disease-specific survival. Larger tumor size and higher tumor grade were found to have large negative effects on survival. Blacks had a 47% greater risk of death than whites. Year of diagnosis had a positive effect, with a 15% reduction in risk for each decade in the time period under study. The effects of patient age and disease stage violated the proportional hazards assumption, with distant disease having much poorer short-term survival than one would expect from a proportional hazards model, and younger age groups matching or even falling below the survival rate of the oldest group over time. Tumor size, grade, race, and year of diagnosis all have significant constant effects on disease-specific survival in breast cancer, while the effects of age at diagnosis and disease stage have significant effects that vary over time.
View details for Web of Science ID 000227280200007
View details for PubMedID 15666196
Breast magnetic resonance image screening and ductal lavage in women at high genetic risk for breast carcinoma
2004; 100 (3): 479-489
Intensive screening is an alternative to prophylactic mastectomy in women at high risk for developing breast carcinoma. The current article reports preliminary results from a screening protocol using high-quality magnetic resonance imaging (MRI), ductal lavage (DL), clinical breast examination, and mammography to identify early malignancy and high-risk lesions in women at increased genetic risk of breast carcinoma. Women with inherited BRCA1 or BRCA2 mutations or women with a >10% risk of developing breast carcinoma at 10 years, as estimated by the Claus model, were eligible. Patients were accrued from September 2001 to May 2003. Enrolled patients underwent biannual clinical breast examinations and annual mammography, breast MRI, and DL. Forty-one women underwent an initial screen. Fifteen of 41 enrolled women (36.6%) either had undergone previous bilateral oophorectomy and/or were on tamoxifen at the time of the initial screen. One patient who was a BRCA1 carrier had high-grade ductal carcinoma in situ (DCIS) that was screen detected by MRI but that was missed on mammography. High-risk lesions that were screen detected by MRI in three women included radial scars and atypical lobular hyperplasia. DL detected seven women with cellular atypia, including one woman who had a normal MRI and mammogram. Breast MRI identified high-grade DCIS and high-risk lesions that were missed by mammography. DL detected cytologic atypia in a high-risk cohort. A larger screening trial is needed to determine which subgroups of high-risk women will benefit and whether the identification of malignant and high-risk lesions at an early stage will impact breast carcinoma incidence and mortality.
View details for DOI 10.1002/cncr.11926
View details for Web of Science ID 000188611400006
View details for PubMedID 14745863
Diversity of model approaches for breast cancer screening: a review of model assumptions by The Cancer Intervention and Surveillance Network (CISNET) Breast Cancer Groups
STATISTICAL METHODS IN MEDICAL RESEARCH
2004; 13 (6): 525-538
The National Cancer Institute-sponsored Cancer Intervention and Surveillance Network program on breast cancer is composed of seven research groups working largely independently to model the impact of screening and adjuvant therapy on breast cancer mortality trends in the US from 1975 to 2000. Each of the groups has chosen a different modeling methodology without purposeful attempt to be in contrast with each other. The seven groups have met biannually since November 2000 to discuss their methodology and results. This article investigates the differences in methodology. To facilitate this comparison, each of the groups submitted a description of their model into a uniformly structured web based 'model profiler'. Six of the seven models simulate a preclinical natural history that cannot be observed directly with parameters estimated from published evidence concerning screening and therapy effects. The remaining model regards published evidence on intervention effects as prior information and updates that with information from the US population in a Bayesian type analysis. In general, the differences between the models appear to be small, particularly among the models driven by natural history assumptions. However, we demonstrate that such apparently small differences can have a large impact on surveillance of population trends. We describe a systematic approach to evaluating differences in model assumptions and results, as well as differences in modeling culture underlying the differences in model structure and parameters.
View details for DOI 10.1191/0962280204sm381ra
View details for Web of Science ID 000225102100007
View details for PubMedID 15587437
Simulation-based parameter estimation for complex models: a breast cancer natural history modelling illustration
STATISTICAL METHODS IN MEDICAL RESEARCH
2004; 13 (6): 507-524
Simulation-based parameter estimation offers a powerful means of estimating parameters in complex stochastic models. We illustrate the application of these ideas in the setting of a natural history model for breast cancer. Our model assumes that the tumor growth process follows a geometric Brownian motion; parameters are estimated from the SEER registry. Our discussion focuses on the use of simulation for computing the maximum likelihood estimator for this class of models. The analysis shows that simulation provides a straightforward means of computing such estimators for models of substantial complexity.
View details for DOI 10.1191/0962280204sm380ra
View details for Web of Science ID 000225102100006
View details for PubMedID 15587436
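The abstract above states that tumor growth is modeled as a geometric Brownian motion. As a rough illustration of what simulating such a process involves, the sketch below draws one sample path using the exact log-normal update for the process dX = mu X dt + sigma X dW. The function name `simulate_gbm_path` and all parameter values are hypothetical; they are not the paper's estimates, and the paper's actual estimation procedure is not reproduced here.

```python
import math
import random


def simulate_gbm_path(x0, mu, sigma, dt, n_steps, rng=None):
    """Simulate one path of geometric Brownian motion on a regular time grid.

    Uses the exact update X(t+dt) = X(t) * exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z),
    with Z drawn from a standard normal. Returns n_steps + 1 values, starting at x0.
    """
    rng = rng or random.Random()
    drift = (mu - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    path = [x0]
    x = x0
    for _ in range(n_steps):
        x *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
        path.append(x)
    return path


# Hypothetical parameters: initial size 1.0 (arbitrary units), yearly time grid.
path = simulate_gbm_path(x0=1.0, mu=0.8, sigma=0.4, dt=1.0, n_steps=10,
                         rng=random.Random(0))
print([round(v, 3) for v in path])
```

Because each step multiplies by a positive factor, simulated sizes stay positive, which is one reason a multiplicative process is a natural candidate for growth models. Simulation-based estimation would repeat such draws many times to approximate a likelihood that has no convenient closed form.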
SPECTRAL EXTRAPOLATION OF SPATIALLY BOUNDED IMAGES
IEEE TRANSACTIONS ON MEDICAL IMAGING
1995; 14 (3): 487-497
A spectral extrapolation algorithm for spatially bounded images is presented. An image is said to be spatially bounded when it is confined to a closed region and is surrounded by a background of zeros. With prior knowledge of the spatial domain zeros, the extrapolation algorithm extends the image's spectrum beyond a known interval of low-frequency components. The result, which is referred to as the finite support solution, has space variant resolution; features near the edge of the support region are better resolved than those in the center. The resolution of the finite support solution is discussed as a function of the number of known spatial zeros and known spectral components. A regularized version of the finite support solution is included for handling the case where the known spectral components are noisy. For both the noiseless and noisy cases, the resolution of the finite support solution is measured in terms of its impulse response characteristics, and compared to the resolution of the zerofilled and Nyquist solutions. The finite support solution is superior to the zerofilled solution for both the noisy and noiseless data cases. When compared to the Nyquist solution, the finite support solution may be preferred in the noisy data case. Examples using medical image data are provided.
View details for Web of Science ID A1995RU69200009
View details for PubMedID 18215853
Characterization of Patient Specific Signaling via Augmentation of Bayesian Networks with Disease and Patient State Nodes
IEEE. 2009: 6624-6627
Characterization of patient-specific disease features at a molecular level is an important emerging field. Patients may be characterized by differences in the level and activity of relevant biomolecules in diseased cells. When high throughput, high dimensional data is available, it becomes possible to characterize differences not only in the level of the biomolecules, but also in the molecular interactions among them. We propose here a novel approach to characterize patient specific signaling, which augments high throughput single cell data with state nodes corresponding to patient and disease states, and learns a Bayesian network based on this data. Features distinguishing individual patients emerge as downstream nodes in the network. We illustrate this approach with a six phospho-protein, 30,000 cell-per-patient dataset characterizing three comparably diagnosed follicular lymphoma patients, and show that our approach elucidates signaling differences among them.
View details for Web of Science ID 000280543605113
View details for PubMedID 19963681
ALTERNATIVE K-SPACE SAMPLING DISTRIBUTIONS FOR MR SPECTROSCOPIC IMAGING
IEEE COMPUTER SOC. 1994: 11-14
View details for Web of Science ID A1994BC13E00003
RESOLUTION IMPROVEMENT FOR INVIVO MAGNETIC-RESONANCE SPECTROSCOPIC IMAGES
SPIE - INT SOC OPTICAL ENGINEERING. 1991: 118-127
View details for Web of Science ID A1991BT62G00014