Honors & Awards
Faculty Fellow at the Stanford Center at Peking University, SCPKU (September-October 2016)
Henri Benedictus Fellow, King Baudouin Foundation (June 2009)
Honorary Fellow, Belgian American Educational Foundation (BAEF) (June 2009)
Boards, Advisory Committees, Professional Organizations
Member, International Society for Computational Biology (ISCB) (2006 - Present)
Member, American Association for Cancer Research (AACR) (2010 - Present)
Certificate, Stanford Business School, Stanford Ignite (2012)
Ph.D., University of Leuven, Belgium, Bioinformatics (2008)
M.S., University of Leuven, Belgium, Artificial Intelligence (2004)
M.S., University College, Ghent, Belgium, Electrical Engineering/Computer Science (2003)
Current Research and Scholarly Interests
My lab focuses on biomedical data fusion: the development of machine learning methods for biomedical decision support using multi-scale biomedical data. Previously, we pioneered data fusion work using Bayesian and kernel methods to study breast and ovarian cancer. We also developed computational algorithms for the identification of driver genes using multi-omics data. Furthermore, we are working on multi-scale biomedical data fusion methods that bridge the molecular, cellular, and tissue scales using omics, pathology, and medical imaging data, respectively.
- Translational Bioinformatics
BIOE 217, BIOMEDIN 217, CS 275 (Win)
- Translational Bioinformatics
GENE 217 (Win)
- Translational Bioinformatics Lectures
BIOMEDIN 218 (Win)
- Independent Studies (3)
- Prior Year Courses
Graduate and Fellowship Programs
Biomedical Informatics (PhD Program)
- Predicting structured metadata from unstructured metadata DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016
Magnetic resonance image features identify glioblastoma phenotypic subtypes with distinct molecular pathway activities.
Science translational medicine
2015; 7 (303): 303ra138
Glioblastoma (GBM) is the most common and highly lethal primary malignant brain tumor in adults. There is a dire need for easily accessible, noninvasive biomarkers that can delineate underlying molecular activities and predict response to therapy. To this end, we sought to identify subtypes of GBM, differentiated solely by quantitative magnetic resonance (MR) imaging features, that could be used for better management of GBM patients. Quantitative image features capturing the shape, texture, and edge sharpness of each lesion were extracted from MR images of 121 single-institution patients with de novo, solitary, unilateral GBM. Three distinct phenotypic "clusters" emerged in the development cohort using consensus clustering with 10,000 iterations on these image features. These three clusters (pre-multifocal, spherical, and rim-enhancing, names reflecting their image features) were validated in an independent cohort consisting of 144 multi-institution patients with similar tumor characteristics from The Cancer Genome Atlas (TCGA). Each cluster mapped to a unique set of molecular signaling pathways using pathway activity estimates derived from the analysis of TCGA tumor copy number and gene expression data with the PARADIGM (Pathway Recognition Algorithm Using Data Integration on Genomic Models) algorithm. Distinct pathways, such as c-Kit and FOXA, were enriched in each cluster, indicating differential molecular activities as determined by the image features. Each cluster also demonstrated differential probabilities of survival, indicating prognostic importance. Our imaging method offers a noninvasive approach to stratify GBM patients and also provides unique sets of molecular signatures to inform targeted therapy and personalized treatment of GBM.
View details for DOI 10.1126/scitranslmed.aaa7582
View details for PubMedID 26333934
MethylMix: an R package for identifying DNA methylation-driven genes
2015; 31 (11): 1839-1841
DNA methylation is an important mechanism regulating gene transcription, and its role in carcinogenesis has been extensively studied. Hyper- and hypomethylation of genes is an alternative mechanism to deregulate gene expression in a wide range of diseases. At the same time, high-throughput DNA methylation assays have been developed, generating vast amounts of genome-wide DNA methylation measurements. Yet, few tools exist that can formally identify hypo- and hypermethylated genes that are predictive of transcription and thus functionally relevant for a particular disease. To address this lack of tools, we developed MethylMix, an algorithm implemented in R to identify disease-specific hyper- and hypomethylated genes. MethylMix is based on a beta mixture model to identify methylation states and compares them with the normal DNA methylation state. MethylMix introduces a novel metric, the 'Differential Methylation value' or DM-value, defined as the difference of a methylation state with the normal methylation state. Finally, matched gene expression data are used to identify, besides differential, transcriptionally predictive methylation states by focusing on methylation changes that affect gene expression. MethylMix was implemented as an R package and is available in Bioconductor.
View details for DOI 10.1093/bioinformatics/btv020
View details for Web of Science ID 000356625300021
View details for PubMedID 25609794
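The DM-value described in the abstract above can be illustrated with a minimal sketch. It assumes that each methylation state is summarized by the mean of its beta values, which the abstract does not specify, and it does not show the beta mixture model that assigns samples to states. The function name `dm_value` and all sample values are hypothetical; this is not the MethylMix package API.

```python
from statistics import mean

def dm_value(state_betas, normal_betas):
    """Difference between a methylation state and the normal methylation state.

    Assumption: each state is summarized by the mean of its beta values.
    The mixture-model fitting that groups samples into states is not shown.
    """
    return mean(state_betas) - mean(normal_betas)

# Hypothetical beta values (fraction methylated, between 0 and 1).
normal = [0.10, 0.12, 0.08, 0.10]
hyper_state = [0.70, 0.75, 0.65, 0.70]
hypo_state = [0.02, 0.03, 0.01, 0.02]

dm_hyper = dm_value(hyper_state, normal)  # positive: hypermethylated vs. normal
dm_hypo = dm_value(hypo_state, normal)    # negative: hypomethylated vs. normal
```

Under this convention a positive value indicates hypermethylation relative to normal tissue and a negative value indicates hypomethylation.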
- Pancancer analysis of DNA methylation-driven genes using MethylMix GENOME BIOLOGY 2015; 16
CaMoDi: a new method for cancer module discovery
Identification of genomic patterns in tumors is an important problem, which would enable the community to understand and extend effective therapies across the current tissue-based tumor boundaries. With this in mind, in this work we develop a robust and fast algorithm to discover cancer driver genes using an unsupervised clustering of similarly expressed genes across cancer patients. Specifically, we introduce CaMoDi, a new method for module discovery which demonstrates superior performance across a number of computational and statistical metrics. The proposed algorithm CaMoDi demonstrates effective statistical performance compared to the state of the art, and is algorithmically simple and scalable, which makes it suitable for tissue-independent genomic characterization of individual tumors as well as groups of tumors. We perform an extensive comparative study between CaMoDi and two previously developed methods (CONEXIC and AMARETTO), across 11 individual tumors and 8 combinations of tumors from The Cancer Genome Atlas. We demonstrate that CaMoDi is able to discover modules with better average consistency and homogeneity, with similar or better adjusted R² performance compared to CONEXIC and AMARETTO. We present a novel method for Cancer Module Discovery, CaMoDi, and demonstrate through extensive simulations on the TCGA Pan-Cancer dataset that it achieves comparable or better performance than that of CONEXIC and AMARETTO, while achieving an order-of-magnitude improvement in computational run time compared to the other methods.
View details for DOI 10.1186/1471-2164-15-S10-S8
View details for Web of Science ID 000346166900008
View details for PubMedID 25560933
- Glioblastoma Multiforme: Exploratory Radiogenomic Analysis by Using Quantitative Image Features RADIOLOGY 2014; 273 (1): 168-174
Oncogenic transformation of diverse gastrointestinal tissues in primary organoid culture
2014; 20 (7): 769-777
The application of primary organoid cultures containing epithelial and mesenchymal elements to cancer modeling holds promise for combining the accurate multilineage differentiation and physiology of in vivo systems with the facile in vitro manipulation of transformed cell lines. Here we used a single air-liquid interface culture method without modification to engineer oncogenic mutations into primary epithelial and mesenchymal organoids from mouse colon, stomach and pancreas. Pancreatic and gastric organoids exhibited dysplasia as a result of expression of Kras carrying the G12D mutation (Kras(G12D)), p53 loss or both and readily generated adenocarcinoma after in vivo transplantation. In contrast, primary colon organoids required combinatorial Apc, p53, Kras(G12D) and Smad4 mutations for progressive transformation to invasive adenocarcinoma-like histology in vitro and tumorigenicity in vivo, recapitulating multi-hit models of colorectal cancer (CRC), as compared to the more promiscuous transformation of small intestinal organoids. Colon organoid culture functionally validated the microRNA miR-483 as a dominant driver oncogene at the IGF2 (insulin-like growth factor-2) 11p15.5 CRC amplicon, inducing dysplasia in vitro and tumorigenicity in vivo. These studies demonstrate the general utility of a highly tractable primary organoid system for cancer modeling and driver oncogene validation in diverse gastrointestinal tissues.
View details for DOI 10.1038/nm.3585
View details for Web of Science ID 000338689500021
- Identification of ovarian cancer driver genes by using module network integration of multi-omics data INTERFACE FOCUS 2013; 3 (4)
Identifying master regulators of cancer and their downstream targets by integrating genomic and epigenomic features.
Pacific Symposium on Biocomputing
Vast amounts of molecular data characterizing the genome, epigenome and transcriptome are becoming available for a variety of cancers. The current challenge is to integrate these diverse layers of molecular biology information to create a more comprehensive view of key biological processes underlying cancer. We developed a biocomputational algorithm that integrates copy number, DNA methylation, and gene expression data to study master regulators of cancer and identify their targets. Our algorithm starts by generating a list of candidate driver genes based on the rationale that genes that are driven by multiple genomic events in a subset of samples are unlikely to be randomly deregulated. We then select the master regulators from the candidate drivers and identify their targets by inferring the underlying regulatory network of gene expression. We applied our biocomputational algorithm to identify master regulators and their targets in glioblastoma multiforme (GBM) and serous ovarian cancer. Our results suggest that the expression of candidate drivers is more likely to be influenced by copy number variations than DNA methylation. Next, we selected the master regulators and identified their downstream targets using module networks analysis. As a proof-of-concept, we show that the GBM and ovarian cancer module networks recapitulate known processes in these cancers. In addition, we identify master regulators that have not been previously reported and suggest their likely role. In summary, focusing on genes whose expression can be explained by their genomic and epigenomic aberrations is a promising strategy to identify master regulators of cancer.
View details for PubMedID 23424118
Prognostic PET F-18-FDG Uptake Imaging Features Are Associated with Major Oncogenomic Alterations in Patients with Resected Non-Small Cell Lung Cancer
2012; 72 (15): 3725-3734
Although 2[18F]fluoro-2-deoxy-d-glucose (FDG) uptake during positron emission tomography (PET) predicts post-surgical outcome in patients with non-small cell lung cancer (NSCLC), the biologic basis for this observation is not fully understood. Here, we analyzed 25 tumors from patients with NSCLCs to identify tumor PET-FDG uptake features associated with gene expression signatures and survival. Fourteen quantitative PET imaging features describing FDG uptake were correlated with gene expression for single genes and coexpressed gene clusters (metagenes). For each FDG uptake feature, an associated metagene signature was derived, and a prognostic model was identified in an external cohort and then tested in a validation cohort of patients with NSCLC. Four of eight single genes associated with FDG uptake (LY6E, RNF149, MCM6, and FAP) were also associated with survival. The most prognostic metagene signature was associated with a multivariate FDG uptake feature [maximum standard uptake value (SUV(max)), SUV(variance), and SUV(PCA2)], each highly associated with survival in the external [HR, 5.87; confidence interval (CI), 2.49-13.8] and validation (HR, 6.12; CI, 1.08-34.8) cohorts, respectively. Cell-cycle, proliferation, death, and self-recognition pathways were altered in this radiogenomic profile. Together, our findings suggest that leveraging tumor genomics with an expanded collection of PET-FDG imaging features may enhance our understanding of FDG uptake as an imaging biomarker beyond its association with glycolysis.
View details for DOI 10.1158/0008-5472.CAN-11-3943
View details for Web of Science ID 000307354100004
View details for PubMedID 22710433
Non-Small Cell Lung Cancer: Identifying Prognostic Imaging Biomarkers by Leveraging Public Gene Expression Microarray Data-Methods and Preliminary Results
2012; 264 (2): 387-396
To identify prognostic imaging biomarkers in non-small cell lung cancer (NSCLC) by means of a radiogenomics strategy that integrates gene expression and medical images in patients for whom survival outcomes are not available by leveraging survival data in public gene expression data sets. A radiogenomics strategy for associating image features with clusters of coexpressed genes (metagenes) was defined. First, a radiogenomics correlation map is created for a pairwise association between image features and metagenes. Next, predictive models of metagenes are built in terms of image features by using sparse linear regression. Similarly, predictive models of image features are built in terms of metagenes. Finally, the prognostic significance of the predicted image features is evaluated in a public gene expression data set with survival outcomes. This radiogenomics strategy was applied to a cohort of 26 patients with NSCLC for whom gene expression and 180 image features from computed tomography (CT) and positron emission tomography (PET)/CT were available. There were 243 statistically significant pairwise correlations between image features and metagenes of NSCLC. Metagenes were predicted in terms of image features with an accuracy of 59%-83%. One hundred fourteen of 180 CT image features and the PET standardized uptake value were predicted in terms of metagenes with an accuracy of 65%-86%. When the predicted image features were mapped to a public gene expression data set with survival outcomes, tumor size, edge shape, and sharpness ranked highest for prognostic significance. This radiogenomics strategy for identifying imaging biomarkers may enable a more rapid evaluation of novel imaging modalities, thereby accelerating their translation to personalized medicine.
View details for DOI 10.1148/radiol.12111607
View details for Web of Science ID 000306660000010
View details for PubMedID 22723499
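The first step of the strategy above, a pairwise correlation map between image features and metagenes, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the association statistic, so Pearson correlation is used here as a stand-in, and the feature names, metagene name, and values are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_map(image_features, metagenes):
    """Map each (image feature, metagene) pair to its correlation across patients."""
    return {
        (f_name, m_name): pearson(f_vals, m_vals)
        for f_name, f_vals in image_features.items()
        for m_name, m_vals in metagenes.items()
    }

# Hypothetical per-patient values (one entry per patient, same patient order).
image_features = {
    "tumor_size": [1.0, 2.0, 3.0, 4.0],
    "edge_sharpness": [4.0, 3.0, 2.0, 1.0],
}
metagenes = {"metagene_A": [2.0, 4.0, 6.0, 8.0]}

cmap = correlation_map(image_features, metagenes)
```

In this toy data `tumor_size` rises with `metagene_A` and `edge_sharpness` falls, so the two entries of `cmap` have opposite signs. A real analysis would also need a significance test with multiple-testing correction, which is omitted here.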
A Seven-Gene Set Associated with Chronic Hypoxia of Prognostic Importance in Hepatocellular Carcinoma
CLINICAL CANCER RESEARCH
2010; 16 (16): 4278-4288
Hepatocellular carcinomas (HCC) have an unpredictable clinical course, and molecular classification could provide better insights into prognosis and patient-directed therapy. We hypothesized that in HCC, certain microenvironmental regions exist with a characteristic gene expression related to chronic hypoxia which would induce aggressive behavior. We determined the gene expression pattern for human HepG2 liver cells under chronic hypoxia by microarray analysis. Differentially expressed genes were selected and their clinical values were assessed. In our hypothesis-driven analysis, we included available independent microarray studies of patients with HCC in one single analysis. Three microarray studies encompassing 272 patients were used as training sets to determine a minimal prognostic gene set, and one recent study of 91 patients was used for validation. Using computational methods, we identified seven genes (out of 3,592 differentially expressed under chronic hypoxia) that showed correlation with poor prognostic indicators in all three training sets (65/139/73 patients) and this was validated in a fourth data set (91 patients). Retrospectively, the seven-gene set was associated with poor survival (hazard ratio, 1.39; P = 0.007) and early recurrence (hazard ratio, 2.92; P = 0.007) in 135 patients. Moreover, using a hypoxia score based on this seven-gene set, we found that patients with a score of >0.35 (n = 42) had a median survival of 307 days, whereas patients with a score of ≤0.35 (n = 93) had a median survival of 1,602 days (P = 0.005). We identified a unique, liver-specific, seven-gene signature associated with chronic hypoxia that correlates with poor prognosis in HCCs.
View details for DOI 10.1158/1078-0432.CCR-09-3274
View details for Web of Science ID 000280830300024
View details for PubMedID 20592013
Intrinsic Gene Expression Profiles of Gliomas Are a Better Predictor of Survival than Histology
2009; 69 (23): 9065-9072
Gliomas are the most common primary brain tumors with heterogeneous morphology and variable prognosis. Treatment decisions in patients rely mainly on histologic classification and clinical parameters. However, differences between histologic subclasses and grades are subtle, and classifying gliomas is subject to a large interobserver variability. To improve current classification standards, we have performed gene expression profiling on a large cohort of glioma samples of all histologic subtypes and grades. We identified seven distinct molecular subgroups that correlate with survival. These include two favorable prognostic subgroups (median survival, >4.7 years), two with intermediate prognosis (median survival, 1-4 years), two with poor prognosis (median survival, <1 year), and one control group. The intrinsic molecular subtypes of glioma are different from histologic subgroups and correlate better to patient survival. The prognostic value of molecular subgroups was validated on five independent sample cohorts (The Cancer Genome Atlas, Repository for Molecular Brain Neoplasia Data, GSE12907, GSE4271, and Li and colleagues). The power of intrinsic subtyping is shown by its ability to identify a subset of prognostically favorable tumors within an external data set that contains only histologically confirmed glioblastomas (GBM). Specific genetic changes (epidermal growth factor receptor amplification, IDH1 mutation, and 1p/19q loss of heterozygosity) segregate in distinct molecular subgroups. We identified a subgroup with molecular features associated with secondary GBM, suggesting that different genetic changes drive gene expression profiles. Finally, we assessed response to treatment in molecular subgroups. Our data provide compelling evidence that expression profiling is a more accurate and objective method to classify gliomas than histologic classification. Molecular classification therefore may aid diagnosis and can guide clinical decision making.
View details for DOI 10.1158/0008-5472.CAN-09-2307
View details for Web of Science ID 000272362800029
View details for PubMedID 19920198
Recurrent Copy Number Alterations in BRCA1-Mutated Ovarian Tumors Alter Biological Pathways
2009; 30 (12): 1693-1702
Array CGH was used to identify recurrent copy number alterations (RCNA) characteristic of either BRCA1-related or sporadic ovarian cancer. After preprocessing, both groups of patients were modeled using a recurrent Hidden Markov Model to detect RCNA. RCNA with a probability higher than 80% were called. After removing RCNA present in both groups, the genes present in the remaining RCNA were investigated for enrichment of pathways from external databases. More RCNA were observed in the BRCA1 group, and they display more losses than gains compared to the sporadic group. When focusing on the type of RCNA, no significant difference in length was seen for the gains, but there was a statistically significant difference for the losses. In the sporadic group, a great proportion of the altered regions contain genes known to have a function in cell adhesion and complement activation, whereas the BRCA1 samples are characterized by alterations in the HOX genes, metalloproteinases, tumor suppressor genes, and the estrogen-signaling pathways. We conclude that BRCA1 ovarian tumors present a different type, number, and length of RCNA; a huge amount of the genome is lost, resulting in important genomic instability. Moreover, important biological pathways are altered differentially when compared to the sporadic group.
View details for DOI 10.1002/humu.21135
View details for Web of Science ID 000272796400011
View details for PubMedID 19802895
CoINcIDE: A framework for discovery of patient subtypes across multiple datasets
Patient disease subtypes have the potential to transform personalized medicine. However, many patient subtypes derived from unsupervised clustering analyses on high-dimensional datasets are not replicable across multiple datasets, limiting their clinical utility. We present CoINcIDE, a novel methodological framework for the discovery of patient subtypes across multiple datasets that requires no between-dataset transformations. We also present a high-quality database collection, curatedBreastData, with over 2,500 breast cancer gene expression samples. We use CoINcIDE to discover novel breast and ovarian cancer subtypes with prognostic significance and novel hypothesized ovarian therapeutic targets across multiple datasets. CoINcIDE and curatedBreastData are available as R packages.
View details for DOI 10.1186/s13073-016-0281-4
View details for Web of Science ID 000371588100001
View details for PubMedID 26961683
COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL): A Robust Method for Selection of Cluster Number, K
In order to discover new subsets (clusters) of a data set, researchers often use algorithms that perform unsupervised clustering, namely, the algorithmic separation of a dataset into some number of distinct clusters. Deciding whether a particular separation (or number of clusters, K) is correct is a sort of 'dark art', with multiple techniques available for assessing the validity of unsupervised clustering algorithms. Here, we present a new technique for unsupervised clustering that uses multiple clustering algorithms, multiple validity metrics, and progressively bigger subsets of the data to produce an intuitive 3D map of cluster stability that can help determine the optimal number of clusters in a data set, a technique we call COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL). COMMUNAL locally optimizes algorithms and validity measures for the data being used. We show its application to simulated data with a known K, and then apply this technique to several well-known cancer gene expression datasets, showing that COMMUNAL provides new insights into clustering behavior and stability in all tested cases. COMMUNAL is shown to be a useful tool for determining K in complex biological datasets, and is freely available as a package for R.
View details for DOI 10.1038/srep16971
View details for Web of Science ID 000364936000001
View details for PubMedID 26581809
The center for expanded data annotation and retrieval.
Journal of the American Medical Informatics Association
2015; 22 (6): 1148-1152
The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments.
View details for DOI 10.1093/jamia/ocv048
View details for PubMedID 26112029
Core samples for radiomics features that are insensitive to tumor segmentation: method and pilot study using CT images of hepatocellular carcinoma.
Journal of medical imaging (Bellingham, Wash.)
2015; 2 (4): 041011
The purpose of this study is to investigate the utility of obtaining "core samples" of regions in CT volume scans for extraction of radiomic features. We asked four readers to outline tumors in three representative slices from each phase of multiphasic liver CT images taken from 29 patients (1128 segmentations) with hepatocellular carcinoma. Core samples were obtained by automatically tracing the maximal circle inscribed in the outlines. Image features describing the intensity, texture, shape, and margin were used to describe the segmented lesion. We calculated the intraclass correlation between the features extracted from the readers' segmentations and their core samples to characterize robustness to segmentation between readers, and between human-based segmentation and core sampling. We conclude that despite the high interreader variability in manually delineating the tumor (average overlap of 43% across all readers), certain features such as intensity and texture features are robust to segmentation. More importantly, this same subset of features can be obtained from the core samples, providing as much information as detailed segmentation while being simpler and faster to obtain.
View details for DOI 10.1117/1.JMI.2.4.041011
View details for PubMedID 26587549
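The core-sampling idea in the abstract above, the maximal circle inscribed in a reader's outline, can be sketched on a small binary mask. This is a brute-force illustration, not the study's method: the function name `core_sample` and the mask are hypothetical, the circle is centered on a pixel, and the mask is assumed to contain at least one background pixel (for example, a zero border).

```python
from math import hypot

def core_sample(mask):
    """Return (row, col, radius) of the largest lesion-centered circle.

    mask is a list of rows of 0/1 values, where 1 marks the outlined lesion.
    The radius is the distance from the chosen lesion pixel to the nearest
    background pixel. Brute force, so suitable only for small masks.
    Assumption: the mask contains at least one background (0) pixel.
    """
    rows, cols = len(mask), len(mask[0])
    background = [(r, c) for r in range(rows) for c in range(cols) if not mask[r][c]]
    best = None
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            # Distance from this lesion pixel to the closest background pixel.
            d = min(hypot(r - br, c - bc) for br, bc in background)
            if best is None or d > best[2]:
                best = (r, c, d)
    return best

# Hypothetical 7x7 mask: a 5x5 lesion surrounded by a one-pixel background border.
mask = [[1 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)] for r in range(7)]
center_row, center_col, radius = core_sample(mask)
```

Features would then be computed only from pixels inside the returned circle. For realistic image sizes a distance transform would replace the nested loops, but that is outside this sketch.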
Addition of MR imaging features and genetic biomarkers strengthens glioblastoma survival prediction in TCGA patients
JOURNAL OF NEURORADIOLOGY
2015; 42 (4): 212-221
The purpose of our study was to assess whether a model combining clinical factors, MR imaging features, and genomics would better predict overall survival of patients with glioblastoma (GBM) than any individual data type. The study was conducted leveraging The Cancer Genome Atlas (TCGA) effort supported by the National Institutes of Health. Six neuroradiologists reviewed MRI images from The Cancer Imaging Archive (http://cancerimagingarchive.net) of 102 GBM patients using the VASARI scoring system. The patients' clinical and genetic data were obtained from the TCGA website (http://www.cancergenome.nih.gov/). Patient outcome was measured in terms of overall survival time. The association between different categories of biomarkers and survival was evaluated using Cox analysis. The features that were significantly associated with survival were: (1) clinical factors: chemotherapy; (2) imaging: proportion of tumor contrast enhancement on MRI; and (3) genomics: HRAS copy number variation. The combination of these three biomarkers resulted in an incremental increase in the strength of prediction of survival, with the model that included clinical, imaging, and genetic variables having the highest predictive accuracy (area under the curve 0.679±0.068, Akaike's information criterion 566.7, P<0.001). A combination of clinical factors, imaging features, and HRAS copy number variation best predicts survival of patients with GBM.
View details for DOI 10.1016/j.neurad.2014.02.006
View details for Web of Science ID 000358109400004
DNA Methylation-Guided Prediction of Clinical Failure in High-Risk Prostate Cancer
2015; 10 (6)
Prostate cancer (PCa) is a very heterogeneous disease with respect to clinical outcome. This study explored differential DNA methylation in a priori selected genes to diagnose PCa and predict clinical failure (CF) in high-risk patients. A quantitative multiplex, methylation-specific PCR assay was developed to assess promoter methylation of the APC, CCND2, GSTP1, PTGS2 and RARB genes in formalin-fixed, paraffin-embedded tissue samples from 42 patients with benign prostatic hyperplasia and radical prostatectomy specimens of patients with high-risk PCa, encompassing training and validation cohorts of 147 and 71 patients, respectively. Log-rank tests, univariate and multivariate Cox models were used to investigate the prognostic value of the DNA methylation. Hypermethylation of APC, CCND2, GSTP1, PTGS2 and RARB was highly cancer-specific. However, only GSTP1 methylation was significantly associated with CF in both independent high-risk PCa cohorts. Importantly, trichotomization into low, moderate and high GSTP1 methylation level subgroups was highly predictive for CF. Patients with either a low or high GSTP1 methylation level, as compared to the moderate methylation groups, were at a higher risk for CF in both the training (Hazard ratio [HR], 3.65; 95% CI, 1.65 to 8.07) and validation sets (HR, 4.27; 95% CI, 1.03 to 17.72) as well as in the combined cohort (HR, 2.74; 95% CI, 1.42 to 5.27) in multivariate analysis. Classification of primary high-risk tumors into three subtypes based on DNA methylation can be combined with clinico-pathological parameters for a more informative risk-stratification of these PCa patients.
View details for DOI 10.1371/journal.pone.0130651
View details for Web of Science ID 000356567500126
View details for PubMedID 26086362
Combining bevacizumab and chemoradiation in rectal cancer. Translational results of the AXEBeam trial.
British journal of cancer
2015; 112 (8): 1314-1325
This study characterises the molecular effect of bevacizumab, and explores the relation of molecular and genetic markers with response to bevacizumab combined with chemoradiotherapy (CRT). From a subset of 59 patients of 84 rectal cancer patients included in a phase II study combining bevacizumab with CRT, tumour and blood samples were collected before and during treatment, offering the possibility to evaluate changes induced by one dose of bevacizumab. We performed cDNA microarrays, stains for CD31/CD34 combined with α-SMA and CA-IX, as well as enzyme-linked immunosorbent assay (ELISA) for circulating angiogenic proteins. Markers were related with the pathological response of patients. One dose of bevacizumab changed the expression of 14 genes and led to a significant decrease in microvessel density and in the proportion of pericyte-covered blood vessels, and a small but nonsignificant increase in hypoxia. Alterations in angiogenic processes after bevacizumab delivery were only detected in responding tumours. Lower PDGFA expression and PDGF-BB levels, fewer pericyte-covered blood vessels and higher CA-IX expression were found after bevacizumab treatment only in patients with pathological complete response. We could not support the 'normalization hypothesis' and suggest a role for PDGFA, PDGF-BB, CA-IX and α-SMA. Validation in larger patient groups is needed.
View details for DOI 10.1038/bjc.2015.93
View details for PubMedID 25867261
- Methylation of PITX2, HOXD3, RASSF1 and TDRD1 predicts biochemical recurrence in high-risk prostate cancer JOURNAL OF CANCER RESEARCH AND CLINICAL ONCOLOGY 2014; 140 (11): 1849-1861
NF-kappa B protein expression associates with F-18-FDG PET tumor uptake in non-small cell lung cancer: A radiogenomics validation study to understand tumor metabolism
2014; 83 (2): 189-196
We previously demonstrated that NF-κB may be associated with (18)F-FDG PET uptake and patient prognosis using radiogenomics in patients with non-small cell lung cancer (NSCLC). To validate these results, we assessed NF-κB protein expression in an extended cohort of NSCLC patients. We examined NF-κBp65 by immunohistochemistry (IHC) using a Tissue Microarray. Staining intensity was assessed by qualitative ordinal scoring and compared to tumor FDG uptake (SUVmax and SUVmean), lactate dehydrogenase A (LDHA) expression (as a positive control) and outcome using ANOVA, Kaplan Meier (KM), and Cox-proportional hazards (CPH) analysis. In total, 365 tumors from 355 patients with long-term follow-up were analyzed. The average age for patients was 67±11 years, 46% were male and 67% were ever smokers. Stage I and II patients comprised 83% of the cohort and the majority had adenocarcinoma (73%). From 88 FDG PET scans available, average SUVmax and SUVmean were 8.3±6.6 and 3.7±2.4, respectively. Increasing NF-κBp65 expression, but not LDHA expression, was associated with higher SUVmax and SUVmean (p=0.03 and 0.02, respectively). Both NF-κBp65 and positive FDG uptake were significantly associated with more advanced stage, tumor histology and invasion. Higher NF-κBp65 expression was associated with death by KM analysis (p=0.06) while LDHA was strongly associated with recurrence (p=0.04). Increased levels of combined NF-κBp65 and LDHA expression were synergistic and associated with both recurrence (p=0.04) and death (p=0.03). NF-κB IHC was a modest biomarker of prognosis that associated with tumor glucose metabolism on FDG PET when compared to existing molecular correlates like LDHA, which was synergistic with NF-κB for outcome. These findings recapitulate radiogenomics profiles previously reported by our group and provide a methodology for studying tumor biology using computational approaches.
View details for DOI 10.1016/j.lungcan.2013.11.001
View details for Web of Science ID 000331495000011
View details for PubMedID 24355259
- Stromal architecture and periductal decorin are potential prognostic markers for ipsilateral locoregional recurrence in ductal carcinoma in situ of the breast HISTOPATHOLOGY 2013; 63 (4): 520-533
Cross-Species Functional Analysis of Cancer-Associated Fibroblasts Identifies a Critical Role for CLCF1 and IL-6 in Non-Small Cell Lung Cancer In Vivo
2012; 72 (22): 5744-5756
Cancer-associated fibroblasts (CAF) have been reported to support tumor progression by a variety of mechanisms. However, their role in the progression of non-small cell lung cancer (NSCLC) remains poorly defined. In addition, the extent to which specific proteins secreted by CAFs contribute directly to tumor growth is unclear. To study the role of CAFs in NSCLCs, a cross-species functional characterization of mouse and human lung CAFs was conducted. CAFs supported the growth of lung cancer cells in vivo by secretion of soluble factors that directly stimulate the growth of tumor cells. Gene expression analysis comparing normal mouse lung fibroblasts and mouse lung CAFs identified multiple genes that correlate with the CAF phenotype. A gene signature of secreted genes upregulated in CAFs was an independent marker of poor survival in patients with NSCLC. This secreted gene signature was upregulated in normal lung fibroblasts after long-term exposure to tumor cells, showing that lung fibroblasts are "educated" by tumor cells to acquire a CAF-like phenotype. Functional studies identified important roles for CLCF1-CNTFR and interleukin (IL)-6-IL-6R signaling in promoting growth of NSCLCs. This study identifies novel soluble factors contributing to the CAF protumorigenic phenotype in NSCLCs and suggests new avenues for the development of therapeutic strategies.
View details for DOI 10.1158/0008-5472.CAN-12-1097
View details for Web of Science ID 000311141300012
View details for PubMedID 22962265
Evaluation of a panel of 28 biomarkers for the non-invasive diagnosis of endometriosis
2012; 27 (9): 2698-2711
At present, the only way to conclusively diagnose endometriosis is laparoscopic inspection, preferably with histological confirmation. This contributes to the delay in the diagnosis of endometriosis, which is 6-11 years. So far, non-invasive diagnostic approaches such as ultrasound (US), MRI or blood tests do not have sufficient diagnostic power. Our aim was to develop and validate a non-invasive diagnostic test with a high sensitivity (80% or more) for symptomatic endometriosis patients without US evidence of endometriosis, since this is the group most in need of a non-invasive test. A total of 28 inflammatory and non-inflammatory plasma biomarkers were measured in 353 EDTA plasma samples collected at surgery from 121 controls without endometriosis at laparoscopy and from 232 women with endometriosis (minimal-mild n = 148; moderate-severe n = 84), including 175 women without preoperative US evidence of endometriosis. Surgery was done during menstrual (n = 83), follicular (n = 135) and luteal (n = 135) phases of the menstrual cycle. For analysis, the data were randomly divided into an independent training (n = 235) and a test (n = 118) data set. Statistical analysis was done using univariate and multivariate (logistic regression and least squares support vector machines (LS-SVM)) approaches in the training and test data sets separately to validate our findings. In the training set, two models of four biomarkers (Model 1: annexin V, VEGF, CA-125 and glycodelin; Model 2: annexin V, VEGF, CA-125 and sICAM-1) analysed in plasma obtained during the menstrual phase could predict US-negative endometriosis with a high sensitivity (81-90%) and an acceptable specificity (68-81%). The same two models predicted US-negative endometriosis in the independent validation test set with a high sensitivity (82%) and an acceptable specificity (63-75%). In plasma samples obtained during menstruation, multivariate analysis of four biomarkers (annexin V, VEGF, CA-125 and either sICAM-1 or glycodelin) enabled the diagnosis of endometriosis undetectable by US with a sensitivity of 81-90% and a specificity of 63-81% in independent training and test data sets. The next step is to apply these models for preoperative prediction of endometriosis in an independent set of patients with infertility and/or pain without US evidence of endometriosis, scheduled for laparoscopy.
View details for DOI 10.1093/humrep/des234
View details for Web of Science ID 000307502000016
View details for PubMedID 22736326
Combined mRNA microarray and proteomic analysis of eutopic endometrium of women with and without endometriosis.
2012; 27 (7): 2020-2029
An early semi-invasive diagnosis of endometriosis has the potential to allow early treatment and minimize disease progression, but no such test is available at present. Our aim was to perform a combined mRNA microarray and proteomic analysis on the same eutopic endometrium sample obtained from patients with and without endometriosis. mRNA and protein fractions were extracted from 49 endometrial biopsies obtained from women with laparoscopically proven presence (n = 31) or absence (n = 18) of endometriosis during the early luteal (n = 27) or menstrual phase (n = 22) and analyzed using microarray and proteomic surface enhanced laser desorption ionization-time of flight mass spectrometry, respectively. Proteomic data were analyzed using a least squares-support vector machines (LS-SVM) model built on 70% of the samples (training set) and evaluated on the remaining 30% (test set). mRNA analysis of eutopic endometrium did not show any differentially expressed genes in women with endometriosis when compared with controls, regardless of endometriosis stage or cycle phase. mRNA was differentially expressed (P < 0.05) in women with (925 genes) and without endometriosis (1087 genes) during the menstrual phase when compared with the early luteal phase. Proteomic analysis based on five peptide peaks [2072 mass/charge (m/z); 2973 m/z; 3623 m/z; 3680 m/z and 21133 m/z] using an LS-SVM model applied on the luteal phase endometrium training set allowed the diagnosis of endometriosis (sensitivity, 91%; 95% confidence interval (CI): 74-98; specificity, 80%; 95% CI: 66-97; positive predictive value, 87.9%; negative predictive value, 84.8%) in the test set. mRNA expression of eutopic endometrium was comparable in women with and without endometriosis but different in menstrual endometrium when compared with luteal endometrium in women with endometriosis. Proteomic analysis of luteal phase endometrium allowed the diagnosis of endometriosis with high sensitivity and specificity in training and test sets. A potential limitation of our study is the fact that our control group included women with a normal pelvis as well as women with concurrent pelvic disease (e.g. fibroids, benign ovarian cysts, hydrosalpinges), which may have contributed to the comparable mRNA expression profile in the eutopic endometrium of women with endometriosis and controls.
View details for DOI 10.1093/humrep/des127
View details for PubMedID 22556377
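The sensitivity, specificity, and predictive values reported in the abstract above all derive from a two-by-two confusion table. The following Python sketch shows that arithmetic only; the function name and the counts in the usage line are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard diagnostic metrics from confusion-table counts.

    tp/fn: diseased patients classified as positive/negative.
    tn/fp: disease-free patients classified as negative/positive.
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of diseased detected
        "specificity": tn / (tn + fp),  # fraction of disease-free cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only (not data from the paper).
example = diagnostic_metrics(tp=45, fp=20, tn=30, fn=5)
```

Unlike sensitivity and specificity, the predictive values depend on how many diseased and disease-free patients are in the evaluated set, so they do not transfer directly to populations with a different prevalence.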
Proteomics Analysis of Plasma for Early Diagnosis of Endometriosis
OBSTETRICS AND GYNECOLOGY
2012; 119 (2): 276-285
To test the hypothesis that differential surface-enhanced laser desorption/ionization time-of-flight mass spectrometry protein or peptide expression in plasma can be used in infertile women with or without pelvic pain to predict the presence of laparoscopically and histologically confirmed endometriosis, especially in the subpopulation with a normal preoperative gynecologic ultrasound examination. Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry analysis was performed on 254 plasma samples obtained from 89 women without endometriosis and 165 women with endometriosis (histologically confirmed) undergoing laparoscopies for infertility with or without pelvic pain. Data were analyzed using least squares support vector machines and were divided randomly (100 times) into a training data set (70%) and a test data set (30%). Minimal-to-mild endometriosis was best predicted (sensitivity 75%, 95% confidence interval [CI] 63-89; specificity 86%, 95% CI 71-94; positive predictive value 83.6%, negative predictive value 78.3%) using a model based on five peptide and protein peaks (range 4.898-14.698 m/z) in menstrual phase samples. Moderate-to-severe endometriosis was best predicted (sensitivity 98%, 95% CI 84-100; specificity 81%, 95% CI 67-92; positive predictive value 74.4%, negative predictive value 98.6%) using a model based on five other peptide and protein peaks (range 2.189-7.457 m/z) in luteal phase samples. The peak with the highest intensity (2.189 m/z) was identified as a fibrinogen ?-chain peptide. Ultrasonography-negative endometriosis was best predicted (sensitivity 88%, 95% CI 73-100; specificity 84%, 95% CI 71-96) using a model based on five peptide peaks (range 2.058-42.065 m/z) in menstrual phase samples. A noninvasive test using proteomic analysis of plasma samples obtained during the menstrual phase enabled the diagnosis of endometriosis undetectable by ultrasonography with high sensitivity and specificity. Level of evidence: II.
View details for DOI 10.1097/AOG.0b013e31823fda8d
View details for Web of Science ID 000299604300012
View details for PubMedID 22270279
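The abstract above describes dividing the 254 samples randomly, 100 times, into a 70% training set and a 30% test set. A minimal Python sketch of such repeated splitting is given below. The function name, the use of integer sample IDs, the seeding scheme, and the rounding rule for the cut point are assumptions made for illustration and are not details from the paper.

```python
import random


def split_70_30(sample_ids, seed):
    """Randomly split sample IDs into 70% training and 30% test.

    The cut point is rounded to the nearest integer (an assumption).
    """
    rng = random.Random(seed)      # seeded for reproducibility
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# 100 independent random splits of 254 hypothetical sample IDs.
splits = [split_70_30(range(254), seed=i) for i in range(100)]
train, test = splits[0]
```

Repeating the split lets performance be summarized across many train/test partitions instead of depending on a single, possibly lucky, division of the data.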
Atypical Neurofibromas in Neurofibromatosis Type 1 are Premalignant Tumors
GENES CHROMOSOMES & CANCER
2011; 50 (12): 1021-1032
Benign peripheral nerve sheath tumors (PNSTs) are a characteristic feature of neurofibromatosis type I (NF1) patients. NF1 individuals have an 8-13% lifetime risk of developing a malignant PNST (MPNST). Atypical neurofibromas are symptomatic, hypercellular PNSTs, composed of cells with hyperchromatic nuclei in the absence of mitoses. Little is known about the origin and nature of atypical neurofibromas in NF1 patients. In this study, we classified the atypical neurofibromas in the spectrum of NF1-associated PNSTs by analyzing 65 tumor samples from 48 NF1 patients. We compared tumor-specific chromosomal copy number alterations between benign neurofibromas, atypical neurofibromas, and MPNSTs (low-, intermediate-, and high-grade) by karyotyping and microarray-based comparative genome hybridization (aCGH). In 15 benign neurofibromas (4 subcutaneous and 11 plexiform), no copy number alterations were found, except a single event in a plexiform neurofibroma. One highly significant recurrent aberration (15/16) was identified in the atypical neurofibromas, namely a deletion with a minimal overlapping region (MOR) in chromosome band 9p21.3, including CDKN2A and CDKN2B. Copy number loss of the CDKN2A/B gene locus was one of the most common events in the group of MPNSTs, with deletions in low-, intermediate-, and high-grade MPNSTs. In one tumor, we observed a clear transition from a benign-atypical neurofibroma toward an intermediate-grade MPNST, confirmed by both histopathology and aCGH analysis. These data support the hypothesis that atypical neurofibromas are premalignant tumors, with the CDKN2A/B deletion as the first step in the progression toward MPNST.
View details for DOI 10.1002/gcc.20921
View details for Web of Science ID 000296443600005
View details for PubMedID 21987445
Prediction of lymph node involvement in breast cancer from primary tumor tissue using gene expression profiling and miRNAs
BREAST CANCER RESEARCH AND TREATMENT
2011; 129 (3): 767-776
The aim of this study was to investigate whether lymph node involvement in breast cancer is influenced by gene or miRNA expression of the primary tumor. For this purpose, we selected a very homogeneous patient population to minimize heterogeneity in other tumor and patient characteristics. First, we compared gene expression profiles of primary tumor tissue from a group of 96 breast cancer patients balanced for lymph node involvement using the Affymetrix Human U133 Plus 2.0 microarray chip. A model was built by weighted Least-Squares Support Vector Machines and validated on an internal and external dataset. Next, miRNA profiling was performed on a subset of 82 tumors using Human MiRNA-microarray chips (Illumina). Finally, for each miRNA the number of significant inversely correlated targets was determined and compared with 1000 sets of randomly chosen targets. A model based on 241 genes was built (AUC 0.66). The AUC was 0.646 for the internal dataset and 0.651 for the external datasets. The model includes multiple kinases, apoptosis-related, and zinc ion-binding genes. Integration of the microarray and miRNA data reveals ten miRNAs suppressing lymph node invasion and one miRNA promoting lymph node invasion. Our results provide evidence that measurable differences in gene and miRNA expression exist between node negative and node positive patients and thus that lymph node involvement is not a genetically random process. Moreover, our data suggest a general deregulation of the miRNA machinery that is potentially responsible for lymph node invasion.
View details for DOI 10.1007/s10549-010-1265-5
View details for Web of Science ID 000294680600010
View details for PubMedID 21116709
Ectopic pregnancy: using the hCG ratio to select women for expectant or medical management
ACTA OBSTETRICIA ET GYNECOLOGICA SCANDINAVICA
2011; 90 (3): 264-272
To identify variables that can be used to select women with an ectopic pregnancy for expectant or medical management with systemic methotrexate. Cohort study. Early Pregnancy Unit of a London teaching hospital. Women with a tubal ectopic pregnancy managed non-surgically. The diagnosis of tubal ectopic pregnancy was made using transvaginal sonography. Human chorionic gonadotrophin (hCG) levels had to be taken at 0 hour and 48 hours pre-treatment. Other recorded variables include presenting complaints, gestational age, progesterone levels, size of the ectopic mass and appearance of the ectopic on transvaginal sonography. Women were followed up until the outcome (success or failure) of management was known. Univariable analysis was performed to identify the variables associated with successful management using areas under the curve and relative risks. Thirty-nine women underwent expectant management (overall success rate 71.8%) and 42 had medical management (overall success rate 76.2%). The pre-treatment hCG ratio (hCG 48 hours/hCG 0 hour) was related to the failure of both expectant (area under curve 0.86, 95% CI 0.67-0.94) and medical (area under curve 0.79, 95% CI 0.58-0.90) management. History of ectopic pregnancy was related to failure of expectant management only (relative risk 0.46, 95% CI 0.16-0.92). The most important variable for predicting the likelihood of successful non-surgical management was the pre-treatment hCG ratio. New studies are required to validate the use of this variable and of history of ectopic pregnancy to predict the likelihood of successful non-surgical management in clinical practice.
View details for DOI 10.1111/j.1600-0412.2010.01053.x
View details for Web of Science ID 000288825600010
View details for PubMedID 21306315
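The pre-treatment hCG ratio in the abstract above is defined as the hCG level at 48 hours divided by the level at 0 hour. The short Python sketch below shows that calculation; the function name and the example values are hypothetical, and no clinical decision threshold is implied because the abstract does not state one.

```python
def hcg_ratio(hcg_0h: float, hcg_48h: float) -> float:
    """Return the pre-treatment hCG ratio: hCG at 48 hours / hCG at 0 hour."""
    if hcg_0h <= 0:
        raise ValueError("hCG at 0 hour must be positive")
    return hcg_48h / hcg_0h


# Hypothetical measurements for illustration only.
ratio = hcg_ratio(hcg_0h=1000.0, hcg_48h=800.0)
```

A ratio below 1 indicates that hCG fell over the 48 hours before treatment, and a ratio above 1 indicates that it rose.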
Evaluation of endometrial biomarkers for semi-invasive diagnosis of endometriosis
FERTILITY AND STERILITY
2011; 95 (4): 1338-U173
To test the hypothesis that specific proteins and peptides are expressed differentially in eutopic endometrium of women with and without endometriosis and at specific stages of the disease (minimal, mild, moderate, or severe) during the secretory phase. Patients with endometriosis were compared with controls. University hospital. A total of 29 patients during the secretory phase were selected for this study on the basis of cycle phase and presence or absence of endometriosis. Endometriosis was confirmed laparoscopically and histologically in 19 patients with endometriosis of revised American Society for Reproductive Medicine stages (9 minimal-mild and 10 moderate-severe), and the presence of a normal pelvis was documented by laparoscopy in 10 controls. Protein expression of endometrium was evaluated with use of surface-enhanced laser desorption/ionization time-of-flight mass spectrometry. The differential expression of protein mass peaks was analyzed with use of support vector machine algorithms and logistic regression models. Data preprocessing resulted in differential expression of 73, 30, and 131 mass peaks between controls and patients with endometriosis (all stages), with minimal-mild endometriosis, and with moderate-severe endometriosis, respectively. Endometriosis was diagnosed with high sensitivity (89.5%) and specificity (90%) with use of five down-regulated mass peaks (1.949 kDa, 5.183 kDa, 8.650 kDa, 8.659 kDa, and 13.910 kDa) obtained after support vector machine ranking and logistic regression classification. With use of a similar analysis, minimal-mild endometriosis was diagnosed with four mass peaks (two up-regulated: 35.956 kDa and 90.675 kDa and two down-regulated: 1.924 kDa and 2.504 kDa) with maximal sensitivity (100%) and specificity (100%). The 90.675-kDa and 35.956-kDa mass peaks were identified as T-plastin and annexin V, respectively. Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry analysis of secretory phase endometrium combined with bioinformatics puts forward a prospective panel of potential biomarkers with sensitivity of 100% and specificity of 100% for the diagnosis of minimal to mild endometriosis.
View details for DOI 10.1016/j.fertnstert.2010.06.084
View details for Web of Science ID 000288010900024
View details for PubMedID 20800833
TRIzol treatment of secretory phase endometrium allows combined proteomic and mRNA microarray analysis of the same sample in women with and without endometriosis
REPRODUCTIVE BIOLOGY AND ENDOCRINOLOGY
According to mRNA microarray, proteomics and other studies, biological abnormalities of eutopic endometrium (EM) are involved in the pathogenesis of endometriosis, but the relationship between mRNA and protein expression in EM is not clear. We tested for the first time the hypothesis that EM TRIzol extraction allows proteomic Surface Enhanced Laser Desorption/Ionisation Time-of-Flight Mass Spectrometry (SELDI-TOF MS) analysis and that these proteomic data can be related to mRNA (microarray) data obtained from the same EM sample from women with and without endometriosis. Proteomic analysis was performed using SELDI-TOF MS of TRIzol-extracted EM obtained during secretory phase from patients without endometriosis (n = 6), patients with minimal-mild (n = 5) and with moderate-severe endometriosis (n = 5), classified according to the system of the American Society of Reproductive Medicine. Proteomic data were compared to mRNA microarray data obtained from the same EM samples. In our SELDI-TOF MS study 32 peaks were differentially expressed in endometrium of all women with endometriosis (stages I-IV) compared with all controls during the secretory phase. Comparison of proteomic results with those from microarray revealed no corresponding genes/proteins. TRIzol treatment of secretory phase EM allows combined proteomic and mRNA microarray analysis of the same sample, but comparison between proteomic and microarray data was not evident, probably due to post-translational modifications.
View details for DOI 10.1186/1477-7827-8-123
View details for Web of Science ID 000284485100001
View details for PubMedID 20964823
Improved Microarray-Based Decision Support with Graph Encoded Interactome Data
2010; 5 (4)
In the past, microarray studies have been criticized due to noise and the limited overlap between gene signatures. Prior biological knowledge should therefore be incorporated as side information in models based on gene expression data to improve the accuracy of diagnosis and prognosis in cancer. As prior knowledge, we investigated interaction and pathway information from the human interactome on different aspects of biological systems. By exploiting the properties of kernel methods, relations between genes with similar functions but active in alternative pathways could be incorporated in a support vector machine classifier based on spectral graph theory. Using 10 microarray data sets, we first reduced the number of data sources relevant for multiple cancer types and outcomes. Three sources on metabolic pathway information (KEGG), protein-protein interactions (OPHID) and miRNA-gene targeting (microRNA.org) outperformed the other sources with regard to the considered class of models. Both fixed and adaptive approaches were subsequently considered to combine the three corresponding classifiers. Averaging the predictions of these classifiers performed best and was significantly better than the model based on microarray data only. These results were confirmed on 6 validation microarray sets, with a significantly improved performance in 4 of them. Integrating interactome data thus improves classification of cancer outcome for the investigated microarray technologies and cancer types. Moreover, this strategy can be incorporated in any kernel method or non-linear version of a non-kernel method.
View details for DOI 10.1371/journal.pone.0010225
View details for Web of Science ID 000276853800015
View details for PubMedID 20419106
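The abstract above reports that averaging the predictions of the three source-specific classifiers performed best. The Python sketch below illustrates only that fixed averaging step; the function name and all scores are hypothetical, and the classifiers themselves (kernel-based support vector machines in the paper) are not reproduced here.

```python
from statistics import mean


def average_predictions(score_lists):
    """Average per-sample scores across several classifiers.

    score_lists: one list of scores per classifier, all in the same
    sample order and of equal length.
    """
    if len({len(scores) for scores in score_lists}) != 1:
        raise ValueError("all classifiers must score the same samples")
    # zip(*...) groups the scores that belong to the same sample.
    return [mean(per_sample) for per_sample in zip(*score_lists)]


# Hypothetical scores for three samples (not data from the paper),
# one list per prior-knowledge source.
kegg_scores = [0.9, 0.2, 0.6]
ophid_scores = [0.7, 0.4, 0.5]
mirna_scores = [0.8, 0.3, 0.1]
combined = average_predictions([kegg_scores, ophid_scores, mirna_scores])
```

Equal weighting needs no extra parameters to be tuned, which is one plausible reason a fixed average can hold up well against adaptive combination schemes on small data sets.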
Non-invasive diagnosis of endometriosis based on a combined analysis of six plasma biomarkers
2010; 25 (3): 654-664
Lack of a non-invasive diagnostic test contributes to the long delay between onset of symptoms and diagnosis of endometriosis. The aim of this study was to evaluate the combined performance of six potential plasma biomarkers in the diagnosis of endometriosis. This case-control study was conducted in 294 infertile women, consisting of 93 women with a normal pelvis and 201 women with endometriosis. We measured plasma concentrations of interleukin (IL)-6, IL-8, tumour necrosis factor-alpha, high-sensitivity C-reactive protein (hsCRP), and cancer antigens CA-125 and CA-19-9. Analyses were done using the Kruskal-Wallis test, Mann-Whitney test, receiver operator characteristic, stepwise logistic regression and least squares support vector machines (LSSVM). Plasma levels of IL-6, IL-8 and CA-125 were increased in all women with endometriosis and in those with minimal-mild endometriosis, compared with controls. In women with moderate-severe endometriosis, plasma levels of IL-6, IL-8 and CA-125, but also of hsCRP, were significantly higher than in controls. Using stepwise logistic regression, moderate-severe endometriosis was diagnosed with a sensitivity of 100% (specificity 84%) and minimal-mild endometriosis was detected with a sensitivity of 87% (specificity 71%) during the secretory phase. Using LSSVM analysis, minimal-mild endometriosis was diagnosed with a sensitivity of 94% (specificity 61%) during the secretory phase and with a sensitivity of 92% (specificity 63%) during the menstrual phase. Advanced statistical analysis of a panel of six selected plasma biomarkers on samples obtained during the secretory phase or during menstruation allows the diagnosis of both minimal-mild and moderate-severe endometriosis with high sensitivity and clinically acceptable specificity.
View details for DOI 10.1093/humrep/dep425
View details for Web of Science ID 000274490700014
View details for PubMedID 20007161
A taxonomy of epithelial human cancer and their metastases
BMC MEDICAL GENOMICS
Microarray technology has allowed the molecular characterization of many different cancer sites. This technology has the potential to individualize therapy and to discover new drug targets. However, due to technological differences and issues in standardized sample collection, no study has evaluated the molecular profile of epithelial human cancer in a large number of samples and tissues. Additionally, it has not yet been extensively investigated whether metastases resemble their tissue of origin or tissue of destination. We studied the expression profiles of a series of 1566 primary tumors and 178 metastases by unsupervised hierarchical clustering. The clustering profile was subsequently investigated and correlated with clinico-pathological data. Statistical enrichment of clinico-pathological annotations of groups of samples was investigated using Fisher exact test. Gene set enrichment analysis (GSEA) and DAVID functional enrichment analysis were used to investigate the molecular pathways. Kaplan-Meier survival analysis and log-rank tests were used to investigate prognostic significance of gene signatures. Large clusters corresponding to breast, gastrointestinal, ovarian and kidney primary tissues emerged from the data. Chromophobe renal cell carcinoma clustered together with follicular differentiated thyroid carcinoma, which supports recent morphological descriptions of thyroid follicular carcinoma-like tumors in the kidney and suggests that they represent a subtype of chromophobe carcinoma. We also found an expression signature identifying primary tumors of squamous cell histology in multiple tissues. Next, a subset of ovarian tumors enriched with endometrioid histology clustered together with endometrium tumors, confirming that they share their etiopathogenesis, which strongly differs from serous ovarian tumors. In addition, the clustering of colon and breast tumors correlated with clinico-pathological characteristics. Moreover, a signature was developed based on our unsupervised clustering of breast tumors and this was predictive for disease-specific survival in three independent studies. Next, the metastases from ovarian, breast, lung and vulva cluster with their tissue of origin while metastases from colon showed a bimodal distribution. A significant part clusters with tissue of origin while the remaining tumors cluster with the tissue of destination. Our molecular taxonomy of epithelial human cancer indicates surprising correlations over tissues. This may have a significant impact on the classification of many cancer sites and may guide pathologists, both in research and daily practice. Moreover, these results based on unsupervised analysis yielded a signature predictive of clinical outcome in breast cancer. Additionally, we hypothesize that metastases from gastrointestinal origin either remember their tissue of origin or adapt to the tissue of destination. More specifically, colon metastases in the liver show strong evidence for such a bimodal tissue specific profile.
View details for DOI 10.1186/1755-8794-2-69
View details for Web of Science ID 000273595600001
View details for PubMedID 20017941
Density of small diameter sensory nerve fibres in endometrium: a semi-invasive diagnostic test for minimal to mild endometriosis
2009; 24 (12): 3025-3032
The aim of our study was to test the hypothesis that multiple-sensory small-diameter nerve fibres are present in a higher density in endometrium from patients with endometriosis when compared with women with a normal pelvis, enabling the development of a semi-invasive diagnostic test for minimal-mild endometriosis. Secretory phase endometrium samples (n = 40), obtained from women with laparoscopically/histologically confirmed minimal-mild endometriosis (n = 20) and from women with a normal pelvis (n = 20), were selected from the biobank at the Leuven University Fertility Centre. Immunohistochemistry was performed to localize neural markers for sensory C, Aδ, adrenergic and cholinergic nerve fibres in the functional layer of the endometrium. Sections were immunostained with anti-human protein gene product 9.5 (PGP9.5), anti-neurofilament protein, anti-substance P (SP), anti-vasoactive intestinal peptide (VIP), anti-neuropeptide Y and anti-calcitonin gene-related polypeptide. Statistical analysis was done using the Mann-Whitney U-test, receiver operator characteristic analysis, stepwise logistic regression and least-squares support vector machines. The density of small nerve fibres was approximately 14 times higher in endometrium from patients with minimal-mild endometriosis (1.96 ± 2.73) when compared with women with a normal pelvis (0.14 ± 0.46, P < 0.0001). The combined analysis of neural markers PGP9.5, VIP and SP could predict the presence of minimal-mild endometriosis with 95% sensitivity, 100% specificity and 97.5% accuracy. To confirm our findings, prospective studies are required.
View details for DOI 10.1093/humrep/dep283
View details for Web of Science ID 000272069500009
View details for PubMedID 19690351
Molecular Response to Cetuximab and Efficacy of Preoperative Cetuximab-Based Chemoradiation in Rectal Cancer
JOURNAL OF CLINICAL ONCOLOGY
2009; 27 (17): 2751-2757
To characterize the molecular pathways activated or inhibited by cetuximab when combined with chemoradiotherapy (CRT) in rectal cancer and to identify molecular profiles and biomarkers that might improve patient selection for such treatments. Forty-one patients with rectal cancer (T3-4 and/or N+) received preoperative radiotherapy (1.8 Gy, 5 days/wk, 45 Gy) in combination with capecitabine and cetuximab (400 mg/m2 as initial dose 1 week before CRT followed by 250 mg/m2/wk for 5 weeks). Biopsies and plasma samples were taken before treatment, after cetuximab but before CRT, and at the time of surgery. Proteomics and microarrays were used to monitor the molecular response to cetuximab and to identify profiles and biomarkers to predict treatment efficacy. Cetuximab on its own downregulated genes involved in proliferation and invasion and upregulated inflammatory gene expression, with 16 genes being significantly influenced in microarray analysis. The decrease in proliferation was confirmed by immunohistochemistry for Ki67 (P = .01) and was accompanied by an increase in transforming growth factor-alpha in plasma samples (P < .001). Disease-free survival (DFS) was better in patients if epidermal growth factor receptor expression was upregulated in the tumor after the initial cetuximab dose (P = .02) and when fibro-inflammatory changes were present in the surgical specimen (P = .03). Microarray and proteomic profiles were predictive of DFS. Our study showed that a single dose of cetuximab has a significant impact on the expression of genes involved in tumor proliferation and inflammation. We identified potential biomarkers that might predict response to cetuximab-based CRT.
View details for DOI 10.1200/JCO.2008.18.5033
View details for Web of Science ID 000266782100005
View details for PubMedID 19332731
Prediction of cancer outcome using DNA microarray technology: past, present and future.
Expert opinion on medical diagnostics
2009; 3 (2): 157-165
Background: The use of DNA microarray technology to predict cancer outcome already has a history of almost a decade. Although many breakthroughs have been made, the promise of individualized therapy is still not fulfilled. In addition, new technologies are emerging that also show promise in outcome prediction of cancer patients. Objective: The impact of DNA microarray and other 'omics' technologies on the outcome prediction of cancer patients was investigated. Whether integration of omics data results in better predictions was also examined. Methods: DNA microarray technology was taken as a starting point because it is considered the most mature of all omics technologies. Next, emerging technologies that may accomplish the same goals but have been less extensively studied are described. Conclusion: Besides DNA microarray technology, other omics technologies have shown promise in predicting cancer outcome or have the potential to replace microarray technology in the near future. Moreover, it is shown that integration of multiple omics data can result in better predictions of cancer outcome; however, owing to the lack of comprehensive studies, validation studies are required to verify which omics technology carries the most information and whether a combination of multiple omics data improves predictive performance.
View details for DOI 10.1517/17530050802680172
View details for PubMedID 23485162
A kernel-based integration of genome-wide data for clinical decision support.
2009; 1 (4): 39-?
Although microarray technology allows the investigation of the transcriptomic make-up of a tumor in one experiment, the transcriptome does not completely reflect the underlying biology due to alternative splicing, post-translational modifications, as well as the influence of pathological conditions (for example, cancer) on transcription and translation. This increases the importance of fusing more than one source of genome-wide data, such as the genome, transcriptome, proteome, and epigenome. The current increase in the amount of available omics data emphasizes the need for a methodological integration framework. We propose a kernel-based approach for clinical decision support in which many genome-wide data sources are combined. Integration occurs within the patient domain at the level of kernel matrices before building the classifier. As supervised classification algorithm, a weighted least squares support vector machine is used. We apply this framework to two cancer cases, namely, a rectal cancer data set containing microarray and proteomics data and a prostate cancer data set containing microarray and genomics data. For both cases, multiple outcomes are predicted. For the rectal cancer outcomes, the highest leave-one-out (LOO) areas under the receiver operating characteristic curves (AUC) were obtained when combining microarray and proteomics data gathered during therapy and ranged from 0.927 to 0.987. For prostate cancer, all four outcomes had a better LOO AUC when combining microarray and genomics data, ranging from 0.786 for recurrence to 0.987 for metastasis. For both cancer sites the prediction of all outcomes improved when more than one genome-wide data set was considered. This suggests that integrating multiple genome-wide data sources increases the predictive performance of clinical decision support models. This emphasizes the need for comprehensive multi-modal data.
We acknowledge that, in a first phase, this will substantially increase costs; however, this is a necessary investment to ultimately obtain cost-efficient models usable in patient tailored therapy.
View details for DOI 10.1186/gm39
View details for PubMedID 19356222
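The integration step described in the abstract above (combining data sources at the level of kernel matrices, in the patient domain, before a classifier is built) can be illustrated with a minimal sketch. The linear kernel, the unit-diagonal normalization, the equal weights, and the toy patient data are all assumptions made for illustration; the abstract does not specify them, and the weighted least squares support vector machine that the paper trains on the combined kernel is not shown.

```python
import math

def linear_kernel(rows):
    """Return the Gram matrix K[i][j] = <x_i, x_j> for a list of patient feature vectors."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in rows] for xi in rows]

def normalize(kernel):
    """Rescale a kernel so each patient has unit self-similarity (K[i][i] == 1)."""
    n = len(kernel)
    diag = [math.sqrt(kernel[i][i]) for i in range(n)]
    return [[kernel[i][j] / (diag[i] * diag[j]) for j in range(n)] for i in range(n)]

def integrate(kernels, weights=None):
    """Combine per-source kernels into one patient-by-patient kernel via a weighted sum."""
    weights = weights or [1.0 / len(kernels)] * len(kernels)
    n = len(kernels[0])
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]

# Hypothetical toy data: 3 patients, 4 expression values and 2 protein values each.
microarray = [[1.0, 0.5, 0.2, 0.0], [0.9, 0.4, 0.3, 0.1], [0.0, 0.1, 0.8, 1.0]]
proteomics = [[2.0, 1.0], [1.8, 1.1], [0.2, 3.0]]

# One kernel per data source, normalized so neither source dominates the sum.
combined = integrate([normalize(linear_kernel(microarray)),
                      normalize(linear_kernel(proteomics))])
```

Because every source is reduced to a patient-by-patient matrix of the same size, sources with very different numbers of features can be combined without concatenating raw measurements.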
Building decision trees for diagnosing intracavitary uterine pathology.
Facts, views & vision in ObGyn
2009; 1 (3): 182-188
To build decision trees to predict intrauterine disease, based on a clinical data set, and using mathematical software. Diagnostic algorithms were built and validated using the data of 402 consecutive patients who underwent grey scale ultrasound, followed by colour Doppler, saline infusion sonography (SIS), office hysteroscopy and endometrial sampling. The "final diagnosis" was classified as "abnormal" in case of endometrial polyps, hyperplasia or malignancy or intracavitary myoma. "Pre-test parameters" included patient's age, weight, length, parity, menopausal status, bleeding symptoms and cervical cytology; "post-test parameters" included ultrasound-, color Doppler-, SIS-, hysteroscopy- findings and histology results after endometrial sampling. Decision Tree #1 was built using both "pre-test" and "post-test" parameters; Tree #2 was only based on "post-test" parameters; Tree #3 was designed without using the hysteroscopy variables. The Waikato Environment for Knowledge Analysis (Weka) software was used for the development of decision trees. All trees started with an imaging technique: hysteroscopy or SIS. The diagnostic accuracy was 88.3%, 88.3% and 84.0% for Tree #1, #2 and #3 respectively; the sensitivity and specificity were 95.5% and 82%, 97.7% and 80.0%, and 93.2% and 76.0%, respectively. The method used in this study enables the comparison between different decision trees containing multiple tests.
View details for PubMedID 25489463
- A kernel-based integration of genome-wide data for clinical decision support GENOME MEDICINE 2009; 1
SUPERVISED CLASSIFICATION OF ARRAY CGH DATA WITH HMM-BASED FEATURE SELECTION
PACIFIC SYMPOSIUM ON BIOCOMPUTING 2009
For different tumour types, extended knowledge about the molecular mechanisms involved in tumorigenesis is lacking. Looking for copy number variations (CNV) by Comparative Genomic Hybridization (CGH) can help however to determine key elements in this tumorigenesis. As genome-wide array CGH gives the opportunity to evaluate CNV at high resolution, this leads to huge amount of data, necessitating adequate mathematical methods to carefully select and interpret these data. Two groups of patients differing in cancer subtype were defined in two publicly available array CGH data sets as well as in our own data set on ovarian cancer. Chromosomal regions characterizing each group of patients were gathered using recurrent hidden Markov Models (HMM). The differential regions were reduced to a subset of features for classification by integrating different univariate feature selection methods. Weighted Least Squares Support Vector Machines (LS-SVM), a supervised classification method which takes unbalancedness of data sets into account, resulted in leave-one-out or 10-fold cross-validation accuracies ranging from 88 to 95.5%. The combination of recurrent HMMs for the detection of copy number alterations with LS-SVM classifiers offers a novel methodological approach for classification based on copy number alterations. Additionally, this approach limits the chromosomal regions that are necessary to classify patients according to cancer subtype.
View details for Web of Science ID 000263639700045
View details for PubMedID 19209723
Pain experienced during transvaginal ultrasound, saline contrast sonohysterography, hysteroscopy and office sampling: a comparative study
ULTRASOUND IN OBSTETRICS & GYNECOLOGY
2008; 31 (3): 346-351
To evaluate and compare the pain experienced by women during transvaginal ultrasound, saline contrast sonohysterography (SCSH), diagnostic hysteroscopy and office sampling. This was a descriptive study of 402 consecutive patients presenting at a 'one-stop' Bleeding Clinic between October 2004 and November 2006. Thirty-nine percent of the patients were postmenopausal. The patients underwent the following examinations transvaginally: first ultrasound with color Doppler, second SCSH, third diagnostic hysteroscopy and fourth endometrial biopsy. After completion of the examinations the patients were asked to complete a questionnaire including a visual analog scale (VAS) about their subjective appreciation of all four examinations. Two-hundred and ninety-three (72%) patients returned the questionnaire. The median (range) VAS scores for transvaginal ultrasound, SCSH, diagnostic hysteroscopy and endometrial sampling were 1.0 (0-8.1), 2.2 (0-10), 2.7 (0-10) and 5.1 (0-10), respectively (P < 0.0001). The patients' answers to the other questions about the pain experienced, including comparison with other minor procedures such as venous blood sampling, were all concordant with the VAS scores. Transvaginal ultrasound was the procedure best accepted, followed by SCSH, hysteroscopy and endometrial sampling. These results suggest that patients would prefer SCSH over hysteroscopy as an initial diagnostic approach in the evaluation of abnormal uterine bleeding.
View details for DOI 10.1002/uog.5263
View details for Web of Science ID 000254541900019
View details for PubMedID 18307203
Expression profiling to predict the clinical behaviour of ovarian cancer fails independent evaluation
In a previously published pilot study we explored the performance of microarrays in predicting clinical behaviour of ovarian tumours. For this purpose we performed microarray analysis on 20 patients and estimated that we could predict advanced stage disease with 100% accuracy and the response to platin-based chemotherapy with 76.92% accuracy using leave-one-out cross validation techniques in combination with Least Squares Support Vector Machines (LS-SVMs). In the current study we evaluate whether tumour characteristics in an independent set of 49 patients can be predicted using the pilot data set with principal component analysis or LS-SVMs. The results of the principal component analysis suggest that the gene expression data from stage I, platin-sensitive advanced stage and platin-resistant advanced stage tumours in the independent data set did not correspond to their respective classes in the pilot study. Additionally, LS-SVM models built using the data from the pilot study - although they only misclassified one of four stage I tumours and correctly classified all 45 advanced stage tumours - were not able to predict resistance to platin-based chemotherapy. Furthermore, models based on the pilot data and on previously published gene sets related to ovarian cancer outcomes, did not perform significantly better than our models. We discuss possible reasons for failure of the model for predicting response to platin-based chemotherapy and conclude that existing results based on gene expression patterns of ovarian tumours need to be thoroughly scrutinized before these results can be accepted to reflect the true performance of microarray technology.
View details for DOI 10.1186/1471-2407-8-18
View details for Web of Science ID 000253596800002
View details for PubMedID 18211668
Integrating microarray and proteomics data to predict the response on cetuximab in patients with rectal cancer.
Pacific Symposium on Biocomputing
To investigate the combination of cetuximab, capecitabine and radiotherapy in the preoperative treatment of patients with rectal cancer, forty tumour samples were gathered before treatment (T0), after one dose of cetuximab but before radiotherapy with capecitabine (T1) and at the moment of surgery (T2). The tumour and plasma samples were subjected at all timepoints to Affymetrix microarray and Luminex proteomics analysis, respectively. At surgery, the Rectal Cancer Regression Grade (RCRG) was registered. We used a kernel-based method with Least Squares Support Vector Machines to predict RCRG based on the integration of microarray and proteomics data at T0 and T1. We demonstrated that combining multiple data sources improves the predictive power. The best model was based on 5 genes and 10 proteins at T0 and T1 and could predict the RCRG with an accuracy of 91.7%, sensitivity of 96.2% and specificity of 80%.
View details for PubMedID 18229684
Integration of microarray and textual data improves the prognosis prediction of breast, lung and ovarian cancer patients.
Pacific Symposium on Biocomputing
Microarray data are notoriously noisy such that models predicting clinically relevant outcomes often contain many false positive genes. Integration of other data sources can alleviate this problem and enhance gene selection and model building. Probabilistic models provide a natural solution to integrate information by using the prior over model space. We investigated if the use of text information from PUBMED abstracts in the structure prior of a Bayesian network could improve the prediction of the prognosis in cancer. Our results show that prediction of the outcome with the text prior was significantly better compared to not using a prior, both on a well known microarray data set and on three independent microarray data sets.
View details for PubMedID 18229693
A framework for elucidating regulatory networks based on prior information and expression data
REVERSE ENGINEERING BIOLOGICAL NETWORKS
2007; 1115: 240-248
Elucidating regulatory networks is an intensively studied topic in bioinformatics. Integration of different sources of information could facilitate this task. We propose to incorporate these information sources in the structure prior of a Bayesian network. We are currently investigating two complementary sources of information: PubMed abstracts combined with publicly available taxonomies or ontologies, and known protein-DNA interactions. These priors, either separately or combined, have the potential of reducing the complexity of reverse-engineering regulatory networks while creating more robust and reliable models. Moreover this approach can easily be extended with other data sources. In such a way Bayesian networks provide a powerful framework for data integration and regulatory network modeling.
View details for DOI 10.1196/annals.1407.002
View details for Web of Science ID 000252037600017
View details for PubMedID 17925352
Integration of clinical and microarray data with kernel methods
2007 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-16
Currently, the clinical management of cancer is based on empirical data from the literature (clinical studies) or based on the expertise of the clinician. Recently microarray technology emerged and it has the potential to revolutionize the clinical management of cancer and other diseases. A microarray allows to measure the expression levels of thousands of genes simultaneously which may reflect diagnostic or prognostic categories and sensitivity to treatment. The objective of this paper is to investigate whether clinical data, which is the basis of day-to-day clinical decision support, can be efficiently combined with microarray data, which has yet to prove its potential to deliver patient tailored therapy, using Least Squares Support Vector Machines.
View details for Web of Science ID 000253467004088
View details for PubMedID 18003232
- Molecular profiling of platinum resistant ovarian cancer: Use of the model in clinical practice INTERNATIONAL JOURNAL OF CANCER 2006; 119 (6): 1511-1511
Predicting the prognosis of breast cancer by integrating clinical and microarray data with Bayesian networks
2006; 22 (14): E184-E190
Clinical data, such as patient history, laboratory analysis, ultrasound parameters--which are the basis of day-to-day clinical decision support--are often underused to guide the clinical management of cancer in the presence of microarray data. We propose a strategy based on Bayesian networks to treat clinical and microarray data on an equal footing. The main advantage of this probabilistic model is that it allows to integrate these data sources in several ways and that it allows to investigate and understand the model structure and parameters. Furthermore using the concept of a Markov Blanket we can identify all the variables that shield off the class variable from the influence of the remaining network. Therefore Bayesian networks automatically perform feature selection by identifying the (in)dependency relationships with the class variable. We evaluated three methods for integrating clinical and microarray data: decision integration, partial integration and full integration and used them to classify publicly available data on breast cancer patients into a poor and a good prognosis group. The partial integration method is most promising and has an independent test set area under the ROC curve of 0.845. After choosing an operating point the classification performance is better than frequently used indices.
View details for DOI 10.1093/bioinformatics/btl230
View details for Web of Science ID 000250005000023
View details for PubMedID 16873470
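The Markov blanket mentioned in the abstract above has a standard graph definition: for a node in a directed acyclic graph it consists of the node's parents, its children, and the other parents of its children. The sketch below computes it for a small network. The network structure and variable names are hypothetical and are not taken from the paper; only the definition itself is standard.

```python
def markov_blanket(parents, target):
    """Return the Markov blanket of `target` in a DAG given as {node: set of parent nodes}."""
    # Children are the nodes that list the target among their parents.
    children = {node for node, ps in parents.items() if target in ps}
    # Spouses are the other parents of those children.
    spouses = {p for child in children for p in parents[child]} - {target}
    return set(parents.get(target, set())) | children | spouses

# Hypothetical network; node names are illustrative only.
network = {
    "prognosis": {"age"},
    "age": set(),
    "gene_A": {"prognosis", "grade"},
    "grade": set(),
    "gene_B": {"gene_A"},
    "gene_C": set(),
}

blanket = markov_blanket(network, "prognosis")
```

In this toy network the blanket of `prognosis` contains `age`, `gene_A` and `grade`. The variables `gene_B` and `gene_C` fall outside it, which is the sense in which a Bayesian network performs feature selection for the class variable.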
Predicting the outcome of pregnancies of unknown location: Bayesian networks with expert prior information compared to logistic regression
2006; 21 (7): 1824-1831
As women present at earlier gestations to early pregnancy units (EPUs), the number of women diagnosed with a pregnancy of unknown location (PUL) increases. Some of these women will have an ectopic pregnancy (EP), and it is this group in the PUL population that poses the greatest concern. The aim of this study was to develop Bayesian networks to predict EPs in the PUL population. Data were gathered in a single EPU from all women with a PUL. This data set was divided into a model-building (599 women with 44 EPs) and a validation (257 women with 22 EPs) data set and consisted of the following variables: vaginal bleeding, fluid in the pouch of Douglas, midline echo, lower abdominal pain, age, endometrial thickness, gestation days, the ratio of HCG at 48 and 0 h, progesterone levels (0 and 48 h) and the clinical outcome of the PUL. We developed Bayesian networks with expert information using this data set to predict EPs. The best Bayesian network used the gestational age, HCG ratio and the progesterone level at 48 h and had an area under the receiver operator characteristic curve (AUC) of 0.88 for predicting EPs when tested prospectively. Discrete-valued Bayesian networks are more complex to build than, for example, logistic regression. Nevertheless, we have demonstrated that such models can be used to predict EPs in a PUL population. Prospective interventional multicentre studies are needed to validate the use of such models in clinical practice.
View details for DOI 10.1093/humrep/del083
View details for Web of Science ID 000238907400027
View details for PubMedID 16601010
Diagnostic accuracy of varying discriminatory zones for the prediction of ectopic pregnancy in women with a pregnancy of unknown location
ULTRASOUND IN OBSTETRICS & GYNECOLOGY
2005; 26 (7): 770-775
Various serum human chorionic gonadotropin (hCG) discriminatory zones are currently used for evaluating the likelihood of an ectopic pregnancy in women classified as having a pregnancy of unknown location (PUL) following a transvaginal ultrasound examination. We evaluated the diagnostic accuracy of discriminatory zones for serum hCG levels of > 1000 IU/L, 1500 IU/L and 2000 IU/L for the detection of ectopic pregnancy in such women. This was a prospective observational study of women who were assessed in a specialized transvaginal scanning unit. All women with a PUL had serum hCG measured at presentation. Expectant management of PULs was adopted. These women were followed up with transvaginal ultrasound, monitoring of serum hormone levels and laparoscopy until a final diagnosis was established: a failing PUL, an intrauterine pregnancy (IUP), an ectopic pregnancy or a persisting PUL. The persisting PULs probably represented ectopic pregnancies which had been missed on ultrasound and these were incorporated into the ectopic pregnancy group. Three different discriminatory zones (1000 IU/L, 1500 IU/L and 2000 IU/L) were evaluated for predicting ectopic pregnancy in this PUL population. A total of 5544 consecutive women presented to the early pregnancy unit between 25 June 2001 and 14 April 2003. Of these, 569 (10.3%) women were classified as having a PUL, 42 of which were lost to follow up. Of the 527 (9.5%) cases with PUL analyzed, there were 300 (56.9%) failing PULs, 181 (34.3%) IUPs and 46 (8.7%) ectopic pregnancies. Overall, 74.6% were symptomatic and 25.4% were asymptomatic (P = 8.825E-07).
The sensitivity and specificity of an hCG level of > 1000 IU/L to detect ectopic pregnancy were 21.7% (10/46) and 87.3% (420/481), respectively; for an hCG level of > 1500 IU/L these values were 15.2% (7/46) and 93.4% (449/481), respectively, and for an hCG level of > 2000 IU/L they were 10.9% (5/46) and 95.2% (458/481), respectively. Varying the discriminatory zone does not significantly improve the detection of ectopic pregnancy in a PUL population. A single measurement of serum hCG is not only potentially falsely reassuring but also unhelpful in excluding the presence of an ectopic pregnancy.
View details for DOI 10.1002/uog.2636
View details for Web of Science ID 000234027800015
View details for PubMedID 16308901
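The sensitivity and specificity figures in the abstract above follow directly from the reported counts: 46 ectopic pregnancies and 481 other outcomes (300 failing PULs plus 181 IUPs). The short sketch below recomputes them as a worked example. The variable and function names are illustrative, and recomputed values may differ from the published ones in the last digit because of rounding.

```python
ECTOPIC = 46        # ectopic pregnancies among the 527 analyzed PULs
NON_ECTOPIC = 481   # failing PULs (300) plus intrauterine pregnancies (181)

# hCG threshold (IU/L) -> (ectopics above threshold, non-ectopics at or below threshold),
# using the counts given in the abstract.
counts = {1000: (10, 420), 1500: (7, 449), 2000: (5, 458)}

def sensitivity(true_positives, positives):
    """Percentage of ectopic pregnancies flagged by the threshold."""
    return 100.0 * true_positives / positives

def specificity(true_negatives, negatives):
    """Percentage of non-ectopic outcomes correctly not flagged."""
    return 100.0 * true_negatives / negatives

results = {
    threshold: (sensitivity(tp, ECTOPIC), specificity(tn, NON_ECTOPIC))
    for threshold, (tp, tn) in counts.items()
}

for threshold, (sens, spec) in sorted(results.items()):
    print(f"> {threshold} IU/L: sensitivity {sens:.1f}%, specificity {spec:.1f}%")
```

Raising the threshold trades sensitivity for specificity, and sensitivity stays below 22% at every threshold, which is consistent with the abstract's conclusion that a single hCG measurement does little to exclude ectopic pregnancy.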