Selen Bozkurt is a postdoctoral scholar at Stanford University in the Department of Biomedical Data Science and the Center for Biomedical Informatics Research. Her research focuses on health informatics using electronic health records, machine learning, and natural language processing. She has also worked as a biostatistician on several projects. She has been a member of the RSNA Radiology Reporting Committee since 2009. Her PhD dissertation, "A Real Time Decision Support System for Mammography Interpretations," developed an automated system for deep information extraction from mammography reports and an approach for real-time decision support driven by analysis of dictated radiology reports.

Professional Education

  • PhD, Akdeniz University, Faculty of Medicine, Biostatistics and Medical Informatics
  • Visiting PhD Student, Stanford University, Biomedical Informatics
  • MSc, Akdeniz University, Faculty of Medicine, Biostatistics and Medical Informatics
  • BSc, Dokuz Eylul University, Statistics

All Publications

  • Automatic inference of BI-RADS final assessment categories from narrative mammography report findings Journal of Biomedical Informatics Banerjee, I., Bozkurt, S., Alkim, E., Sagreiya, H., Kurian, A. W., Rubin, D. L. 2019
  • Impact of age on intermittent hypoxia in obstructive sleep apnea: a propensity-matched analysis SLEEP AND BREATHING Bostanci, A., Bozkurt, S., Turhan, M. 2018; 22 (2): 317–22


    To determine independent relationship of aging with chronic intermittent hypoxia, we compared hypoxia-related polysomnographic variables of geriatric patients (aged ≥ 65 years) with an apnea-hypopnea index (AHI)-, gender-, body mass index (BMI)-, and neck circumference-matched cohort of non-geriatric patients. The study was conducted using clinical and polysomnographic data of 1280 consecutive patients who underwent complete polysomnographic evaluation for suspected sleep-disordered breathing (SDB) at a single sleep disorder center. A propensity score-matched analysis was performed to obtain matched cohorts of geriatric and non-geriatric patients, which resulted in successful matching of 168 patients from each group. Study groups were comparable for gender (P = 0.999), BMI (P = 0.940), neck circumference (P = 0.969), AHI (P = 0.935), and severity of SDB (P = 0.089). The oximetric variables representing the duration of chronic intermittent hypoxia such as mean (P = 0.001), the longest (P = 0.001) and total apnea durations (P = 0.003), mean (P = 0.001) and the longest hypopnea durations (P = 0.001), and total sleep time with oxygen saturation below 90% (P = 0.008) were significantly higher in the geriatric patients as compared with younger adults. Geriatric patients had significantly lower minimum (P = 0.013) and mean oxygen saturation (P = 0.001) than non-geriatric patients. The study provides evidence that elderly patients exhibit more severe and deeper nocturnal intermittent hypoxia than the younger adults, independent of severity of obstructive sleep apnea, BMI, gender, and neck circumference. Hypoxia-related polysomnographic variables in geriatric patients may in fact reflect a physiological aging process rather than the severity of a SDB.

    View details for DOI 10.1007/s11325-017-1560-z

    View details for Web of Science ID 000430993000006

    View details for PubMedID 28849299

  • Distribution of global health measures from routinely collected PROMIS surveys in patients with breast cancer or prostate cancer. Cancer Seneviratne, M. G., Bozkurt, S., Patel, M. I., Seto, T., Brooks, J. D., Blayney, D. W., Kurian, A. W., Hernandez-Boussard, T. 2018


    The collection of patient-reported outcomes (PROs) is an emerging priority internationally, guiding clinical care, quality improvement projects and research studies. After the deployment of Patient-Reported Outcomes Measurement Information System (PROMIS) surveys in routine outpatient workflows at an academic cancer center, electronic health record data were used to evaluate survey completion rates and self-reported global health measures across 2 tumor types: breast and prostate cancer. This study retrospectively analyzed 11,657 PROMIS surveys from patients with breast cancer and 4411 surveys from patients with prostate cancer, and it calculated survey completion rates and global physical health (GPH) and global mental health (GMH) scores between 2013 and 2018. A total of 36.6% of eligible patients with breast cancer and 23.7% of patients with prostate cancer completed at least 1 survey, with completion rates lower among black patients for both tumor types (P < .05). The mean T scores (calibrated to a general population mean of 50) for GPH were 48.4 ± 9 for breast cancer and 50.6 ± 9 for prostate cancer, and the GMH scores were 52.7 ± 8 and 52.1 ± 9, respectively. GPH and GMH were frequently lower among ethnic minorities, patients without private health insurance, and those with advanced disease. This analysis provides important baseline data on patient-reported global health in breast and prostate cancer. Demonstrating that PROs can be integrated into clinical workflows, this study shows that supportive efforts may be needed to improve PRO collection and global health endpoints in vulnerable populations.

    View details for DOI 10.1002/cncr.31895

    View details for PubMedID 30512191

  • An Automated Feature Engineering for Digital Rectal Examination Documentation using Natural Language Processing. AMIA ... Annual Symposium proceedings. AMIA Symposium Bozkurt, S., Park, J. I., Kan, K. M., Ferrari, M., Rubin, D. L., Brooks, J. D., Hernandez-Boussard, T. 2018; 2018: 288–94


    Digital rectal examination (DRE) is considered a quality metric for prostate cancer care. However, much of the DRE related rich information is documented as free-text in clinical narratives. Therefore, we aimed to develop a natural language processing (NLP) pipeline for automatic documentation of DRE in clinical notes using a domain-specific dictionary created by clinical experts and an extended version of the same dictionary learned by clinical notes using distributional semantics algorithms. The proposed pipeline was compared to a baseline NLP algorithm and the results of the proposed pipeline were found superior in terms of precision (0.95) and recall (0.90) for documentation of DRE. We believe the rule-based NLP pipeline enriched with terms learned from the whole corpus can provide accurate and efficient identification of this quality metric.

    View details for PubMedID 30815067

  • Impact of coexistent adenomyosis on outcomes of patients with endometrioid endometrial cancer: a propensity score-matched analysis TUMORI J Aydin, H., Toptas, T., Bozkurt, S., Pestereli, E., Simsek, T. 2018; 104 (1): 60–65


    Despite the common occurrence of adenomyosis in endometrial cancer (EC), there is a paucity and conflict in the literature regarding its impact on outcomes of patients. We sought to compare outcomes of patients with endometrioid type EC with or without adenomyosis. A total of 314 patients were included in the analysis. Patients were divided into 2 groups according to the presence or absence of adenomyosis. Adenomyosis was identified in 79 patients (25.1%). A propensity score-matched comparison (1:1) was carried out to minimize selection biases. The propensity score was developed through multivariable logistic regression model including age, stage, and tumor grade as covariates. After performing propensity score matching, 70 patients from each group were successfully matched. Primary outcome of the study was disease-free survival (DFS), and the secondary outcomes were overall survival (OS) and disease-specific survival (DSS). Median follow-up time was 61 months for the adenomyosis positive group and 76 months for the adenomyosis negative group. There were no statistically significant differences in 3- and 5-year DFS, OS, and DSS rates between the 2 groups. Five-year DFS was 92% vs 88% (hazard ratio [HR] 1.54 [0.56-4.27]; p = 0.404), 5-year OS was 94% vs 92% (HR 1.60 [0.49-5.26]; p = 0.441), and 5-year DSS was 94% vs 96% (HR 2.51 [0.46-13.71]; p = 0.290) for patients with and without adenomyosis, respectively. Coexistent adenomyosis in EC is not a prognostic factor and does not impact survival outcomes.

    View details for DOI 10.5301/tj.5000698

    View details for Web of Science ID 000434682400009

    View details for PubMedID 29192745

  • Expanding a radiology lexicon using contextual patterns in radiology reports. Journal of the American Medical Informatics Association : JAMIA Percha, B., Zhang, Y., Bozkurt, S., Rubin, D., Altman, R. B., Langlotz, C. P. 2018


    Distributional semantics algorithms, which learn vector space representations of words and phrases from large corpora, identify related terms based on contextual usage patterns. We hypothesize that distributional semantics can speed up lexicon expansion in a clinical domain, radiology, by unearthing synonyms from the corpus. We apply word2vec, a distributional semantics software package, to the text of radiology notes to identify synonyms for RadLex, a structured lexicon of radiology terms. We stratify performance by term category, term frequency, number of tokens in the term, vector magnitude, and the context window used in vector building. Ranking candidates based on distributional similarity to a target term results in high curation efficiency: on a ranked list of 775 249 terms, >50% of synonyms occurred within the first 25 terms. Synonyms are easier to find if the target term is a phrase rather than a single word, if it occurs at least 100× in the corpus, and if its vector magnitude is between 4 and 5. Some RadLex categories, such as anatomical substances, are easier to identify synonyms for than others. The unstructured text of clinical notes contains a wealth of information about human diseases and treatment patterns. However, searching and retrieving information from clinical notes often suffer due to variations in how similar concepts are described in the text. Biomedical lexicons address this challenge, but are expensive to produce and maintain. Distributional semantics algorithms can assist lexicon curation, saving researchers time and money.

    View details for DOI 10.1093/jamia/ocx152

    View details for PubMedID 29329435
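
The ranking step described in this abstract (ordering candidate terms by distributional similarity to a target term) can be illustrated with a minimal sketch. It assumes word vectors have already been learned; the three-dimensional vectors and the terms below are hand-made placeholders, not RadLex entries or word2vec output, and cosine similarity is used here as one common similarity measure rather than as the paper's confirmed choice.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def rank_candidates(target, vectors):
    """Return (term, similarity) pairs for all other terms, most similar first."""
    target_vec = vectors[target]
    scored = [
        (term, cosine(target_vec, vec))
        for term, vec in vectors.items()
        if term != target
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors, for illustration only.
toy_vectors = {
    "mass": [0.9, 0.1, 0.2],
    "lesion": [0.8, 0.2, 0.3],
    "nodule": [0.7, 0.1, 0.4],
    "contrast": [0.1, 0.9, 0.2],
}

ranked = rank_candidates("mass", toy_vectors)
for term, score in ranked:
    print(f"{term}: {score:.3f}")
```

A curator would then review the top of such a list first, which is the efficiency gain the abstract reports.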

  • Can Statistical Machine Learning Algorithms Help for Classification of Obstructive Sleep Apnea Severity to Optimal Utilization of Polysomnography Resources? Methods of information in medicine Bozkurt, S., Bostanci, A., Turhan, M. 2017; 56 (4)


    The goal of this study is to evaluate the results of machine learning methods for the classification of OSA severity of patients with suspected sleep-disordered breathing as normal, mild, moderate and severe based on non-polysomnographic variables: 1) clinical data, 2) symptoms and 3) physical examination. In order to produce classification models for OSA severity, five different machine learning methods (Bayesian network, Decision Tree, Random Forest, Neural Networks and Logistic Regression) were trained while relevant variables and their relationships were derived empirically from observed data. Each model was trained and evaluated using 10-fold cross-validation. To evaluate the classification performance of all methods, true positive rate (TPR), false positive rate (FPR), Positive Predictive Value (PPV), F measure and Area Under Receiver Operating Characteristics curve (ROC-AUC) were used. Results of 10-fold cross-validated tests with different variable settings indicated that the OSA severity of suspected OSA patients can be classified, using non-polysomnographic features, with a highest true positive rate of 0.71 and a lowest false positive rate of 0.15. Moreover, the test results of different variable settings revealed that the accuracy of the classification models was significantly improved when physical examination variables were added to the model. Study results showed that machine learning methods can be used to estimate the probabilities of no, mild, moderate, and severe obstructive sleep apnea, and that such approaches may improve the accuracy of initial OSA screening and help refer only the suspected moderate or severe OSA patients to sleep laboratories for the expensive tests.

    View details for DOI 10.3414/ME16-01-0084

    View details for PubMedID 28590499

  • Usability Study of RSNA Radiology Reporting Template Library. Studies in health technology and informatics Hong, Y., Zhu, Y., Bozkurt, S., Zhang, J., Kahn, C. E. 2017; 245: 1325


    This study provides insights that could help to improve the Radiological Society of North America (RSNA) Reporting Template Digital Library, based on a usability evaluation. The results show that most users have been satisfied with the website. The general comments for the library are positive, although the participants suggested quite a few areas to improve. About 40% are returning visitors which means people often come back to the website.

    View details for PubMedID 29295406

  • Estimation of cardiovascular disease from polysomnographic parameters in sleep-disordered breathing EUROPEAN ARCHIVES OF OTO-RHINO-LARYNGOLOGY Turhan, M., Bostanci, A., Bozkurt, S. 2016; 273 (12): 4585-4593


    We aimed to illustrate the causal relationships between cardiovascular diseases (CVDs) and various polysomnographic variables, and to develop a CVD estimation model from these variables in a population referred for assessment of possible sleep-disordered breathing (SDB). Clinical and polysomnographic data of 1162 consecutive patients with suspected SDB whose comorbidity status was known, were reviewed, retrospectively. Variable selection was performed in two steps using univariate analysis and tenfold cross validation information gain analysis. The resulting set of variables with an average merit value (m) of >0.005 was considered to be causal factors contributing to the CVDs, and used in Bayesian network models for providing estimations. Of the 1162 patients, 234 had CVDs (20.1 %). In total, 28 parameters were evaluated for variable selection. Of those, 19 were found to be associated with CVDs. Age was the most effective attribute in estimating CVD (m = 0.051), followed by total sleep time with oxygen saturation <90 % (m = 0.021). Some other important variables were apnea-hypopnea index during non-rapid eye movement (m = 0.018), lowest oxygen saturation (m = 0.018), body mass index (m = 0.016), total apnea duration (m = 0.014), mean apnea duration (m = 0.014), longest apnea duration (m = 0.013), and severity of SDB (m = 0.012). The modeling process resulted in a final model, with 76.9 % sensitivity, 96.2 % specificity, and 92.6 % negative predictive value, consisting of all selected variables. The study provides evidence that the estimation of CVDs from polysomnographic parameters is possible with high predictive performance using Bayesian network analysis.

    View details for DOI 10.1007/s00405-016-4176-1

    View details for Web of Science ID 000387700400066

    View details for PubMedID 27363409

  • Using automatically extracted information from mammography reports for decision-support. Journal of biomedical informatics Bozkurt, S., Gimenez, F., Burnside, E. S., Gulkesen, K. H., Rubin, D. L. 2016; 62: 224-231


    To evaluate a system we developed that connects natural language processing (NLP) for information extraction from narrative text mammography reports with a Bayesian network for decision-support about breast cancer diagnosis. The ultimate goal of this system is to provide decision support as part of the workflow of producing the radiology report. We built a system that uses an NLP information extraction system (which extracts BI-RADS descriptors and clinical information from mammography reports) to provide the necessary inputs to a Bayesian network (BN) decision support system (DSS) that estimates lesion malignancy from BI-RADS descriptors. We used this integrated system to predict diagnosis of breast cancer from radiology text reports and evaluated it with a reference standard of 300 mammography reports. We collected two different outputs from the DSS: (1) the probability of malignancy and (2) the BI-RADS final assessment category. Since NLP may produce imperfect inputs to the DSS, we compared the difference between using perfect ("reference standard") structured inputs to the DSS ("RS-DSS") vs NLP-derived inputs ("NLP-DSS") on the output of the DSS using the concordance correlation coefficient. We measured the classification accuracy of the BI-RADS final assessment category when using NLP-DSS, compared with the ground truth category established by the radiologist. The NLP-DSS and RS-DSS had closely matched probabilities, with a mean paired difference of 0.004±0.025. The concordance correlation of these paired measures was 0.95. The accuracy of the NLP-DSS to predict the correct BI-RADS final assessment category was 97.58%. The accuracy of the information extracted from mammography reports using the NLP system was sufficient to provide accurate DSS results. We believe our system could ultimately reduce the variation in practice in mammography related to assessment of malignant lesions and improve management decisions.

    View details for DOI 10.1016/j.jbi.2016.07.001

    View details for PubMedID 27388877

  • Automatic abstraction of imaging observations with their characteristics from mammography reports. Journal of the American Medical Informatics Association Bozkurt, S., Lipson, J. A., Senol, U., Rubin, D. L., Bulu, H. 2015; 22 (e1): e81-92


    Radiology reports are usually narrative, unstructured text, a format which hinders the ability to input report contents into decision support systems. In addition, reports often describe multiple lesions, and it is challenging to automatically extract information on each lesion and its relationships to characteristics, anatomic locations, and other information that describes it. The goal of our work is to develop natural language processing (NLP) methods to recognize each lesion in free-text mammography reports and to extract its corresponding relationships, producing a complete information frame for each lesion. We built an NLP information extraction pipeline in the General Architecture for Text Engineering (GATE) NLP toolkit. Sequential processing modules are executed, producing an output information frame required for a mammography decision support system. Each lesion described in the report is identified by linking it with its anatomic location in the breast. In order to evaluate our system, we selected 300 mammography reports from a hospital report database. The gold standard contained 797 lesions, and our system detected 815 lesions (780 true positives, 35 false positives, and 17 false negatives). The precision of detecting all the imaging observations with their modifiers was 94.9, recall was 90.9, and the F measure was 92.8. Our NLP system extracts each imaging observation and its characteristics from mammography reports. Although our application focuses on the domain of mammography, we believe our approach can generalize to other domains and may narrow the gap between unstructured clinical report text and structured information extraction needed for data mining and decision support.

    View details for DOI 10.1136/amiajnl-2014-003009

    View details for PubMedID 25352567
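
For readers unfamiliar with the evaluation measures quoted in this abstract, the following minimal sketch shows how precision, recall, and the F measure follow from true-positive, false-positive, and false-negative counts. It uses the lesion-level detection counts stated above (780, 35, 17); the function name is an illustrative choice. Note that the resulting lesion-level values (roughly 0.96, 0.98, and 0.97) are not the same as the 94.9 / 90.9 / 92.8 figures, which the abstract reports for imaging observations together with their modifiers.

```python
def detection_metrics(tp, fp, fn):
    """Compute precision, recall, and balanced F measure from raw counts."""
    precision = tp / (tp + fp)   # share of detected items that are correct
    recall = tp / (tp + fn)      # share of gold-standard items that were found
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Lesion-level counts quoted in the abstract.
precision, recall, f_measure = detection_metrics(tp=780, fp=35, fn=17)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```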

  • Automated detection of ambiguity in BI-RADS assessment categories in mammography reports. Studies in health technology and informatics Bozkurt, S., Rubin, D. 2014; 197: 35-39


    An unsolved challenge in biomedical natural language processing (NLP) is detecting ambiguities in the reports that can help physicians to improve report clarity. Our goal was to develop NLP methods to tackle the challenges of identifying ambiguous descriptions of the laterality of BI-RADS Final Assessment Categories in mammography radiology reports. We developed a text processing system that uses a BI-RADS ontology we built as a knowledge source for automatic annotation of the entities in mammography reports relevant to this problem. We used the GATE NLP toolkit and developed customized processing resources for report segmentation, named entity recognition, and detection of mismatches between BI-RADS Final Assessment Categories and mammogram laterality. Our system detected 55 mismatched cases in 190 reports and the accuracy rate was 81%. We conclude that such NLP techniques can detect ambiguities in mammography reports and may reduce discrepancy and variability in reporting.

    View details for PubMedID 24743074

  • Annotation for Information Extraction from Mammography Reports INFORMATICS, MANAGEMENT AND TECHNOLOGY IN HEALTHCARE Bozkurt, S., Gulkesen, K. H., Rubin, D. 2013; 190: 183-185


    Inter- and intra-observer variability in mammographic interpretation is a challenging problem, and decision support systems (DSS) may be helpful to reduce variation in practice. Since radiology reports are created as unstructured text reports, natural language processing (NLP) techniques are needed to extract structured information from reports in order to provide the inputs to DSS. Before creating NLP systems, producing a high-quality annotated data set is essential. The goal of this project is to develop an annotation schema to guide the information extraction tasks needed from free-text mammography reports.

    View details for DOI 10.3233/978-1-61499-276-9-183

    View details for Web of Science ID 000341032900053

    View details for PubMedID 23823416

  • An Open-Standards Grammar for Outline-Style Radiology Report Templates JOURNAL OF DIGITAL IMAGING Bozkurt, S., Kahn, C. E. 2012; 25 (3): 359-364


    Structured reporting uses consistent ordering of results and standardized terminology to improve the quality and reduce the complexity of radiology reports. We sought to define a generalized approach for radiology reporting that produces flexible outline-style reports, accommodates structured information and named reporting elements, allows reporting terms to be linked to controlled vocabularies, uses existing informatics standards, and allows structured report data to be extracted readily. We applied the Regular Language for XML-Next Generation (RELAX NG) schema language to create templates for 110 reporting templates created as part of the Radiological Society of North America reporting initiative. We evaluated how well this approach addressed the project's goals. The RELAX NG schema language expressed the cardinality and hierarchical relationships of reporting concepts, and allowed reporting elements to be mapped to terms in controlled medical vocabularies, such as RadLex®, Systematized Nomenclature of Medicine Clinical Terms®, and Logical Observation Identifiers Names and Codes®. The approach provided extensibility and accommodated the addition of new features. Overall, the approach has proven to be useful and will form the basis for a supplement to the Digital Imaging and Communication in Medicine Standard.

    View details for DOI 10.1007/s10278-012-9456-8

    View details for Web of Science ID 000304109700007

    View details for PubMedID 22258732

    View details for PubMedCentralID PMC3348985