Dr. Youssef is a postdoctoral fellow in the Department of Radiology at Stanford University School of Medicine. She received her Doctor of Philosophy degree in Data Science and Population Health from the Institute of Medical Science, University of Toronto, Canada in 2021. Her research addresses ethical considerations in AI development, aiming to promote responsible use of AI in healthcare. Using mixed-methods approaches, she investigates the end-user experience with AI systems, identifying ethical and safety concerns related to integrating AI into clinical workflows. Dr. Youssef leads several AI educational programs and policy initiatives. She co-directs the Stanford AIMI High School Programs, preparing the next generation for careers at the intersection of AI and medicine. She also serves on several AI policy and education committees across the Stanford School of Medicine.


Honors & Awards

  • 2022 Stanford Postdoc JEDI Champion Award, Office of Postdoctoral Affairs - Stanford University (09/20/22)

Boards, Advisory Committees, Professional Organizations

  • Post-doctoral Representative, Committee for Research - Faculty Senate, Stanford University (2021 - 2022)

All Publications

  • Perceptions of Data Set Experts on Important Characteristics of Health Data Sets Ready for Machine Learning: A Qualitative Study. JAMA network open Ng, M. Y., Youssef, A., Miner, A. S., Sarellano, D., Long, J., Larson, D. B., Hernandez-Boussard, T., Langlotz, C. P. 2023; 6 (12): e2345892


    The lack of data quality frameworks to guide the development of artificial intelligence (AI)-ready data sets limits their usefulness for machine learning (ML) research in health care and hinders the diagnostic excellence of developed clinical AI applications for patient care. To discern what constitutes high-quality and useful data sets for health and biomedical ML research purposes according to subject matter experts. This qualitative study interviewed data set experts, particularly those who are creators and ML researchers. Semistructured interviews were conducted in English and remotely through a secure video conferencing platform between August 23, 2022, and January 5, 2023. A total of 93 experts were invited to participate. Twenty experts were enrolled and interviewed. Using purposive sampling, experts were affiliated with a diverse representation of 16 health data sets/databases across organizational sectors. Content analysis was used to evaluate survey information and thematic analysis was used to analyze interview data. Data set experts' perceptions on what makes data sets AI ready. Participants included 20 data set experts (11 [55%] men; mean [SD] age, 42 [11] years), of whom all were health data set creators, and 18 of the 20 were also ML researchers. Themes (3 main and 11 subthemes) were identified and integrated into an AI-readiness framework to show their association within the health data ecosystem. Participants partially determined the AI readiness of data sets using priority appraisal elements of accuracy, completeness, consistency, and fitness. Ethical acquisition and societal impact emerged as appraisal considerations in that participant samples have not been described to date in prior data quality frameworks. Factors that drive creation of high-quality health data sets and mitigate risks associated with data reuse in ML research were also relevant to AI readiness. The state of data availability, data quality standards, documentation, team science, and incentivization were associated with elements of AI readiness and the overall perception of data set usefulness. In this qualitative study of data set experts, participants contributed to the development of a grounded framework for AI data set quality. Data set AI readiness required the concerted appraisal of many elements and the balancing of transparency and ethical reflection against pragmatic constraints. The movement toward more reliable, relevant, and ethical AI and ML applications for patient care will inevitably require strategic updates to data set creation practices.

    View details for DOI 10.1001/jamanetworkopen.2023.45892

    View details for PubMedID 38039004

  • Organizational Factors in Clinical Data Sharing for Artificial Intelligence in Health Care. JAMA network open Youssef, A., Ng, M. Y., Long, J., Hernandez-Boussard, T., Shah, N., Miner, A., Larson, D., Langlotz, C. P. 2023; 6 (12): e2348422


    Limited sharing of data sets that accurately represent disease and patient diversity limits the generalizability of artificial intelligence (AI) algorithms in health care. To explore the factors associated with organizational motivation to share health data for AI development. This qualitative study investigated organizational readiness for sharing health data across the academic, governmental, nonprofit, and private sectors. Using a multiple case studies approach, 27 semistructured interviews were conducted with leaders in data-sharing roles from August 29, 2022, to January 9, 2023. The interviews were conducted in the English language using a video conferencing platform. Using a purposive and nonprobabilistic sampling strategy, 78 individuals across 52 unique organizations were identified. Of these, 35 participants were enrolled. Participant recruitment concluded after 27 interviews, as theoretical saturation was reached and no additional themes emerged. Concepts defining organizational readiness for data sharing and the association between data-sharing factors and organizational behavior were mapped through iterative qualitative analysis to establish a framework defining organizational readiness for sharing clinical data for AI development. Interviews included 27 leaders from 18 organizations (academia: 10, government: 7, nonprofit: 8, and private: 2). Organizational readiness for data sharing centered around 2 main constructs: motivation and capabilities. Motivation related to the alignment of an organization's values with data-sharing priorities and was associated with its engagement in data-sharing efforts. However, organizational motivation could be modulated by extrinsic incentives for financial or reputational gains. Organizational capabilities comprised infrastructure, people, expertise, and access to data. Cross-sector collaboration was a key strategy to mitigate barriers to access health data. This qualitative study identified sector-specific factors that may affect the data-sharing behaviors of health organizations. External incentives may bolster cross-sector collaborations by helping overcome barriers to accessing health data for AI development. The findings suggest that tailored incentives may boost organizational motivation and facilitate sustainable flow of health data for AI development.

    View details for DOI 10.1001/jamanetworkopen.2023.48422

    View details for PubMedID 38113040

  • Inter-institutional data-driven education research: consensus values, principles, and recommendations to guide the ethical sharing of administrative education data in the Canadian medical education research context. Canadian medical education journal Grierson, L., Cavanagh, A., Youssef, A., Lee-Krueger, R., McNeill, K., Button, B., Kulasegaram, K. 2023; 14 (5): 113-120


    Background: Administrative data are generated when educating, licensing, and regulating future physicians but these data are rarely used beyond their pre-specified purposes. The capacity necessary for sensitive and responsive oversight that supports the sharing of administrative medical education data across institutions for research purposes needs to be developed. Method: A pan-Canadian consensus-building project was undertaken to develop agreement on the goals, benefits, risks, values, and principles that should underpin inter-institutional data-driven medical education research in Canada. A survey of key literature, consultations with various stakeholders and five successive knowledge synthesis workshops informed this project. Propositions were developed, driving subsequent discussions until collective agreement was distilled. Results: Consensus coalesced around six key principles: establishing clear purposes, rationale, and methodology for inter-institutional data-driven research a priori; informed consent from data generators in education systems is non-negotiable; multi-institutional data sharing requires special governance; data governance should be guided by data sovereignty; data use should be guided by an identified set of shared values; and best practices in research data-management should be applied. Conclusion: We recommend establishing a representative governance body, engaging trusted data facility, and adherence to extant data management policies when sharing administrative medical education data for research purposes in Canada.

    View details for DOI 10.36834/cmej.75874

    View details for PubMedID 38045068

  • The Importance of Understanding Language in Large Language Models. The American journal of bioethics : AJOB Youssef, A., Stein, S., Clapp, J., Magnus, D. 2023; 23 (10): 6-7

    View details for DOI 10.1080/15265161.2023.2256614

    View details for PubMedID 37812091

  • Is the Algorithm Good in a Bad World, or Has It Learned to be Bad? The Ethical Challenges of "Locked" Versus "Continuously Learning" and "Autonomous" Versus "Assistive" AI Tools in Healthcare. The American journal of bioethics : AJOB Youssef, A., Abramoff, M., Char, D. 2023; 23 (5): 43-45

    View details for DOI 10.1080/15265161.2023.2191052

    View details for PubMedID 37130390