All Publications


  • Red teaming ChatGPT in medicine to yield real-world insights on model behavior. NPJ digital medicine. Chang, C. T., Farah, H., Gui, H., Rezaei, S. J., Bou-Khalil, C., Park, Y. J., Swaminathan, A., Omiye, J. A., Kolluri, A., Chaurasia, A., Lozano, A., Heiman, A., Jia, A. S., Kaushal, A., Jia, A., Iacovelli, A., Yang, A., Salles, A., Singhal, A., Narasimhan, B., Belai, B., Jacobson, B. H., Li, B., Poe, C. H., Sanghera, C., Zheng, C., Messer, C., Kettud, D. V., Pandya, D., Kaur, D., Hla, D., Dindoust, D., Moehrle, D., Ross, D., Chou, E., Lin, E., Haredasht, F. N., Cheng, G., Gao, I., Chang, J., Silberg, J., Fries, J. A., Xu, J., Jamison, J., Tamaresis, J. S., Chen, J. H., Lazaro, J., Banda, J. M., Lee, J. J., Matthys, K. E., Steffner, K. R., Tian, L., Pegolotti, L., Srinivasan, M., Manimaran, M., Schwede, M., Zhang, M., Nguyen, M., Fathzadeh, M., Zhao, Q., Bajra, R., Khurana, R., Azam, R., Bartlett, R., Truong, S. T., Fleming, S. L., Raj, S., Behr, S., Onyeka, S., Muppidi, S., Bandali, T., Eulalio, T. Y., Chen, W., Zhou, X., Ding, Y., Cui, Y., Tan, Y., Liu, Y., Shah, N., Daneshjou, R. 2025; 8 (1): 149

    Abstract

    Red teaming, the practice of adversarially exposing unexpected or undesired model behaviors, is critical to improving the equity and accuracy of large language models, but red teaming by parties unaffiliated with model creators is scant in healthcare. We convened teams of clinicians, medical and engineering students, and technical professionals (80 participants total) to stress-test models with real-world clinical cases and categorize inappropriate responses along axes of safety, privacy, hallucinations/accuracy, and bias. Six medically trained reviewers re-analyzed prompt-response pairs and added qualitative annotations. Of 376 unique prompts (1504 responses), 20.1% were inappropriate (GPT-3.5: 25.8%; GPT-4.0: 16%; GPT-4.0 with Internet: 17.8%). Subsequently, we show the utility of our benchmark by testing GPT-4o, a model released after our event (20.4% inappropriate). Of the responses that were appropriate with GPT-3.5, 21.5% were inappropriate in updated models. We share insights for constructing red teaming prompts and present our benchmark for iterative model assessments.

    DOI: 10.1038/s41746-025-01542-0

    PubMed ID: 40055532

    PubMed Central ID: 10564921

  • Unveiling the genetic landscape of hereditary melanoma: From susceptibility to surveillance. Cancer treatment and research communications. Zheng, C., Sarin, K. Y. 2024; 40: 100837

    Abstract

    The multifactorial etiology underlying melanoma development involves an array of genetic, phenotypic, and environmental factors. Genetic predisposition for melanoma is further influenced by the complex interplay between high-, medium-, and low-penetrance genes, each contributing to varying degrees of susceptibility. Within this network, high-penetrance genes, including CDKN2A, CDK4, BAP1, and POT1, are linked to a pronounced risk for disease, whereas medium- and low-penetrance genes, such as MC1R, MITF, and others, contribute only moderately to melanoma risk. Notably, these genetic factors not only heighten the risk of melanoma but may also increase susceptibility towards internal malignancies, such as pancreatic cancer, renal cell cancer, or neural tumors. Genetic testing and counseling hold paramount importance in the clinical context of suspected hereditary melanoma, facilitating risk assessment, personalized surveillance strategies, and informed decision-making. As our understanding of the genomic landscape deepens, this review paper aims to comprehensively summarize the genetic underpinnings of hereditary melanoma, as well as current screening and management strategies for the disease.

    DOI: 10.1016/j.ctarc.2024.100837

    PubMed ID: 39137473