Mohsen Bayati
Carl and Marilynn Thoma Professor in the Graduate School of Business and Professor, by courtesy, of Electrical Engineering
Operations, Information & Technology
Web page: http://web.stanford.edu/~bayati
Bio
https://web.stanford.edu/~bayati/bio.html
Academic Appointments
-
Professor, Operations, Information & Technology
-
Professor (By courtesy), Electrical Engineering
-
Member, Bio-X
-
Faculty Affiliate, Institute for Human-Centered Artificial Intelligence (HAI)
Honors & Awards
-
National Science Foundation CAREER Award, Stanford University (2016)
-
William Pierskalla best paper award, INFORMS Health Applications Society (2016)
-
Best paper award, INFORMS Applied Probability Society (2015)
-
William Pierskalla best paper award, INFORMS Health Applications Society (2014)
-
Gold Medal, International Mathematical Olympiad (1997)
Current Research and Scholarly Interests
https://web.stanford.edu/~bayati/pub-chron.html
2024-25 Courses
- Business Intelligence from Big Data
OIT 367 (Win)
- Data, Learning, and Decision-Making
OIT 604 (Aut)
Independent Studies (8)
- Biomedical Informatics Teaching Methods
BIOMEDIN 290 (Aut, Win, Spr, Sum)
- Directed Reading and Research
BIOMEDIN 299 (Aut, Win, Spr, Sum)
- Directed Reading in Education
EDUC 180 (Aut, Win, Spr, Sum)
- Doctoral Practicum in Research
OIT 699 (Aut, Win, Spr, Sum)
- Doctoral Practicum in Teaching
OIT 698 (Aut, Win, Spr, Sum)
- Individual Research
GSBGEN 390 (Aut, Win, Spr)
- Medical Scholars Research
BIOMEDIN 370 (Aut, Win, Spr, Sum)
- PhD Directed Reading
ACCT 691, FINANCE 691, MGTECON 691, MKTG 691, OB 691, OIT 691, POLECON 691 (Aut, Win, Spr, Sum)
Prior Year Courses
2023-24 Courses
- Business Intelligence from Big Data
OIT 367 (Win)
- Research in Operations, Information and Technology
OIT 644 (Win)
2022-23 Courses
- Business Intelligence from Big Data
OIT 367 (Win)
- Research in Operations, Information and Technology
OIT 644 (Win)
2021-22 Courses
- Business Intelligence from Big Data
OIT 367 (Win)
Stanford Advisees
-
Doctoral Dissertation Advisor (AC)
Yuwei Luo
-
Doctoral (Program)
Wassim Dhaouadi, William Overman, Marcos Serrano, Mohamad Sadegh Shirani Faradonbeh
Graduate and Fellowship Programs
-
Biomedical Data Science (PhD Program)
All Publications
-
Large language models for preventing medication direction errors in online pharmacies.
Nature medicine
2024
Abstract
Errors in pharmacy medication directions, such as incorrect instructions for dosage or frequency, can increase patient safety risk substantially by raising the chances of adverse drug events. This study explores how integrating domain knowledge with large language models (LLMs), which are capable of sophisticated text interpretation and generation, can reduce these errors. We introduce MEDIC (medication direction copilot), a system that emulates the reasoning of pharmacists by prioritizing precise communication of core clinical components of a prescription, such as dosage and frequency. It fine-tunes a first-generation LLM using 1,000 expert-annotated and augmented directions from Amazon Pharmacy to extract the core components and assembles them into complete directions using pharmacy logic and safety guardrails. We compared MEDIC against two LLM-based benchmarks: one leveraging 1.5 million medication directions and the other using state-of-the-art LLMs. On 1,200 expert-reviewed prescriptions, the two benchmarks respectively recorded 1.51 (confidence interval (CI) 1.03, 2.31) and 4.38 (CI 3.13, 6.64) times more near-miss events (errors caught and corrected before reaching the patient) than MEDIC. Additionally, we tested MEDIC by deploying within the production system of an online pharmacy, and during this experimental period, it reduced near-miss events by 33% (CI 26%, 40%). This study shows that LLMs, with domain expertise and safeguards, improve the accuracy and efficiency of pharmacy operations.
View details for DOI 10.1038/s41591-024-02933-8
View details for PubMedID 38664535
-
Predicting Primary Care Physician Burnout From Electronic Health Record Use Measures.
Mayo Clinic proceedings
2024
Abstract
To evaluate the ability of routinely collected electronic health record (EHR) use measures to predict clinical work units at increased risk of burnout and potentially most in need of targeted interventions. In this observational study of primary care physicians, we compiled clinical workload and EHR efficiency measures, then linked these measures to 2 years of well-being surveys (using the Stanford Professional Fulfillment Index) conducted from April 1, 2019, through October 16, 2020. Physicians were grouped into training and confirmation data sets to develop predictive models for burnout. We used gradient boosting classifier and other prediction modeling algorithms to quantify the predictive performance by the area under the receiver operating characteristics curve (AUC). Of 278 invited physicians from across 60 clinics, 233 (84%) completed 396 surveys. Physicians were 67% women with a median age category of 45 to 49 years. Aggregate burnout score was in the high range (≥3.325/10) on 111 of 396 (28%) surveys. Gradient boosting classifier of EHR use measures to predict burnout achieved an AUC of 0.59 (95% CI, 0.48 to 0.77) and an area under the precision-recall curve of 0.29 (95% CI, 0.20 to 0.66). Other models' confirmation set AUCs ranged from 0.56 (random forest) to 0.66 (penalized linear regression followed by dichotomization). Among the most predictive features were physician age, team member contributions to notes, and orders placed with user-defined preferences. Clinic-level aggregate measures identified the top quartile of clinics with 56% sensitivity and 85% specificity. In a sample of primary care physicians, routinely collected EHR use measures demonstrated limited ability to predict individual burnout and moderate ability to identify high-risk clinics.
View details for DOI 10.1016/j.mayocp.2024.01.005
View details for PubMedID 38573301
-
Optimal Experimental Design for Staggered Rollouts
MANAGEMENT SCIENCE
2023
View details for DOI 10.1287/mnsc.2023.4928
View details for Web of Science ID 001126299600001
-
Technical Note-The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling
OPERATIONS RESEARCH
2022
View details for DOI 10.1287/opre.2022.2274
View details for Web of Science ID 000836245000001
-
On Low-rank Trace Regression under General Sampling Distribution
JOURNAL OF MACHINE LEARNING RESEARCH
2022; 23
View details for Web of Science ID 001003301100001
-
Frustration With Technology and its Relation to Emotional Exhaustion Among Health Care Workers: Cross-sectional Observational Study.
Journal of medical Internet research
2021; 23 (7): e26817
Abstract
BACKGROUND: New technology adoption is common in health care, but it may elicit frustration if end users are not sufficiently considered in their design or trained in their use. These frustrations may contribute to burnout. OBJECTIVE: This study aimed to evaluate and quantify health care workers' frustration with technology and its relationship with emotional exhaustion, after controlling for measures of work-life integration that may indicate excessive job demands. METHODS: This was a cross-sectional, observational study of health care workers across 31 Michigan hospitals. We used the Safety, Communication, Operational Reliability, and Engagement (SCORE) survey to measure work-life integration and emotional exhaustion among the survey respondents. We used mixed-effects hierarchical linear regression to evaluate the relationship among frustration with technology, other components of work-life integration, and emotional exhaustion, with adjustment for unit and health care worker characteristics. RESULTS: Of 15,505 respondents, 5065 (32.7%) reported that they experienced frustration with technology on at least 3-5 days per week. Frustration with technology was associated with higher scores for the composite Emotional Exhaustion scale (r=0.35, P<.001) and each individual item on the Emotional Exhaustion scale (r=0.29-0.36, P<.001 for all). Each 10-point increase in the frustration with technology score was associated with a 1.2-point increase (95% CI 1.1-1.4) in emotional exhaustion (both measured on 100-point scales), after adjustment for other work-life integration items and unit and health care worker characteristics. CONCLUSIONS: This study found that frustration with technology and several other markers of work-life integration are independently associated with emotional exhaustion among health care workers.
Frustration with technology is common but not ubiquitous among health care workers, and it is one of several work-life integration factors associated with emotional exhaustion. Minimizing frustration with health care technology may be an effective approach in reducing burnout among health care workers.
View details for DOI 10.2196/26817
View details for PubMedID 34255674
-
Matrix Completion Methods for Causal Panel Data Models
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
2021
View details for DOI 10.1080/01621459.2021.1891924
View details for Web of Science ID 000648806000001
-
Mostly Exploration-Free Algorithms for Contextual Bandits
MANAGEMENT SCIENCE
2021; 67 (3)
View details for DOI 10.1287/mnsc.2020.3605
View details for Web of Science ID 000632021900001
-
PatientFlowNet: A Deep Learning Approach to Patient Flow Prediction in Emergency Departments
IEEE ACCESS
2021; 9: 45552–61
View details for DOI 10.1109/ACCESS.2021.3066164
View details for Web of Science ID 000634457400001
-
Online Decision Making with High-Dimensional Covariates
OPERATIONS RESEARCH
2020; 68 (1): 276–94
View details for DOI 10.1287/opre.2019.1902
View details for Web of Science ID 000509473400015
-
Recommendation on a Budget: Column Space Recovery from Partially Observed Entries with Random or Active Sampling
ADDISON-WESLEY PUBL CO. 2020: 445–54
View details for Web of Science ID 000559931301072
-
Evidence of Upcoding in Pay-for-Performance Programs
MANAGEMENT SCIENCE
2019; 65 (3): 1042–60
View details for DOI 10.1287/mnsc.2017.2996
View details for Web of Science ID 000461928600005
-
Scalable Approximations for Generalized Linear Problems
JOURNAL OF MACHINE LEARNING RESEARCH
2019; 20
View details for Web of Science ID 000458663200001
-
Personalizing Many Decisions with High-Dimensional Covariates
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
View details for Web of Science ID 000535866903014
-
Personalizing Many Decisions with High-Dimensional Covariates
Advances in Neural Information Processing Systems 32
2019
-
Generating Random Networks Without Short Cycles
OPERATIONS RESEARCH
2018; 66 (5): 1227–46
View details for DOI 10.1287/opre.2018.1730
View details for Web of Science ID 000446179500004
-
Data Uncertainty in Markov Chains: Application to Cost-Effectiveness Analyses of Medical Innovations
OPERATIONS RESEARCH
2018; 66 (3): 697–715
View details for DOI 10.1287/opre.2017.1685
View details for Web of Science ID 000441553700007
-
Accurate Emergency Department Wait Time Prediction
M&SOM-MANUFACTURING & SERVICE OPERATIONS MANAGEMENT
2016; 18 (1): 141-156
View details for DOI 10.1287/msom.2015.0560
View details for Web of Science ID 000375601500010
-
Scaled Least Squares Estimator for GLMs in Large-Scale Problems
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2016
View details for Web of Science ID 000458973704047
-
Statistical analysis of a low cost method for multiple disease prediction.
Statistical methods in medical research
2016: 962280216680242
Abstract
Early identification of individuals at risk for chronic diseases is of significant clinical value. Early detection provides the opportunity to slow the pace of a condition, and thus help individuals to improve or maintain their quality of life. Additionally, it can lessen the financial burden on health insurers and self-insured employers. As a solution to mitigate the rise in chronic conditions and related costs, an increasing number of employers have recently begun using wellness programs, which typically involve an annual health risk assessment. Unfortunately, these risk assessments have low detection capability, as they should be low-cost and hence rely on collecting relatively few basic biomarkers. Thus one may ask, how can we select a low-cost set of biomarkers that would be the most predictive of multiple chronic diseases? In this paper, we propose a statistical data-driven method to address this challenge by minimizing the number of biomarkers in the screening procedure while maximizing the predictive power over a broad spectrum of diseases. Our solution uses multi-task learning and group dimensionality reduction from machine learning and statistics. We provide empirical validation of the proposed solution using data from two different electronic medical records systems, with comparisons over a statistical benchmark.
View details for DOI 10.1177/0962280216680242
View details for PubMedID 27932665
-
Active Postmarketing Drug Surveillance for Multiple Adverse Events
OPERATIONS RESEARCH
2015; 63 (6): 1528-1546
View details for DOI 10.1287/opre.2015.1435
View details for Web of Science ID 000367833500019
-
UNIVERSALITY IN POLYTOPE PHASE TRANSITIONS AND MESSAGE PASSING ALGORITHMS
ANNALS OF APPLIED PROBABILITY
2015; 25 (2): 753-822
View details for DOI 10.1214/14-AAP1010
View details for Web of Science ID 000350708000012
-
Bargaining dynamics in exchange networks
JOURNAL OF ECONOMIC THEORY
2015; 156: 417-454
View details for DOI 10.1016/j.jet.2014.02.007
View details for Web of Science ID 000349728700014
-
A Low-Cost Method for Multiple Disease Prediction.
AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
2015; 2015: 329-338
Abstract
Recently, in response to the rising costs of healthcare services, employers that are financially responsible for the healthcare costs of their workforce have been investing in health improvement programs for their employees. A main objective of these so-called "wellness programs" is to reduce the incidence of chronic illnesses such as cardiovascular disease, cancer, diabetes, and obesity, with the goal of reducing future medical costs. The majority of these wellness programs include an annual screening to detect individuals with the highest risk of developing chronic disease. Once these individuals are identified, the company can invest in interventions to reduce the risk of those individuals. However, capturing many biomarkers per employee creates a costly screening procedure. We propose a statistical data-driven method to address this challenge by minimizing the number of biomarkers in the screening procedure while maximizing the predictive power over a broad spectrum of diseases. Our solution uses multi-task learning and group dimensionality reduction from machine learning and statistics. We provide empirical validation of the proposed solution using data from two different electronic medical records systems, with comparisons to a statistical benchmark.
View details for PubMedID 26958164
-
Identifying Patients at High Risk for Readmission following Treatment for Acute Myocardial Infarction: a Data-Centric Approach
LIPPINCOTT WILLIAMS & WILKINS. 2014
View details for Web of Science ID 000209790203144
-
Data-driven decisions for reducing readmissions for heart failure: general methodology and case study.
PloS one
2014; 9 (10): e109264
Abstract
Several studies have focused on stratifying patients according to their level of readmission risk, fueled in part by incentive programs in the U.S. that link readmission rates to the annual payment update by Medicare. Patient-specific predictions about readmission have not seen widespread use because of their limited accuracy and questions about the efficacy of using measures of risk to guide clinical decisions. We construct a predictive model for readmissions for congestive heart failure (CHF) and study how its predictions can be used to perform patient-specific interventions. We assess the cost-effectiveness of a methodology that combines prediction and decision making to allocate interventions. The results highlight the importance of combining predictions with decision analysis. We construct a statistical classifier from a retrospective database of 793 hospital visits for heart failure that predicts the likelihood that patients will be rehospitalized within 30 days of discharge. We introduce a decision analysis that uses the predictions to guide decisions about post-discharge interventions. We perform a cost-effectiveness analysis of 379 additional hospital visits that were not included in either the formulation of the classifiers or the decision analysis. We report the performance of the methodology and show the overall expected value of employing a real-time decision system. For the cohort studied, readmissions are associated with a mean cost of $13,679 with a standard error of $1,214. Given a post-discharge plan that costs $1,300 and that reduces 30-day rehospitalizations by 35%, use of the proposed methods would provide an 18.2% reduction in rehospitalizations and save 3.8% of costs. Classifiers learned automatically from patient data can be joined with decision analysis to guide the allocation of post-discharge support to CHF patients.
Such analyses are especially valuable in the common situation where it is not economically feasible to provide programs to all patients.
View details for DOI 10.1371/journal.pone.0109264
View details for PubMedID 25295524
View details for PubMedCentralID PMC4190088
-
COMBINATORIAL APPROACH TO THE INTERPOLATION METHOD AND SCALING LIMITS IN SPARSE RANDOM GRAPHS
ANNALS OF PROBABILITY
2013; 41 (6): 4080-4115
View details for DOI 10.1214/12-AOP816
View details for Web of Science ID 000328255600008
-
Message-Passing Algorithms for Sparse Network Alignment
ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA
2013; 7 (1)
View details for DOI 10.1145/2435209.2435212
View details for Web of Science ID 000316415700003
-
Estimating LASSO Risk and Noise Level
Neural Information Processing Systems 26
2013
-
The LASSO Risk for Gaussian Matrices
IEEE TRANSACTIONS ON INFORMATION THEORY
2012; 58 (4): 1997-2017
View details for DOI 10.1109/TIT.2011.2174612
View details for Web of Science ID 000302079800001
-
The Dynamics of Message Passing on Dense Graphs, with Applications to Compressed Sensing
IEEE International Symposium on Information Theory
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. 2011: 764–85
View details for DOI 10.1109/TIT.2010.2094817
View details for Web of Science ID 000286514200014
-
BELIEF PROPAGATION FOR WEIGHTED b-MATCHINGS ON ARBITRARY GRAPHS AND ITS RELATION TO LINEAR PROGRAMS WITH INTEGER SOLUTIONS
SIAM JOURNAL ON DISCRETE MATHEMATICS
2011; 25 (2): 989-1011
View details for DOI 10.1137/090753115
View details for Web of Science ID 000292302000033
-
A Sequential Algorithm for Generating Random Graphs
ALGORITHMICA
2010; 58 (4): 860-910
View details for DOI 10.1007/s00453-009-9340-1
View details for Web of Science ID 000282089900003
-
A rigorous analysis of the cavity equations for the minimum spanning tree
JOURNAL OF MATHEMATICAL PHYSICS
2008; 49 (12)
View details for DOI 10.1063/1.2982805
View details for Web of Science ID 000262225000007
-
Statistical mechanics of Steiner trees
PHYSICAL REVIEW LETTERS
2008; 101 (3): 037208
Abstract
The minimum weight Steiner tree (MST) is an important combinatorial optimization problem over networks that has applications in a wide range of fields. Here we discuss a general technique to translate the imposed global connectivity constraint into many local ones that can be analyzed with cavity equation techniques. This approach leads to a new optimization algorithm for MST and allows us to analyze the statistical mechanics properties of MST on random graphs of various types.
View details for DOI 10.1103/PhysRevLett.101.037208
View details for Web of Science ID 000258184500053
View details for PubMedID 18764290
-
On the exactness of the cavity method for weighted b-matchings on arbitrary graphs and its relation to linear programs
JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT
2008
View details for DOI 10.1088/1742-5468/2008/06/L06001
View details for Web of Science ID 000257339300001
-
Max-product for maximum weight matching: Convergence, correctness, and LP duality
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC. 2008: 1241–51
View details for DOI 10.1109/TIT.2007.915695
View details for Web of Science ID 000253602200019
-
Simple Deterministic Approximation Algorithms for Counting Matchings
STOC 07: PROCEEDINGS OF THE 39TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING
2007: 122-127
View details for Web of Science ID 000267050000014
-
Iterative scheduling algorithms
26th IEEE Conference on Computer Communications (INFOCOM 2007)
IEEE. 2007: 445–453
View details for Web of Science ID 000249117700051
-
A simpler max-product Maximum Weight Matching algorithm and the auction algorithm
IEEE International Symposium on Information Theory
IEEE. 2006: 557–561
View details for Web of Science ID 000245289701016
-
Maximum weight matching via max-product belief propagation
IEEE International Symposium on Information Theory and Its Applications
IEEE. 2005: 1763–1767
View details for Web of Science ID 000234713801117
-
Achieving stability in networks of input-queued switches using a local online scheduling policy
GLOBECOM '05: IEEE GLOBAL TELECOMMUNICATIONS CONFERENCE, VOLS 1-6
2005: 694-698
View details for Web of Science ID 000234989601013