Madeleine Udell is Assistant Professor of Management Science and Engineering at Stanford University, affiliated with the Institute for Computational and Mathematical Engineering (ICME) and holding a courtesy appointment in Electrical Engineering. She is also Associate Professor with tenure (on leave) of Operations Research and Information Engineering and Richard and Sybil Smith Sesquicentennial Fellow at Cornell University.
She completed her PhD at Stanford in Computational and Mathematical Engineering and a postdoctoral fellowship at the Center for the Mathematics of Information at Caltech.
Her research aims to accelerate and simplify large-scale data analysis and optimization,
with impact on challenges in healthcare, finance, marketing, operations, and engineering systems design, among others.
Her work in optimization seeks to detect and exploit novel structures,
leading to faster and more memory-efficient algorithms,
automatic proofs of optimality, better complexity guarantees,
and user-friendly optimization solvers and modeling languages.
Her work in machine learning centers on challenges of data preprocessing, interpretability, and causality,
which are critical to practical application in domains with messy data.
Her awards include a Kavli Fellowship (2023), an Alfred P. Sloan Research Fellowship (2021), an NSF CAREER award (2020), and an ONR Young Investigator Award (2020).
BS, Yale University, Mathematics and Physics (2009)
PhD, Stanford University, Computational and Mathematical Engineering (2015)
Current Research and Scholarly Interests
Professor Udell develops new techniques to accelerate and automate data science,
with a focus on large-scale optimization and on data preprocessing,
and with applications in medical informatics, engineering system design, and automated machine learning.
- Sparse Data Reconstruction, Missing Value and Multiple Imputation through Matrix Factorization SOCIOLOGICAL METHODOLOGY 2023; 53 (1): 72-114
- RANDOMIZED NYSTROM PRECONDITIONING SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS 2023; 44 (2): 718-752
- A strict complementarity approach to error bound and sensitivity of solution of conic programs OPTIMIZATION LETTERS 2022
- NysADMM: faster composite convex optimization via low-rank approximation JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2022 (Web of Science ID 000900130208002)
- Online Missing Value Imputation and Change Point Detection with the Gaussian Copula ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2022: 9199-9207 (Web of Science ID 000893639102025)
- CONTROLBURN: Feature Selection by Sparse Forests ASSOC COMPUTING MACHINERY. 2021: 1045-1054
- Scalable Semidefinite Programming SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE 2021; 3 (1): 171-200
- Robust Non-Linear Matrix Factorization for Dictionary Learning, Denoising, and Clustering IEEE TRANSACTIONS ON SIGNAL PROCESSING 2021; 69: 1755-1770
- RANDOMIZED SKETCHING ALGORITHMS FOR LOW-MEMORY DYNAMIC OPTIMIZATION SIAM JOURNAL ON OPTIMIZATION 2021; 31 (2): 1242-1275
- TenIPS: Inverse Propensity Sampling for Tensor Completion MICROTOME PUBLISHING. 2021 (Web of Science ID 000659893803078)
- ON THE SIMPLICITY AND CONDITIONING OF LOW RANK SEMIDEFINITE PROGRAMS SIAM JOURNAL ON OPTIMIZATION 2021; 31 (4): 2614-2637
- An Optimal-Storage Approach to Semidefinite Programming Using Approximate Complementarity SIAM JOURNAL ON OPTIMIZATION 2021; 31 (4): 2695-2725
- Dynamic Assortment Personalization in High Dimensions OPERATIONS RESEARCH 2020; 68 (4): 1020-1037
- Low-Rank Tucker Approximation of a Tensor from Streaming Data SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE 2020; 2 (4): 1123-1150
- AutoML Pipeline Selection: Efficiently Navigating the Combinatorial Space ASSOC COMPUTING MACHINERY. 2020: 1446-1456
- Missing Value Imputation for Mixed Data via Gaussian Copula ASSOC COMPUTING MACHINERY. 2020: 636-646
- Polynomial Matrix Completion for Missing Data Imputation and Transductive Learning ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2020: 3842-3849 (Web of Science ID 000667722803112)
- Optimal Design of Efficient Rooftop Photovoltaic Arrays INFORMS JOURNAL ON APPLIED ANALYTICS 2019; 49 (4): 281-294 (Web of Science ID 000483384100011)
- Factor Group-Sparse Regularization for Efficient Low-Rank Matrix Recovery NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019 (Web of Science ID 000534424305014)
- Why Are Big Data Matrices Approximately Low Rank? SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE 2019; 1 (1): 144-160
- Fairness Under Unawareness: Assessing Disparity When Protected Class Is Unobserved ASSOC COMPUTING MACHINERY. 2019: 339-348
- STREAMING LOW-RANK MATRIX APPROXIMATION WITH AN APPLICATION TO SCIENTIFIC SIMULATION SIAM JOURNAL ON SCIENTIFIC COMPUTING 2019; 41 (4): A2430-A2463
- OBOE: Collaborative Filtering for AutoML Model Selection ASSOC COMPUTING MACHINERY. 2019: 1173-1183
- Online high rank matrix completion IEEE. 2019: 8682-8690
- Causal Inference with Noisy and Missing Covariates via Matrix Factorization NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018 (Web of Science ID 000461852001046)
- Frank-Wolfe Style Algorithms for Large Scale Optimization LARGE-SCALE AND DISTRIBUTED OPTIMIZATION 2018; 2227: 215-245
- Limited memory Kelley's Method Converges for Composite Convex and Submodular Objectives NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018 (Web of Science ID 000461823304043)
- Disciplined Multi-Convex Programming IEEE. 2017: 895-900 (Web of Science ID 000427082203091)
- Graph-Regularized Generalized Low-Rank Models IEEE. 2017: 1921-1926
- PRACTICAL SKETCHING ALGORITHMS FOR LOW-RANK MATRIX APPROXIMATION SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS 2017; 38 (4): 1454-1485
- Sketchy Decisions: Convex Low-Rank Matrix Optimization with Optimal Storage MICROTOME PUBLISHING. 2017: 1188-1196 (Web of Science ID 000509368500127)
- Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017 (Web of Science ID 000452649401026)
- Bounding duality gap for separable problems with linear constraints COMPUTATIONAL OPTIMIZATION AND APPLICATIONS 2016; 64 (2): 355-378
- DISCOVERING PATIENT PHENOTYPES USING GENERALIZED LOW RANK MODELS Pacific Symposium on Biocomputing 2016; 21: 144-155
The practice of medicine is predicated on discovering commonalities or distinguishing characteristics among patients to inform corresponding treatment. Given a patient grouping (hereafter referred to as a phenotype), clinicians can implement a treatment pathway accounting for the underlying cause of disease in that phenotype. Traditionally, phenotypes have been discovered by intuition, experience in practice, and advancements in basic science, but these approaches are often heuristic, labor intensive, and can take decades to produce actionable knowledge. Although our understanding of disease has progressed substantially in the past century, there are still important domains in which our phenotypes are murky, such as in behavioral health or in hospital settings. To accelerate phenotype discovery, researchers have used machine learning to find patterns in electronic health records, but have often been thwarted by missing data, sparsity, and data heterogeneity. In this study, we use a flexible framework called Generalized Low Rank Modeling (GLRM) to overcome these barriers and discover phenotypes in two sources of patient data. First, we analyze data from the 2010 Healthcare Cost and Utilization Project National Inpatient Sample (NIS), which contains upwards of 8 million hospitalization records consisting of administrative codes and demographic information. Second, we analyze a small (N=1746), local dataset documenting the clinical progression of autism spectrum disorder patients using granular features from the electronic health record, including text from physician notes. We demonstrate that low rank modeling successfully captures known and putative phenotypes in these vastly different datasets.
PubMed ID 26776181
- Revealed Preference at Scale: Learning Personalized Preferences from Assortment Choices ASSOC COMPUTING MACHINERY. 2016: 821-837
- The Sound of APALM Clapping: Faster Nonsmooth Nonconvex Optimization with Stochastic Asynchronous PALM NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2016 (Web of Science ID 000458973703064)
- Introduction FOUNDATIONS AND TRENDS IN MACHINE LEARNING 2016; 9 (1): 2-+
- FACTORIZATION FOR ANALOG-TO-DIGITAL MATRIX MULTIPLICATION IEEE. 2015: 1061-1065 (Web of Science ID 000427402901036)
- Revenue Maximization for Broadband Service Providers Using Revenue Capacity IEEE. 2015
- Incorporation of flexible objectives and time-linked simulation with flux balance analysis Journal of Theoretical Biology 2014; 345: 12-21
We present two modifications of the flux balance analysis (FBA) metabolic modeling framework which relax implicit assumptions of the biomass reaction. Our flexible flux balance analysis (flexFBA) objective removes the fixed proportion between reactants, and can therefore produce a subset of biomass reactants. Our time-linked flux balance analysis (tFBA) simulation removes the fixed proportion between reactants and byproducts, and can therefore describe transitions between metabolic steady states. Used together, flexFBA and tFBA model a time scale shorter than the regulatory and growth steady state encoded by the biomass reaction. This combined short-time FBA method is intended for integrated modeling applications to enable detailed and dynamic depictions of microbial physiology such as whole-cell modeling. For example, when modeling Escherichia coli, it avoids artifacts caused by low-copy-number enzymes in single-cell models with kinetic bounds. Even outside integrated modeling contexts, the detailed predictions of flexFBA and tFBA complement existing FBA techniques. We show detailed metabolite production of in silico knockouts used to identify when correct essentiality predictions are made for the wrong reason.
DOI 10.1016/j.jtbi.2013.12.009; PubMed ID 24361328; PubMed Central ID PMC3933926
- Analyzing patterns of drug use in clinical notes for patient safety AMIA Summits on Translational Science Proceedings 2012; 2012: 63-70
Doctors prescribe drugs for indications that are not FDA approved. Research indicates that 21% of prescriptions filled are for off-label indications. Of those, more than 73% lack supporting scientific evidence. Traditional drug safety alerts may not cover usages that are not FDA approved. Therefore, analyzing patterns of off-label drug usage in the clinical setting is an important step toward reducing the incidence of adverse events and for improving patient safety. We applied term extraction tools on the clinical notes of a million patients to compile a database of statistically significant patterns of drug use. We validated some of the usage patterns learned from the data against sources of known on-label and off-label use. Given our ability to quantify adverse event risks using the clinical notes, this will enable us to address patient safety because we can now rank-order off-label drug use and prioritize the search for their adverse event profiles.
PubMed ID 22779054