John Duchi
Associate Professor of Statistics, of Electrical Engineering and, by courtesy, of Computer Science
Bio
John Duchi (http://web.stanford.edu/~jduchi/) is an assistant professor of Statistics and Electrical Engineering and (by courtesy) Computer Science at Stanford University. His work spans statistical learning, optimization, information theory, and computation, with a few driving goals. (1) To discover statistical learning procedures that optimally trade between resources--computation, communication, privacy provided to study participants--while maintaining good statistical performance. (2) To build efficient large-scale optimization methods that address the spectrum of optimization, machine learning, and data analysis problems we face, allowing us to move beyond bespoke solutions to methods that robustly work. (3) To develop tools to assess and guarantee the validity of--and confidence we should have in--machine-learned systems.
Academic Appointments
-
Associate Professor, Statistics
-
Associate Professor, Electrical Engineering
-
Associate Professor (By courtesy), Computer Science
-
Member, Bio-X
Honors & Awards
-
SIAM/OPT Early Career Prize, Society for Industrial and Applied Mathematics, Optimization Group (2020)
-
Young Investigator Award, Office of Naval Research (2019)
-
Best Paper Award, Neural Information Processing Systems (2017)
-
Sloan Foundation Fellow in Mathematics, Sloan Foundation (2016)
-
Doctoral Dissertation Award (Honorable Mention), Association for Computing Machinery (2015)
-
Okawa Foundation Award, Okawa Foundation (2015)
-
C.V. Ramamoorthy Distinguished Research Award, University of California, Berkeley (2014)
-
Best Student Paper Award, International Conference on Machine Learning (2010)
Professional Education
-
BS, Stanford University, Computer Science (2007)
-
MS, Stanford University, Computer Science (2008)
-
MA, University of California, Berkeley, Statistics (2012)
-
PhD, University of California, Berkeley, EECS (2014)
Current Research and Scholarly Interests
My work spans statistical learning, optimization, information theory, and computation, with a few driving goals: 1. To discover statistical learning procedures that optimally trade between real-world resources while maintaining statistical efficiency. 2. To build efficient large-scale optimization methods that move beyond bespoke solutions to methods that robustly work. 3. To develop tools to assess and guarantee the validity of---and confidence we should have in---machine-learned systems.
2024-25 Courses
- Introduction to Matrix Methods
ENGR 108 (Aut) - Introduction to Probability Theory
STATS 117 (Aut) - Modern Applied Statistics: Learning
STATS 315A (Win) -
Independent Studies (17)
- Advanced Reading and Research
CS 499 (Aut, Win, Spr, Sum) - Advanced Reading and Research
CS 499P (Aut, Win, Spr, Sum) - Curricular Practical Training
CS 390A (Aut, Win, Spr, Sum) - Curricular Practical Training
CS 390B (Aut, Win, Spr, Sum) - Independent Study
DATASCI 199 (Aut, Win, Spr, Sum) - Independent Study
STATS 199 (Aut, Win, Spr, Sum) - Independent Study
STATS 299 (Aut, Win, Spr, Sum) - Industrial Research for Statisticians
STATS 398 (Aut, Win, Spr, Sum) - Master's Thesis and Thesis Research
EE 300 (Aut, Win, Spr, Sum) - Part-time Curricular Practical Training
CS 390D (Aut, Win, Spr, Sum) - Research
STATS 399 (Aut, Win, Spr, Sum) - Special Studies and Reports in Electrical Engineering
EE 191 (Aut, Win, Spr, Sum) - Special Studies and Reports in Electrical Engineering
EE 191A (Aut, Win, Spr) - Special Studies and Reports in Electrical Engineering
EE 391 (Aut, Win, Spr, Sum) - Special Studies and Reports in Electrical Engineering (WIM)
EE 191W (Aut, Win, Spr, Sum) - Special Studies or Projects in Electrical Engineering
EE 190 (Aut, Win, Spr, Sum) - Special Studies or Projects in Electrical Engineering
EE 390 (Aut, Win, Spr, Sum)
- Advanced Reading and Research
-
Prior Year Courses
2023-24 Courses
- Information Theory and Statistics
EE 377, STATS 311 (Aut) - Introduction to Matrix Methods
ENGR 108 (Aut) - Mathematics of Convexity
EE 364M (Win) - Modern Applied Statistics: Learning
STATS 315A (Win)
2022-23 Courses
- Applied Statistics I
STATS 305A (Aut) - Introduction to Matrix Methods
ENGR 108 (Spr)
2021-22 Courses
- Applied Statistics I
STATS 305A (Aut) - Information Theory and Statistics
EE 377, STATS 311 (Aut) - Theory of Probability
STATS 116 (Spr)
- Information Theory and Statistics
Stanford Advisees
-
Doctoral Dissertation Reader (AC)
John Cherian, Anmol Kagrecha, Basil Saeed -
Doctoral Dissertation Advisor (AC)
Felipe Areces, Gary Cheng, Nicole Meister -
Doctoral Dissertation Co-Advisor (AC)
Chen Cheng -
Master's Program Advisor
George Davis, Sixian Du, Haoyi Duan, Aksh Garg, Jake Hofgard, Yiling Huang, Sneha Jayaganthan, Yichen Jiang, Zixin Li, Yichun Qian, John Siddiqui, Yuqiao Zeng -
Doctoral (Program)
Felipe Areces, Andy Dong, Saminul Haque, Rohith Kuditipudi, Christopher Mohri, Pratik Rathore, Alan Yang, Fangzhao Zhang
All Publications
-
THE RIGHT COMPLEXITY MEASURE IN LOCALLY PRIVATE ESTIMATION: IT IS NOT THE FISHER INFORMATION
ANNALS OF STATISTICS
2024; 52 (1): 1-51
View details for DOI 10.1214/22-AOS2227
View details for Web of Science ID 001181807000014
-
Robust Validation: Confident Predictions Even When Distributions Shift
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
2024
View details for DOI 10.1080/01621459.2023.2298037
View details for Web of Science ID 001156001000001
-
Predictive Inference with Weak Supervision
JOURNAL OF MACHINE LEARNING RESEARCH
2024; 25
View details for Web of Science ID 001225467300001
-
BOUNDS ON THE CONDITIONAL AND AVERAGE TREATMENT EFFECT WITH UNOBSERVED CONFOUNDING FACTORS
ANNALS OF STATISTICS
2022; 50 (5): 2587-2615
View details for DOI 10.1214/22-AOS2195
View details for Web of Science ID 000964342400006
-
BOUNDS ON THE CONDITIONAL AND AVERAGE TREATMENT EFFECT WITH UNOBSERVED CONFOUNDING FACTORS.
Annals of statistics
2022; 50 (5): 2587-2615
Abstract
For observational studies, we study the sensitivity of causal inference when treatment assignments may depend on unobserved confounders. We develop a loss minimization approach for estimating bounds on the conditional average treatment effect (CATE) when unobserved confounders have a bounded effect on the odds ratio of treatment selection. Our approach is scalable and allows flexible use of model classes in estimation, including nonparametric and black-box machine learning methods. Based on these bounds for the CATE, we propose a sensitivity analysis for the average treatment effect (ATE). Our semiparametric estimator extends/bounds the augmented inverse propensity weighted (AIPW) estimator for the ATE under bounded unobserved confounding. By constructing a Neyman orthogonal score, our estimator of the bound for the ATE is a regular root-n estimator so long as the nuisance parameters are estimated at the opn-1/4 rate. We complement our methodology with optimality results showing that our proposed bounds are tight in certain cases. We demonstrate our method on simulated and real data examples, and show accurate coverage of our confidence intervals in practical finite sample regimes with rich covariate information.
View details for DOI 10.1214/22-aos2195
View details for PubMedID 38050638
View details for PubMedCentralID PMC10694186
-
Distributionally Robust Losses for Latent Covariate Mixtures
OPERATIONS RESEARCH
2022
View details for DOI 10.1287/opre.2022.2363
View details for Web of Science ID 000851350700001
-
Mean Estimation From One-Bit Measurements
IEEE TRANSACTIONS ON INFORMATION THEORY
2022; 68 (9): 6276-6296
View details for DOI 10.1109/TIT.2022.3174409
View details for Web of Science ID 000843258800039
-
Lower bounds for non-convex stochastic optimization
MATHEMATICAL PROGRAMMING
2022
View details for DOI 10.1007/s10107-022-01822-7
View details for Web of Science ID 000809796800002
-
Private optimization in the interpolation regime: faster rates and hardness results
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2022
View details for Web of Science ID 000899944901002
-
Accelerated, Optimal and Parallel: Some results on model-based stochastic optimization
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2022
View details for Web of Science ID 000899944902039
-
Statistics of Robust Optimization: A Generalized Empirical Likelihood Approach
MATHEMATICS OF OPERATIONS RESEARCH
2021; 46 (3): 946-969
View details for DOI 10.1287/moor.2020.1085
View details for Web of Science ID 000686221800006
-
LEARNING MODELS WITH UNIFORM PERFORMANCE VIA DISTRIBUTIONALLY ROBUST OPTIMIZATION
ANNALS OF STATISTICS
2021; 49 (3): 1378-1406
View details for DOI 10.1214/20-AOS2004
View details for Web of Science ID 000684378300006
-
ASYMPTOTIC OPTIMALITY IN STOCHASTIC OPTIMIZATION
ANNALS OF STATISTICS
2021; 49 (1): 21–48
View details for DOI 10.1214/19-AOS1831
View details for Web of Science ID 000614187400002
-
A constrained risk inequality for general losses
MICROTOME PUBLISHING. 2021
View details for Web of Science ID 000659893801003
-
Knowing what You Know: valid and validated confidence sets in multiclass and multilabel prediction
JOURNAL OF MACHINE LEARNING RESEARCH
2021; 22
View details for Web of Science ID 000656399500001
-
Private Adaptive Gradient Methods for Convex Optimization
JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2021
View details for Web of Science ID 000683104600036
-
Misspecification in Prediction Problems and Robustness via Improper Learning
MICROTOME PUBLISHING. 2021
View details for Web of Science ID 000659893802060
-
Lower bounds for finding stationary points II: first-order methods
MATHEMATICAL PROGRAMMING
2021; 185 (1-2): 315–55
View details for DOI 10.1007/s10107-019-01431-x
View details for Web of Science ID 000608923300010
-
Lower bounds for finding stationary points I
MATHEMATICAL PROGRAMMING
2020; 184 (1-2): 71–120
View details for DOI 10.1007/s10107-019-01406-y
View details for Web of Science ID 000578114700003
-
First-Order Methods for Nonconvex Quadratic Minimization
SIAM REVIEW
2020; 62 (2): 395–436
View details for DOI 10.1137/20M1321759
View details for Web of Science ID 000551389300003
-
Solving (most) of a set of quadratic equalities: composite optimization for robust phase retrieval
INFORMATION AND INFERENCE-A JOURNAL OF THE IMA
2019; 8 (3): 471–529
View details for DOI 10.1093/imaiai/iay015
View details for Web of Science ID 000493304200003
-
GRADIENT DESCENT FINDS THE CUBIC-REGULARIZED NONCONVEX NEWTON STEP
SIAM JOURNAL ON OPTIMIZATION
2019; 29 (3): 2146–78
View details for DOI 10.1137/17M1113898
View details for Web of Science ID 000487929500017
-
The importance of better models in stochastic optimization.
Proceedings of the National Academy of Sciences of the United States of America
2019
Abstract
Standard stochastic optimization methods are brittle, sensitive to stepsize choice and other algorithmic parameters, and they exhibit instability outside of well-behaved families of objectives. To address these challenges, we investigate models for stochastic optimization and learning problems that exhibit better robustness to problem families and algorithmic parameters. With appropriately accurate models-which we call the aprox family-stochastic methods can be made stable, provably convergent, and asymptotically optimal; even modeling that the objective is nonnegative is sufficient for this stability. We extend these results beyond convexity to weakly convex objectives, which include compositions of convex losses with smooth functions common in modern machine learning. We highlight the importance of robustness and accurate modeling with experimental evaluation of convergence time and algorithm sensitivity.
View details for DOI 10.1073/pnas.1908018116
View details for PubMedID 31666325
-
Modeling simple structures and geometry for better stochastic optimization algorithms
MICROTOME PUBLISHING. 2019
View details for Web of Science ID 000509687902049
-
Necessary and Sufficient Geometries for Gradient Methods
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
View details for Web of Science ID 000535866903016
-
Unlabeled Data Improves Adversarial Robustness
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
View details for Web of Science ID 000535866902078
-
Variance-based Regularization with Convex Objectives
JOURNAL OF MACHINE LEARNING RESEARCH
2019; 20
View details for Web of Science ID 000467894600001
-
STOCHASTIC (APPROXIMATE) PROXIMAL POINT METHODS: CONVERGENCE, OPTIMALITY, AND ADAPTIVITY
SIAM JOURNAL ON OPTIMIZATION
2019; 29 (3): 2257–90
View details for DOI 10.1137/18M1230323
View details for Web of Science ID 000487929500021
-
Minimax Optimal Procedures for Locally Private Estimation Rejoinder
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
2018; 113 (521): 212–15
View details for DOI 10.1080/01621459.2018.1442611
View details for Web of Science ID 000438960500024
-
STOCHASTIC METHODS FOR COMPOSITE AND WEAKLY CONVEX OPTIMIZATION PROBLEMS
SIAM JOURNAL ON OPTIMIZATION
2018; 28 (4): 3229–59
View details for DOI 10.1137/17M1135086
View details for Web of Science ID 000453750000019
-
Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
View details for Web of Science ID 000461852004039
-
Generalizing to Unseen Domains via Adversarial Data Augmentation
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
View details for Web of Science ID 000461823305036
-
Analysis of Krylov Subspace Solutions of Regularized Nonconvex Quadratic Problems
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2018
View details for Web of Science ID 000461852005030
-
Minimax Optimal Procedures for Locally Private Estimation
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
2018; 113 (521): 182–201
View details for DOI 10.1080/01621459.2017.1389735
View details for Web of Science ID 000438960500018
-
ACCELERATED METHODS FOR NONCONVEX OPTIMIZATION
SIAM JOURNAL ON OPTIMIZATION
2018; 28 (2): 1751–72
View details for DOI 10.1137/17M1114296
View details for Web of Science ID 000436991600031
-
Mean Estimation from Adaptive One-bit Measurements
IEEE. 2017: 1000–1007
View details for Web of Science ID 000428047800137
-
Variance-based Regularization with Convex Objectives
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
View details for Web of Science ID 000452649403004
-
Unsupervised Transformation Learning via Convex Relaxations
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2017
View details for Web of Science ID 000452649406090
-
Simultaneous dimension reduction and adjustment for confounding variation
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2016; 113 (51): 14662-14667
Abstract
Dimension reduction methods are commonly applied to high-throughput biological datasets. However, the results can be hindered by confounding factors, either biological or technical in origin. In this study, we extend principal component analysis (PCA) to propose AC-PCA for simultaneous dimension reduction and adjustment for confounding (AC) variation. We show that AC-PCA can adjust for (i) variations across individual donors present in a human brain exon array dataset and (ii) variations of different species in a model organism ENCODE RNA sequencing dataset. Our approach is able to recover the anatomical structure of neocortical regions and to capture the shared variation among species during embryonic development. For gene selection purposes, we extend AC-PCA with sparsity constraints and propose and implement an efficient algorithm. The methods developed in this paper can also be applied to more general settings. The R package and MATLAB source code are available at https://github.com/linzx06/AC-PCA.
View details for DOI 10.1073/pnas.1617317113
View details for PubMedID 27930330
-
Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm with Minimax Optimal Rates
JOURNAL OF MACHINE LEARNING RESEARCH
2015; 16: 3299-3340
View details for Web of Science ID 000369888000031
-
Optimal Rates for Zero-Order Convex Optimization: The Power of Two Function Evaluations
IEEE TRANSACTIONS ON INFORMATION THEORY
2015; 61 (5): 2788-2806
View details for DOI 10.1109/TIT.2015.2409256
View details for Web of Science ID 000353515700036
-
Dynamic Management of Network Risk from Epidemic Phenomena
IEEE. 2015: 1583–88
View details for Web of Science ID 000381554501121
-
Privacy Aware Learning
JOURNAL OF THE ACM
2014; 61 (6)
View details for DOI 10.1145/2666468
View details for Web of Science ID 000347051200006
-
Privacy: a few definitional aspects and consequences for minimax mean-squared error
IEEE. 2014: 1365–69
View details for Web of Science ID 000370073801085
-
THE ASYMPTOTICS OF RANKING ALGORITHMS
ANNALS OF STATISTICS
2013; 41 (5): 2292-2323
View details for DOI 10.1214/13-AOS1142
View details for Web of Science ID 000327746100002