Emmanuel Candes
Barnum-Simons Chair of Math and Statistics, and Professor of Statistics and, by courtesy, of Electrical Engineering
Mathematics
Web page: http://www-stat.stanford.edu/~candes
Bio
Emmanuel Candès is the Barnum-Simons Chair in Mathematics and Statistics, a professor of electrical engineering (by courtesy) and a member of the Institute of Computational and Mathematical Engineering at Stanford University. Earlier, Candès was the Ronald and Maxine Linde Professor of Applied and Computational Mathematics at the California Institute of Technology. His research interests are in computational harmonic analysis, statistics, information theory, signal processing and mathematical optimization with applications to the imaging sciences, scientific computing and inverse problems. He received his Ph.D. in statistics from Stanford University in 1998.
Candès has received several awards, including the Alan T. Waterman Award, the highest honor bestowed by the National Science Foundation, which recognizes the achievements of early-career scientists. He has given over 60 plenary lectures at major international conferences, not only in mathematics and statistics but also in many other areas, including biomedical imaging and solid-state physics. He was elected to the National Academy of Sciences and to the American Academy of Arts and Sciences in 2014.
Academic Appointments
-
Professor, Mathematics
-
Professor, Statistics
-
Professor (By courtesy), Electrical Engineering
-
Member, Bio-X
-
Faculty Director, Stanford Data Science
Administrative Appointments
-
Chair, Department of Statistics, Stanford University (2015 - 2018)
Honors & Awards
-
Jack S. Kilby Signal Processing Medal, IEEE (2021)
-
Princess of Asturias Award for Technical & Scientific Research, Foundation Princess of Asturias (2020)
-
IEEE Signal Processing Society Best Paper Award, Signal Processing Magazine, IEEE (2019)
-
Information Theory Society Paper Award, IEEE (2019)
-
Fellow, IEEE (2018)
-
Fellow, American Mathematical Society (AMS) (2018)
-
Fellow, Society for Industrial and Applied Mathematics (SIAM) (2017)
-
MacArthur Fellow, MacArthur Foundation (2017)
-
Ralph E. Kleinman Prize, Society for Industrial and Applied Mathematics (SIAM) (2017)
-
Wald Memorial Lectures, Institute of Mathematical Statistics (2017)
-
Prix Pierre Simon de Laplace, Société Française de Statistique (2016)
-
Beal-Orchard-Hays Prize, Mathematical Optimization Society (2015)
-
George David Birkhoff Prize, American Mathematical Society (AMS) & Society for Industrial and Applied Mathematics (SIAM) (2015)
-
Fellow, American Academy of Arts and Sciences (2014)
-
Invited Plenary Address at ICM 2014, International Mathematical Union (2014)
-
Member, National Academy of Sciences (2014)
-
Outstanding Paper Prize, Society for Industrial and Applied Mathematics (SIAM) (2014)
-
Prix Jean Kuntzmann, Laboratoire Jean Kuntzmann and PERSYVAL-lab (2014)
-
Dannie Heineman Prize, Academy of Sciences at Göttingen (2013)
-
Lagrange Prize in Continuous Optimization, Mathematical Optimization Society (MOS) and Society of Industrial and Applied Mathematics (SIAM) (2012)
-
Collatz Prize, International Council for Industrial and Applied Mathematics (ICIAM) (2011)
-
Simons Chair, Math + X, Simons Foundation (2011)
-
George Pólya Prize, Society of Industrial and Applied Mathematics (SIAM) (2010)
-
Information Theory Society Paper Award, Information Theory Society (2008)
-
Alan T. Waterman Medal, National Science Foundation (2006)
-
James H. Wilkinson Prize in Numerical Analysis and Scientific Computing, Society of Industrial and Applied Mathematics (SIAM) (2005)
-
Best Paper Award, European Association for Signal, Speech and Image Processing (2003)
-
Young Investigator Award, Department of Energy (2002)
-
Sloan Research Fellow, Alfred P. Sloan Foundation (2001-2003)
-
Third Popov Prize in Approximation Theory, Popov Foundation (2001)
-
National Scholarship, Ecole Polytechnique (1990)
Professional Education
-
PhD, Stanford University, Statistics (1998)
-
Diplome Ingenieur, Ecole Polytechnique (1993)
2024-25 Courses
- Applied Matrix Theory
MATH 104 (Aut)
- Theory of Statistics III
STATS 300C (Spr)
Independent Studies (6)
- Advanced Reading and Research
MATH 360 (Aut, Win, Spr, Sum)
- Independent Study
STATS 299 (Aut, Win, Spr, Sum)
- Industrial Research for Statisticians
STATS 398 (Aut, Win, Spr, Sum)
- Ph.D. Research
CME 400 (Aut, Win, Spr, Sum)
- Research
STATS 399 (Aut, Win, Spr, Sum)
- Senior Honors Thesis
MATH 197 (Aut, Win)
Prior Year Courses
2023-24 Courses
- Applied Matrix Theory
MATH 104 (Win)
- Theory of Statistics III
STATS 300C (Spr)
2022-23 Courses
- Applied Matrix Theory
MATH 104 (Win)
- Theory of Statistics III
STATS 300C (Spr)
2021-22 Courses
- Applied Matrix Theory
MATH 104 (Win)
- Theory of Statistics III
STATS 300C (Spr)
Stanford Advisees
-
Doctoral Dissertation Reader (AC)
Paula Gablenz, Will Hartog, Parth Nobel, Ran Xie, Rui Yan
-
Postdoctoral Faculty Sponsor
Jinzhou Li, Yao Zhang, Tijana Zrnic
-
Doctoral Dissertation Advisor (AC)
Zhaomeng Chen, John Cherian, Yash Nair, Ziang Song, Asher Spector
-
Doctoral Dissertation Co-Advisor (AC)
Michael Salerno, Zitong Yang
All Publications
-
De Finetti's theorem and related results for infinite weighted exchangeable sequences
BERNOULLI
2024; 30 (4): 3004-3028
View details for DOI 10.3150/23-BEJ1704
View details for Web of Science ID 001284717300019
-
Second-order group knockoffs with applications to GWAS.
Bioinformatics (Oxford, England)
2024
Abstract
Conditional testing via the knockoff framework allows one to identify, among a large number of possible explanatory variables, those that carry unique information about an outcome of interest, and also provides a false discovery rate guarantee on the selection. This approach is particularly well suited to the analysis of genome-wide association studies (GWAS), which have the goal of identifying genetic variants that influence traits of medical relevance. While conditional testing can be both more powerful and precise than traditional GWAS analysis methods, its vanilla implementation encounters a difficulty common to all multivariate analysis methods: it is challenging to distinguish among multiple, highly correlated regressors. This impasse can be overcome by shifting the object of inference from single variables to groups of correlated variables. To achieve this, it is necessary to construct "group knockoffs." While successful examples are already documented in the literature, this paper substantially expands the set of algorithms and software for group knockoffs. We focus in particular on second-order knockoffs, for which we describe correlation matrix approximations that are appropriate for GWAS data and that result in considerable computational savings. We illustrate the effectiveness of the proposed methods with simulations and with the analysis of albuminuria data from the UK Biobank. The described algorithms are implemented in an open-source Julia package Knockoffs.jl. R and Python wrappers are available as knockoffsr and knockoffspy packages. Supplementary data are available from Bioinformatics online.
View details for DOI 10.1093/bioinformatics/btae580
View details for PubMedID 39340798
-
Cross-prediction-powered inference.
Proceedings of the National Academy of Sciences of the United States of America
2024; 121 (15): e2322083121
Abstract
While reliable data-driven decision-making hinges on high-quality labeled data, the acquisition of quality labels often involves laborious human annotations or slow and expensive scientific measurements. Machine learning is becoming an appealing alternative as sophisticated predictive techniques are being used to quickly and cheaply produce large amounts of predicted labels; e.g., predicted protein structures are used to supplement experimentally derived structures, predictions of socioeconomic indicators from satellite imagery are used to supplement accurate survey data, and so on. Since predictions are imperfect and potentially biased, this practice brings into question the validity of downstream inferences. We introduce cross-prediction: a method for valid inference powered by machine learning. With a small labeled dataset and a large unlabeled dataset, cross-prediction imputes the missing labels via machine learning and applies a form of debiasing to remedy the prediction inaccuracies. The resulting inferences achieve the desired error probability and are more powerful than those that only leverage the labeled data. Closely related is the recent proposal of prediction-powered inference [A. N. Angelopoulos, S. Bates, C. Fannjiang, M. I. Jordan, T. Zrnic, Science 382, 669-674 (2023)], which assumes that a good pretrained model is already available. We show that cross-prediction is consistently more powerful than an adaptation of prediction-powered inference in which a fraction of the labeled data is split off and used to train the model. Finally, we observe that cross-prediction gives more stable conclusions than its competitors; its CIs typically have significantly lower variability.
View details for DOI 10.1073/pnas.2322083121
View details for PubMedID 38568975
-
In silico identification of putative causal genetic variants.
bioRxiv : the preprint server for biology
2024
Abstract
Understanding the causal genetic architecture of complex phenotypes is essential for future research into disease mechanisms and potential therapies. Despite the widespread availability of genome-wide data, existing methods to analyze genetic data still primarily focus on marginal association models, which fall short of fully capturing the polygenic nature of complex traits and elucidating biological causal mechanisms. Here we present a computationally efficient causal inference framework for genome-wide detection of putative causal variants underlying genetic associations. Our approach utilizes summary statistics from potentially overlapping studies as input, constructs in silico knockoff copies of summary statistics as negative controls to attenuate confounding effects induced by linkage disequilibrium, and employs efficient ultrahigh-dimensional sparse regression to jointly model all genetic variants across the genome. Our method is computationally efficient, requiring less than 15 minutes on a single CPU to analyze genome-wide summary statistics. In applications to a meta-analysis of ten large-scale genetic studies of Alzheimer's disease (AD) we identified 82 loci associated with AD, including 37 additional loci missed by conventional GWAS pipeline via marginal association testing. The identified putative causal variants achieve state-of-the-art agreement with massively parallel reporter assays and CRISPR-Cas9 experiments. Additionally, we applied the method to a retrospective analysis of large-scale genome-wide association studies (GWAS) summary statistics from 2013 to 2022. Results reveal the method's capacity to robustly discover additional loci for polygenic traits beyond conventional GWAS and pinpoint potential causal variants underpinning each locus (on average, 22.7% more loci and 78.7% fewer proxy variants), contributing to a deeper understanding of complex genetic architectures in post-GWAS analyses. 
We are making the discoveries and software freely available to the community and anticipate that routine end-to-end in silico identification of putative causal genetic variants will become an important tool that will facilitate downstream functional experiments and future research into disease etiology, as well as the exploration of novel therapeutic avenues.
View details for DOI 10.1101/2024.02.28.582621
View details for PubMedID 38464202
-
Controlled Variable Selection from Summary Statistics Only? A Solution via GhostKnockoffs and Penalized Regression.
ArXiv
2024
Abstract
Identifying which variables do influence a response while controlling false positives pervades statistics and data science. In this paper, we consider a scenario in which we only have access to summary statistics, such as the values of marginal empirical correlations between each dependent variable of potential interest and the response. This situation may arise due to privacy concerns, e.g., to avoid the release of sensitive genetic information. We extend GhostKnockoffs (He et al., 2022) and introduce variable selection methods based on penalized regression achieving false discovery rate (FDR) control. We report empirical results in extensive simulation studies, demonstrating enhanced performance over previous work. We also apply our methods to genome-wide association studies of Alzheimer's disease, and evidence a significant improvement in power.
View details for PubMedID 38463500
-
Statistical Inference for Fairness Auditing
JOURNAL OF MACHINE LEARNING RESEARCH
2024; 25
View details for Web of Science ID 001239373300001
-
Conformal Inference for Online Prediction with Arbitrary Distribution Shifts
JOURNAL OF MACHINE LEARNING RESEARCH
2024; 25: 1-36
View details for DOI 10.1007/s10994-016-5592-6
View details for Web of Science ID 001311336500001
-
Permutation Tests Using Arbitrary Permutation Distributions
SANKHYA-SERIES A-MATHEMATICAL STATISTICS AND PROBABILITY
2023; 85 (2): 1156-1177
View details for Web of Science ID 001157672100003
-
Permutation Tests Using Arbitrary Permutation Distributions
SANKHYA-SERIES A-MATHEMATICAL STATISTICS AND PROBABILITY
2023
View details for DOI 10.1007/s13171-023-00308-8
View details for Web of Science ID 000998743100001
-
KNOCKOFFS WITH SIDE INFORMATION
ANNALS OF APPLIED STATISTICS
2023; 17 (2): 1152-1174
View details for DOI 10.1214/22-AOAS1663
View details for Web of Science ID 000985804300011
-
A POWER ANALYSIS FOR MODEL-X KNOCKOFFS WITH ℓp-REGULARIZED STATISTICS
ANNALS OF STATISTICS
2023; 51 (3): 1005-1029
View details for DOI 10.1214/23-AOS2274
View details for Web of Science ID 001055382500003
-
CONFORMAL PREDICTION BEYOND EXCHANGEABILITY
ANNALS OF STATISTICS
2023; 51 (2): 816-845
View details for DOI 10.1214/23-AOS2276
View details for Web of Science ID 001022538200017
-
What Ron DeVore Means to Me
CONSTRUCTIVE APPROXIMATION
2023
View details for DOI 10.1007/s00365-023-09634-4
View details for Web of Science ID 000949052300001
-
Sensitivity analysis of individual treatment effects: A robust conformal inference approach.
Proceedings of the National Academy of Sciences of the United States of America
2023; 120 (6): e2214889120
Abstract
We propose a model-free framework for sensitivity analysis of individual treatment effects (ITEs), building upon ideas from conformal inference. For any unit, our procedure reports the Gamma-value, a number which quantifies the minimum strength of confounding needed to explain away the evidence for ITE. Our approach rests on the reliable predictive inference of counterfactuals and ITEs in situations where the training data are confounded. Under the marginal sensitivity model of [Z. Tan, J. Am. Stat. Assoc. 101, 1619-1637 (2006)], we characterize the shift between the distribution of the observations and that of the counterfactuals. We first develop a general method for predictive inference of test samples from a shifted distribution; we then leverage this to construct covariate-dependent prediction sets for counterfactuals. No matter the value of the shift, these prediction sets (resp. approximately) achieve marginal coverage if the propensity score is known exactly (resp. estimated). We describe a distinct procedure also attaining coverage, however, conditional on the training data. In the latter case, we prove a sharpness result showing that for certain classes of prediction problems, the prediction intervals cannot possibly be tightened. We verify the validity and performance of the methods via simulation studies and apply them to analyze real datasets.
View details for DOI 10.1073/pnas.2214889120
View details for PubMedID 36730196
-
Conformalized survival analysis
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
2023; 85 (1): 24-45
View details for DOI 10.1093/jrsssb/qkac004
View details for Web of Science ID 001076477000002
-
TESTING FOR OUTLIERS WITH CONFORMAL P-VALUES
ANNALS OF STATISTICS
2023; 51 (1): 149-178
View details for DOI 10.1214/22-AOS2244
View details for Web of Science ID 001020041400006
-
Tractable Evaluation of Stein's Unbiased Risk Estimate With Convex Regularizers
IEEE TRANSACTIONS ON SIGNAL PROCESSING
2023; 71: 4330-4341
View details for DOI 10.1109/TSP.2023.3323046
View details for Web of Science ID 001123968900002
-
Conformal PID Control for Time Series Prediction
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001220600004032
-
Uncertainty Quantification over Graph with Conformalized Graph Neural Networks
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2023
View details for Web of Science ID 001230083402001
-
A Discussion of "A Note on Universal Inference" by Tse and Davison
STAT
2023; 12 (1)
View details for DOI 10.1002/sta4.570
View details for Web of Science ID 000976052200001
-
Selection by Prediction with Conformal p-values
JOURNAL OF MACHINE LEARNING RESEARCH
2023; 24
View details for Web of Science ID 001111602900001
-
GhostKnockoff inference empowers identification of putative causal variants in genome-wide association studies.
Nature communications
2022; 13 (1): 7209
Abstract
Recent advances in genome sequencing and imputation technologies provide an exciting opportunity to comprehensively study the contribution of genetic variants to complex phenotypes. However, our ability to translate genetic discoveries into mechanistic insights remains limited at this point. In this paper, we propose an efficient knockoff-based method, GhostKnockoff, for genome-wide association studies (GWAS) that leads to improved power and ability to prioritize putative causal variants relative to conventional GWAS approaches. The method requires only Z-scores from conventional GWAS and hence can be easily applied to enhance existing and future studies. The method can also be applied to meta-analysis of multiple GWAS allowing for arbitrary sample overlap. We demonstrate its performance using empirical simulations and two applications: (1) a meta-analysis for Alzheimer's disease comprising nine overlapping large-scale GWAS, whole-exome and whole-genome sequencing studies and (2) analysis of 1403 binary phenotypes from the UK Biobank data in 408,961 samples of European ancestry. Our results demonstrate that GhostKnockoff can identify putatively functional variants with weaker statistical effects that are missed by conventional association tests.
View details for DOI 10.1038/s41467-022-34932-z
View details for PubMedID 36418338
-
The asymptotic distribution of the MLE in high-dimensional logistic models: Arbitrary covariance
BERNOULLI
2022; 28 (3): 1835-1861
View details for DOI 10.3150/21-BEJ1401
View details for Web of Science ID 000792361600011
-
Conformal inference of counterfactuals and individual treatment effects
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
2021
View details for DOI 10.1111/rssb.12445
View details for Web of Science ID 000704320200001
-
False discovery rate control in genome-wide association studies with population structure
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2021; 118 (40)
View details for DOI 10.1073/pnas.2105841118
View details for Web of Science ID 000705930300022
-
Derandomizing Knockoffs
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
2021
View details for DOI 10.1080/01621459.2021.1962720
View details for Web of Science ID 000695975800001
-
Interpretable Classification of Bacterial Raman Spectra with Knockoff Wavelets.
IEEE journal of biomedical and health informatics
2021; PP
Abstract
Deep neural networks and other machine learning models are widely applied to biomedical signal data because they can detect complex patterns and compute accurate predictions. However, the difficulty of interpreting such models is a limitation, especially for applications involving high-stakes decision, including the identification of bacterial infections. This paper considers fast Raman spectroscopy data and demonstrates that a logistic regression model with carefully selected features achieves accuracy comparable to that of neural networks, while being much simpler and more transparent. Our analysis leverages wavelet features with intuitive chemical interpretations, and performs controlled variable selection with knockoffs to ensure the predictors are relevant and non-redundant. Although we focus on a particular data set, the proposed approach is broadly applicable to other types of signal data for which interpretability may be important.
View details for DOI 10.1109/JBHI.2021.3094873
View details for PubMedID 34232897
-
The limits of distribution-free conditional predictive inference
INFORMATION AND INFERENCE-A JOURNAL OF THE IMA
2021; 10 (2): 455-482
View details for DOI 10.1093/imaiai/iaaa017
View details for Web of Science ID 000670949400003
-
PREDICTIVE INFERENCE WITH THE JACKKNIFE
ANNALS OF STATISTICS
2021; 49 (1): 486–507
View details for DOI 10.1214/20-AOS1965
View details for Web of Science ID 000614187400021
-
Distribution-free conditional median inference
ELECTRONIC JOURNAL OF STATISTICS
2021; 15 (2): 4625-4658
View details for DOI 10.1214/21-EJS1910
View details for Web of Science ID 000740666000020
-
False discovery rate control in genome-wide association studies with population structure.
Proceedings of the National Academy of Sciences of the United States of America
2021; 118 (40)
Abstract
We present a comprehensive statistical framework to analyze data from genome-wide association studies of polygenic traits, producing interpretable findings while controlling the false discovery rate. In contrast with standard approaches, our method can leverage sophisticated multivariate algorithms but makes no parametric assumptions about the unknown relation between genotypes and phenotype. Instead, we recognize that genotypes can be considered as a random sample from an appropriate model, encapsulating our knowledge of genetic inheritance and human populations. This allows the generation of imperfect copies (knockoffs) of these variables that serve as ideal negative controls, correcting for linkage disequilibrium and accounting for unknown population structure, which may be due to diverse ancestries or familial relatedness. The validity and effectiveness of our method are demonstrated by extensive simulations and by applications to the UK Biobank data. These analyses confirm our method is powerful relative to state-of-the-art alternatives, while comparisons with other studies validate most of our discoveries. Finally, fast software is made available for researchers to analyze Biobank-scale datasets.
View details for DOI 10.1073/pnas.2105841118
View details for PubMedID 34580220
-
Discussion of the Paper "Prediction, Estimation, and Attribution" by B. Efron
INTERNATIONAL STATISTICAL REVIEW
2020; 88: S60–S63
View details for DOI 10.1111/insr.12412
View details for Web of Science ID 000603161400004
-
ROBUST INFERENCE WITH KNOCKOFFS
ANNALS OF STATISTICS
2020; 48 (3): 1409–31
View details for DOI 10.1214/19-AOS1852
View details for Web of Science ID 000551644000008
-
Discussion of the Paper "Prediction, Estimation, and Attribution" by B. Efron
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
2020; 115 (530): 656–58
View details for DOI 10.1080/01621459.2020.1762618
View details for Web of Science ID 000538423300012
-
Metropolized Knockoff Sampling
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
2020
View details for DOI 10.1080/01621459.2020.1729163
View details for Web of Science ID 000520561600001
-
THE PHASE TRANSITION FOR THE EXISTENCE OF THE MAXIMUM LIKELIHOOD ESTIMATE IN HIGH-DIMENSIONAL LOGISTIC REGRESSION
ANNALS OF STATISTICS
2020; 48 (1): 27–42
View details for DOI 10.1214/18-AOS1789
View details for Web of Science ID 000514816000002
-
Multi-resolution localization of causal variants across the genome.
Nature communications
2020; 11 (1): 1093
Abstract
In the statistical analysis of genome-wide association data, it is challenging to precisely localize the variants that affect complex traits, due to linkage disequilibrium, and to maximize power while limiting spurious findings. Here we report on KnockoffZoom: a flexible method that localizes causal variants at multiple resolutions by testing the conditional associations of genetic segments of decreasing width, while provably controlling the false discovery rate. Our method utilizes artificial genotypes as negative controls and is equally valid for quantitative and binary phenotypes, without requiring any assumptions about their genetic architectures. Instead, we rely on well-established genetic models of linkage disequilibrium. We demonstrate that our method can detect more associations than mixed effects models and achieve fine-mapping precision, at comparable computational cost. Lastly, we apply KnockoffZoom to data from 350k subjects in the UK Biobank and report many new findings.
View details for DOI 10.1038/s41467-020-14791-2
View details for PubMedID 32107378
-
Achieving Equalized Odds by Resampling Sensitive Attributes
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2020
View details for Web of Science ID 000627697000017
-
Publisher Correction: Multi-resolution localization of causal variants across the genome.
Nature communications
2020; 11 (1): 1799
Abstract
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
View details for DOI 10.1038/s41467-020-15690-2
View details for PubMedID 32265451
-
A comparison of some conformal quantile regression methods
STAT
2020; 9 (1)
View details for DOI 10.1002/sta4.261
View details for Web of Science ID 000614806100006
-
Causal inference in genetic trio studies.
Proceedings of the National Academy of Sciences of the United States of America
2020
Abstract
We introduce a method to draw causal inferences (inferences immune to all possible confounding) from genetic data that include parents and offspring. Causal conclusions are possible with these data because the natural randomness in meiosis can be viewed as a high-dimensional randomized experiment. We make this observation actionable by developing a conditional independence test that identifies regions of the genome containing distinct causal variants. The proposed digital twin test compares an observed offspring to carefully constructed synthetic offspring from the same parents to determine statistical significance, and it can leverage any black-box multivariate model and additional nontrio genetic data to increase power. Crucially, our inferences are based only on a well-established mathematical model of recombination and make no assumptions about the relationship between the genotypes and phenotypes. We compare our method to the widely used transmission disequilibrium test and demonstrate enhanced power and localization.
View details for DOI 10.1073/pnas.2007743117
View details for PubMedID 32948695
-
Deep Knockoffs
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
2019
View details for DOI 10.1080/01621459.2019.1660174
View details for Web of Science ID 000490935600001
-
The likelihood ratio test in high-dimensional logistic regression is asymptotically a rescaled Chi-square
PROBABILITY THEORY AND RELATED FIELDS
2019; 175 (1-2): 487–558
View details for DOI 10.1007/s00440-018-00896-9
View details for Web of Science ID 000487281600012
-
A KNOCKOFF FILTER FOR HIGH-DIMENSIONAL SELECTIVE INFERENCE
ANNALS OF STATISTICS
2019; 47 (5): 2504–37
View details for DOI 10.1214/18-AOS1755
View details for Web of Science ID 000478686900004
-
Holographic phase retrieval and reference design
INVERSE PROBLEMS
2019; 35 (9)
View details for DOI 10.1088/1361-6420/ab23d1
View details for Web of Science ID 000482008200001
-
A modern maximum-likelihood theory for high-dimensional logistic regression.
Proceedings of the National Academy of Sciences of the United States of America
2019
Abstract
Students in statistics or data science usually learn early on that when the sample size n is large relative to the number of variables p, fitting a logistic model by the method of maximum likelihood produces estimates that are consistent and that there are well-known formulas that quantify the variability of these estimates which are used for the purpose of statistical inference. We are often told that these calculations are approximately valid if we have 5 to 10 observations per unknown parameter. This paper shows that this is far from the case, and consequently, inferences produced by common software packages are often unreliable. Consider a logistic model with independent features in which n and p become increasingly large in a fixed ratio. We prove that (i) the maximum-likelihood estimate (MLE) is biased, (ii) the variability of the MLE is far greater than classically estimated, and (iii) the likelihood-ratio test (LRT) is not distributed as a χ². The bias of the MLE yields wrong predictions for the probability of a case based on observed values of the covariates. We present a theory, which provides explicit expressions for the asymptotic bias and variance of the MLE and the asymptotic distribution of the LRT. We empirically demonstrate that these results are accurate in finite samples. Our results depend only on a single measure of signal strength, which leads to concrete proposals for obtaining accurate inference in finite samples through the estimate of this measure.
View details for DOI 10.1073/pnas.1810420116
View details for PubMedID 31262828
-
On the construction of knockoffs in case-control studies
STAT
2019; 8 (1)
View details for DOI 10.1002/sta4.225
View details for Web of Science ID 000506857900017
-
Dual-Reference Design for Holographic Phase Retrieval
IEEE. 2019
View details for Web of Science ID 000558176800023
-
Conformalized Quantile Regression
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
View details for Web of Science ID 000534424303052
-
Conformal Prediction Under Covariate Shift
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS). 2019
View details for Web of Science ID 000534424302052
-
The Projected Power Method: An Efficient Algorithm for Joint Alignment from Pairwise Differences
COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS
2018; 71 (8): 1648–1714
View details for Web of Science ID 000445204300004
-
Panning for gold: 'model-X' knockoffs for high dimensional controlled variable selection
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
2018; 80 (3): 551–77
View details for DOI 10.1111/rssb.12265
View details for Web of Science ID 000430673200005
-
FALSE DISCOVERIES OCCUR EARLY ON THE LASSO PATH
ANNALS OF STATISTICS
2017; 45 (5): 2133–50
View details for DOI 10.1214/16-AOS1521
View details for Web of Science ID 000416455300011
-
EigenPrism: inference for high dimensional signal-to-noise ratios
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
2017; 79 (4): 1037–65
Abstract
Consider the following three important problems in statistical inference, namely, constructing confidence intervals for (1) the error of a high-dimensional (p > n) regression estimator, (2) the linear regression noise level, and (3) the genetic signal-to-noise ratio of a continuous-valued trait (related to the heritability). All three problems turn out to be closely related to the little-studied problem of performing inference on the ℓ2-norm of the signal in high-dimensional linear regression. We derive a novel procedure for this, which is asymptotically correct when the covariates are multivariate Gaussian and produces valid confidence intervals in finite samples as well. The procedure, called EigenPrism, is computationally fast and makes no assumptions on coefficient sparsity or knowledge of the noise level. We investigate the width of the EigenPrism confidence intervals, including a comparison with a Bayesian setting in which our interval is just 5% wider than the Bayes credible interval. We are then able to unify the three aforementioned problems by showing that the EigenPrism procedure with only minor modifications is able to make important contributions to all three. We also investigate the robustness of coverage and find that the method applies in practice and in finite samples much more widely than just the case of multivariate Gaussian covariates. Finally, we apply EigenPrism to a genetic dataset to estimate the genetic signal-to-noise ratio for a number of continuous phenotypes.
View details for DOI 10.1111/rssb.12203
View details for Web of Science ID 000411712300002
View details for PubMedID 29104447
View details for PubMedCentralID PMC5663223
-
Solving Random Quadratic Systems of Equations Is Nearly as Easy as Solving Linear Systems
COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS
2017; 70 (5): 822-883
View details for DOI 10.1002/cpa.21638
View details for Web of Science ID 000398158300002
-
Controlling the Rate of GWAS False Discoveries
GENETICS
2017; 205 (1): 61-75
Abstract
With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated to multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure on standard GWAS format data, and illustrate its performance on lipid traits in the North Finland Birth Cohort 66 cohort study.
View details for DOI 10.1534/genetics.116.193987
View details for Web of Science ID 000393677300004
View details for PubMedID 27784720
View details for PubMedCentralID PMC5223524
-
SLOPE IS ADAPTIVE TO UNKNOWN SPARSITY AND ASYMPTOTICALLY MINIMAX
ANNALS OF STATISTICS
2016; 44 (3): 1038-1068
View details for DOI 10.1214/15-AOS1397
View details for Web of Science ID 000375175200006
-
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
JOURNAL OF MACHINE LEARNING RESEARCH
2016; 17
View details for Web of Science ID 000391664200001
-
Super-Resolution of Positive Sources: The Discrete Setup
SIAM JOURNAL ON IMAGING SCIENCES
2016; 9 (1): 412-444
View details for DOI 10.1137/15M1016552
View details for Web of Science ID 000373629500015
-
SLOPE-ADAPTIVE VARIABLE SELECTION VIA CONVEX OPTIMIZATION.
The annals of applied statistics
2015; 9 (3): 1103-1140
Abstract
We introduce a new estimator for the vector of coefficients β in the linear model y = Xβ + z, where X has dimensions n × p with p possibly larger than n. SLOPE, short for Sorted L-One Penalized Estimation, is the solution to min over b ∈ ℝ^p of ½‖y − Xb‖² + λ1|b|(1) + λ2|b|(2) + … + λp|b|(p), where λ1 ≥ λ2 ≥ … ≥ λp ≥ 0 and |b|(1) ≥ |b|(2) ≥ … ≥ |b|(p) are the decreasing absolute values of the entries of b. This is a convex program and we demonstrate a solution algorithm whose computational complexity is roughly comparable to that of classical ℓ1 procedures such as the Lasso. Here, the regularizer is a sorted ℓ1 norm, which penalizes the regression coefficients according to their rank: the higher the rank (that is, the stronger the signal), the larger the penalty. This is similar to the Benjamini and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH), which compares more significant p-values with more stringent thresholds. One notable choice of the sequence {λi} is given by the BH critical values λBH(i) = z(1 − i · q/(2p)), where q ∈ (0, 1) and z(α) is the quantile of a standard normal distribution. SLOPE aims to provide finite sample guarantees on the selected model; of special interest is the false discovery rate (FDR), defined as the expected proportion of irrelevant regressors among all selected predictors. Under orthogonal designs, SLOPE with λBH provably controls FDR at level q. Moreover, it also appears to have appreciable inferential properties under more general designs X while having substantial power, as demonstrated in a series of experiments running on both simulated and real data.
View details for DOI 10.1214/15-AOAS842
View details for PubMedID 26709357
View details for PubMedCentralID PMC4689150
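The sorted ℓ1 penalty described in this abstract can be illustrated with a short Python sketch. It assumes the objective ½‖y − Xb‖² + Σ λi|b|(i) and the BH-type sequence λi = z(1 − i·q/(2p)); the function names `bh_lambdas`, `sorted_l1_norm`, and `slope_objective` are hypothetical and chosen for illustration only. The sketch evaluates the objective and does not solve the SLOPE program, and it is not the authors' R implementation.

```python
from statistics import NormalDist

import numpy as np


def bh_lambdas(p, q):
    """BH-type critical values lambda_i = z(1 - i*q/(2p)) for i = 1..p (assumed form)."""
    nd = NormalDist()  # standard normal distribution
    return np.array([nd.inv_cdf(1 - i * q / (2 * p)) for i in range(1, p + 1)])


def sorted_l1_norm(b, lam):
    """Sum over i of lam[i] times the i-th largest absolute entry of b."""
    abs_sorted = np.sort(np.abs(b))[::-1]  # decreasing absolute values
    return float(np.dot(lam, abs_sorted))


def slope_objective(b, X, y, lam):
    """Least-squares fit term plus the sorted l1 penalty."""
    resid = y - X @ b
    return 0.5 * float(resid @ resid) + sorted_l1_norm(b, lam)
```

Because the λ sequence is nonincreasing, the largest coefficients in absolute value are paired with the largest penalties, which mirrors the way BH compares the most significant p-values with the most stringent thresholds.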
-
CONTROLLING THE FALSE DISCOVERY RATE VIA KNOCKOFFS
ANNALS OF STATISTICS
2015; 43 (5): 2055-2085
View details for DOI 10.1214/15-AOS1337
View details for Web of Science ID 000362697700007
-
Phase retrieval from coded diffraction patterns
APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS
2015; 39 (2): 277-299
View details for DOI 10.1016/j.acha.2014.09.004
View details for Web of Science ID 000356108300004
-
SLOPE-ADAPTIVE VARIABLE SELECTION VIA CONVEX OPTIMIZATION
ANNALS OF APPLIED STATISTICS
2015; 9 (3): 1103-1140
View details for DOI 10.1214/15-AOAS842
View details for Web of Science ID 000364340100001
-
Adaptive Restart for Accelerated Gradient Schemes
FOUNDATIONS OF COMPUTATIONAL MATHEMATICS
2015; 15 (3): 715-732
View details for DOI 10.1007/s10208-013-9150-3
View details for Web of Science ID 000354710400004
-
Randomized Algorithms for Low-Rank Matrix Factorizations: Sharp Performance Bounds
ALGORITHMICA
2015; 72 (1): 264-281
View details for DOI 10.1007/s00453-014-9891-7
View details for Web of Science ID 000352434300012
-
Phase Retrieval via Wirtinger Flow: Theory and Algorithms
IEEE TRANSACTIONS ON INFORMATION THEORY
2015; 61 (4): 1985-2007
View details for DOI 10.1109/TIT.2015.2399924
View details for Web of Science ID 000351470800031
-
Low-Rank Plus Sparse Matrix Decomposition for Accelerated Dynamic MRI with Separation of Background and Dynamic Components
MAGNETIC RESONANCE IN MEDICINE
2015; 73 (3): 1125-1136
Abstract
To apply the low-rank plus sparse (L+S) matrix decomposition model to reconstruct undersampled dynamic MRI as a superposition of background and dynamic components in various problems of clinical interest. The L+S model is natural to represent dynamic MRI data. Incoherence between k-t space (acquisition) and the singular vectors of L and the sparse domain of S is required to reconstruct undersampled data. Incoherence between L and S is required for robust separation of background and dynamic components. Multicoil L+S reconstruction is formulated using a convex optimization approach, where the nuclear norm is used to enforce low rank in L and the l1 norm is used to enforce sparsity in S. Feasibility of the L+S reconstruction was tested in several dynamic MRI experiments with true acceleration, including cardiac perfusion, cardiac cine, time-resolved angiography, and abdominal and breast perfusion using Cartesian and radial sampling. The L+S model increased compressibility of dynamic MRI data and thus enabled high-acceleration factors. The inherent background separation improved background suppression performance compared to conventional data subtraction, which is sensitive to motion. The high acceleration and background separation enabled by L+S promise to enhance spatial and temporal resolution and to enable background suppression without the need of subtraction or modeling.
View details for DOI 10.1002/mrm.25240
View details for Web of Science ID 000350279900025
View details for PubMedID 24760724
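The convex L+S formulation in this abstract (nuclear norm on L, l1 norm on S) can be sketched on a toy problem. The Python example below assumes a fully sampled, real-valued matrix M and the objective ½‖M − L − S‖² + λL‖L‖* + λS‖S‖1, minimized by alternating the two proximal steps. The names `svt`, `soft_threshold`, `objective`, and `l_plus_s` are hypothetical, and the multicoil encoding operator and undersampling used in the paper are deliberately left out.

```python
import numpy as np


def soft_threshold(A, t):
    """Entrywise soft-thresholding: proximal operator of t times the l1 norm."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)


def svt(A, t):
    """Singular value thresholding: proximal operator of t times the nuclear norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt  # shrink each singular value by t


def objective(M, L, S, lam_L, lam_S):
    """Data fit plus nuclear-norm and l1 penalties (toy setting, assumed form)."""
    fit = 0.5 * np.linalg.norm(M - L - S, "fro") ** 2
    return fit + lam_L * np.linalg.norm(L, "nuc") + lam_S * np.abs(S).sum()


def l_plus_s(M, lam_L, lam_S, n_iter=50):
    """Alternate exact minimization over L (low rank) and S (sparse)."""
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        L = svt(M - S, lam_L)             # update the low-rank background
        S = soft_threshold(M - L, lam_S)  # update the sparse dynamic part
    return L, S
```

Each update minimizes the objective exactly over one block with the other held fixed, so the objective value never increases from one iteration to the next.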
-
Phase Retrieval via Matrix Completion
SIAM REVIEW
2015; 57 (2): 225-251
View details for DOI 10.1137/151005099
View details for Web of Science ID 000354985600003
-
Solving Quadratic Equations via PhaseLift When There Are About as Many Equations as Unknowns
FOUNDATIONS OF COMPUTATIONAL MATHEMATICS
2014; 14 (5): 1017-1026
View details for DOI 10.1007/s10208-013-9162-z
View details for Web of Science ID 000342283800005
-
Towards a Mathematical Theory of Super-resolution
COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS
2014; 67 (6): 906-956
View details for DOI 10.1002/cpa.21455
View details for Web of Science ID 000333662800002
-
ROBUST SUBSPACE CLUSTERING
ANNALS OF STATISTICS
2014; 42 (2): 669-699
View details for DOI 10.1214/13-AOS1199
View details for Web of Science ID 000336888400014
-
Super-Resolution from Noisy Data
JOURNAL OF FOURIER ANALYSIS AND APPLICATIONS
2013; 19 (6): 1229-1254
View details for DOI 10.1007/s00041-013-9292-3
View details for Web of Science ID 000328207800005
-
Unbiased Risk Estimates for Singular Value Thresholding and Spectral Estimators
IEEE TRANSACTIONS ON SIGNAL PROCESSING
2013; 61 (19): 4643-4657
View details for DOI 10.1109/TSP.2013.2270464
View details for Web of Science ID 000324342900001
-
Simple bounds for recovering low-complexity models
MATHEMATICAL PROGRAMMING
2013; 141 (1-2): 577-589
View details for DOI 10.1007/s10107-012-0540-0
View details for Web of Science ID 000324232100024
-
PhaseLift: Exact and Stable Signal Recovery from Magnitude Measurements via Convex Programming
COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS
2013; 66 (8): 1241-1274
View details for DOI 10.1002/cpa.21432
View details for Web of Science ID 000319617000003
-
Single-photon sampling architecture for solid-state imaging sensors
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2013; 110 (30): E2752-E2761
Abstract
Advances in solid-state technology have enabled the development of silicon photomultiplier sensor arrays capable of sensing individual photons. Combined with high-frequency time-to-digital converters (TDCs), this technology opens up the prospect of sensors capable of recording with high accuracy both the time and location of each detected photon. Such a capability could lead to significant improvements in imaging accuracy, especially for applications operating with low photon fluxes such as light detection and ranging and positron-emission tomography. The demands placed on on-chip readout circuitry impose stringent trade-offs between fill factor and spatiotemporal resolution, causing many contemporary designs to severely underuse the technology's full potential. Concentrating on the low photon flux setting, this paper leverages results from group testing and proposes an architecture for a highly efficient readout of pixels using only a small number of TDCs. We provide optimized design instances for various sensor parameters and compute explicit upper and lower bounds on the number of TDCs required to uniquely decode a given maximum number of simultaneous photon arrivals. To illustrate the strength of the proposed architecture, we note a typical digitization of a 60 × 60 photodiode sensor using only 142 TDCs. The design guarantees registration and unique recovery of up to four simultaneous photon arrivals using a fast decoding algorithm. By contrast, a cross-strip design requires 120 TDCs and cannot uniquely decode any simultaneous photon arrivals. Among other realistic simulations of scintillation events in clinical positron-emission tomography, the above design is shown to recover the spatiotemporal location of 99.98% of all detected photons.
View details for DOI 10.1073/pnas.1216318110
View details for Web of Science ID 000322112300005
View details for PubMedID 23836643
-
Improving IMRT delivery efficiency with reweighted L1-minimization for inverse planning
MEDICAL PHYSICS
2013; 40 (7)
Abstract
This study presents an improved technique to further simplify the fluence-map in intensity modulated radiation therapy (IMRT) inverse planning, thereby reducing plan complexity and improving delivery efficiency, while maintaining the plan quality. First-order total-variation (TV) minimization (min.) based on L1-norm has been proposed to reduce the complexity of fluence-map in IMRT by generating sparse fluence-map variations. However, with stronger dose sparing to the critical structures, the inevitable increase in the fluence-map complexity can lead to inefficient dose delivery. Theoretically, L0-min. is the ideal solution for the sparse signal recovery problem, yet practically intractable due to the nonconvexity of its objective function. As an alternative, the authors use the iteratively reweighted L1-min. technique to incorporate the benefits of the L0-norm into the tractability of L1-min. The weight applied to each element is inversely related to the magnitude of the corresponding element, which is iteratively updated by the reweighting process. The proposed penalizing process combined with TV min. further improves sparsity in the fluence-map variations, hence ultimately enhancing the delivery efficiency. To validate the proposed method, this work compares three treatment plans obtained from quadratic min. (generally used in clinical IMRT), conventional TV min., and our proposed reweighted TV min. techniques, implemented by a large-scale L1-solver (template for first-order conic solver), on clinical data from five patients. Criteria such as conformation number (CN), modulation index (MI), and estimated treatment time are employed to assess the relationship between the plan quality and delivery efficiency. The proposed method yields simpler fluence-maps than the quadratic and conventional TV based techniques. To attain a given CN and dose sparing to the critical organs for 5 clinical cases, the proposed method reduces the number of segments by 10-15 and 30-35, relative to TV min. and quadratic min. based plans, while MIs decrease by about 20%-30% and 40%-60% over the plans by two existing techniques, respectively. With such conditions, the total treatment time of the plans obtained from our proposed method can be reduced by 12-30 s and 30-80 s mainly due to much shorter multileaf collimator (MLC) traveling time in IMRT step-and-shoot delivery. The reweighted L1-minimization technique provides a promising solution to simplify the fluence-map variations in IMRT inverse planning. It improves the delivery efficiency by reducing the total number of segments and treatment time, while maintaining the plan quality in terms of target conformity and critical structure sparing.
View details for DOI 10.1118/1.4811100
View details for PubMedID 23822423
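The reweighting idea in this abstract, where each weight is inversely related to the magnitude of the corresponding element and is updated iteratively, can be illustrated on a toy problem. The Python sketch below assumes a simple denoising objective ½‖x − y‖² + λ Σ wi|xi| with weights wi = 1/(|xi| + ε), which has a closed-form solution at each pass. The names `soft_threshold` and `reweighted_l1_denoise` and the parameter `eps` are hypothetical. This is not the fluence-map optimization from the paper, which combines reweighting with TV minimization and a large-scale solver.

```python
import numpy as np


def soft_threshold(v, t):
    """Entrywise soft-thresholding with (possibly per-entry) threshold t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def reweighted_l1_denoise(y, lam, eps=0.1, n_iter=5):
    """Iteratively reweighted l1 shrinkage of a noisy vector y (toy setting)."""
    x = soft_threshold(y, lam)  # first pass: plain l1, all weights equal to 1
    for _ in range(n_iter):
        w = 1.0 / (np.abs(x) + eps)      # small entries get large weights
        x = soft_threshold(y, lam * w)   # weighted l1 step
    return x
```

For example, calling `reweighted_l1_denoise(np.array([5.0, 0.3, -4.0, 0.1]), lam=0.5)` keeps the two small entries at zero while shrinking the two large entries less than a single plain l1 pass would, which is the sense in which reweighting moves the l1 penalty closer to an L0-style count of nonzeros.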
-
How well can we estimate a sparse vector?
APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS
2013; 34 (2): 317-323
View details for DOI 10.1016/j.acha.2012.08.010
View details for Web of Science ID 000314322300009
-
On the Fundamental Limits of Adaptive Sensing
IEEE TRANSACTIONS ON INFORMATION THEORY
2013; 59 (1): 472-481
View details for DOI 10.1109/TIT.2012.2215837
View details for Web of Science ID 000312896600028
-
Super-resolution via Transform-invariant Group-sparse Regularization
IEEE International Conference on Computer Vision (ICCV)
IEEE. 2013: 3336–3343
View details for DOI 10.1109/ICCV.2013.414
View details for Web of Science ID 000351830500417
-
Phase Retrieval via Matrix Completion
SIAM JOURNAL ON IMAGING SCIENCES
2013; 6 (1): 199-225
View details for DOI 10.1137/110848074
View details for Web of Science ID 000326032900008
-
A Nonuniform Sampler for Wideband Spectrally-Sparse Environments
IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS
2012; 2 (3): 516-529
View details for DOI 10.1109/JETCAS.2012.2214635
View details for Web of Science ID 000208972900018
-
A Compressed Sensing Parameter Extraction Platform for Radar Pulse Signal Acquisition
IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS
2012; 2 (3): 626-638
View details for DOI 10.1109/JETCAS.2012.2214634
View details for Web of Science ID 000208972900027
-
DISCUSSION: LATENT VARIABLE GRAPHICAL MODEL SELECTION VIA CONVEX OPTIMIZATION
ANNALS OF STATISTICS
2012; 40 (4): 1997-2004
View details for DOI 10.1214/12-AOS1001
View details for Web of Science ID 000312899000007
-
A GEOMETRIC ANALYSIS OF SUBSPACE CLUSTERING WITH OUTLIERS
ANNALS OF STATISTICS
2012; 40 (4): 2195-2238
View details for DOI 10.1214/12-AOS1034
View details for Web of Science ID 000321842400003
-
Dose optimization with first-order total-variation minimization for dense angularly sampled and sparse intensity modulated radiation therapy (DASSIM-RT)
MEDICAL PHYSICS
2012; 39 (7): 4316-4327
Abstract
A new treatment scheme coined as dense angularly sampled and sparse intensity modulated radiation therapy (DASSIM-RT) has recently been proposed to bridge the gap between IMRT and VMAT. By increasing the angular sampling of radiation beams while eliminating dispensable segments of the incident fields, DASSIM-RT is capable of providing improved conformity in dose distributions while maintaining high delivery efficiency. The fact that DASSIM-RT utilizes a large number of incident beams represents a major computational challenge for the clinical applications of this powerful treatment scheme. The purpose of this work is to provide a practical solution to the DASSIM-RT inverse planning problem. The inverse planning problem is formulated as a fluence-map optimization problem with total-variation (TV) minimization. A newly released L1-solver, template for first-order conic solver (TFOCS), was adopted in this work. TFOCS achieves faster convergence with less memory usage as compared with conventional quadratic programming (QP) for the TV form through the effective use of conic forms, dual-variable updates, and optimal first-order approaches. As such, it is tailored to specifically address the computational challenges of large-scale optimization in DASSIM-RT inverse planning. Two clinical cases (a prostate and a head and neck case) are used to evaluate the effectiveness and efficiency of the proposed planning technique. DASSIM-RT plans with 15 and 30 beams are compared with conventional IMRT plans with 7 beams in terms of plan quality and delivery efficiency, which are quantified by conformation number (CN), the total number of segments and modulation index, respectively. For optimization efficiency, the QP-based approach was compared with the proposed algorithm for the DASSIM-RT plans with 15 beams for both cases. Plan quality improves with an increasing number of incident beams, while the total number of segments is maintained to be about the same in both cases. For the prostate patient, the conformation number to the target was 0.7509, 0.7565, and 0.7611 with 80 segments for IMRT with 7 beams, and DASSIM-RT with 15 and 30 beams, respectively. For the head and neck (HN) patient with a complicated target shape, conformation numbers of the three treatment plans were 0.7554, 0.7758, and 0.7819 with 75 segments for all beam configurations. With respect to the dose sparing to the critical structures, the organs such as the femoral heads in the prostate case and the brainstem and spinal cord in the HN case were better protected with DASSIM-RT. For both cases, the delivery efficiency has been greatly improved as the beam angular sampling increases, with similar or better conformal dose distribution. Compared with conventional quadratic programming approaches, first-order TFOCS-based optimization achieves far faster convergence and smaller memory requirements in DASSIM-RT. The new optimization algorithm TFOCS provides a practical and timely solution to the DASSIM-RT or other inverse planning problems requiring large memory space. The new treatment scheme is shown to outperform conventional IMRT in terms of dose conformity to both the target and the critical structures, while maintaining high delivery efficiency.
View details for DOI 10.1118/1.4729717
View details for Web of Science ID 000306893000029
View details for PubMedID 22830765
-
Compressive fluorescence microscopy for biological and hyperspectral imaging
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2012; 109 (26): E1679-E1687
Abstract
The mathematical theory of compressed sensing (CS) asserts that one can acquire signals from measurements whose rate is much lower than the total bandwidth. Whereas the CS theory is now well developed, challenges concerning hardware implementations of CS-based acquisition devices, especially in optics, have only started being addressed. This paper presents an implementation of compressive sensing in fluorescence microscopy and its applications to biomedical imaging. Our CS microscope combines a dynamic structured wide-field illumination and a fast and sensitive single-point fluorescence detection to enable reconstructions of images of fluorescent beads, cells, and tissues with undersampling ratios (between the number of pixels and number of measurements) up to 32. We further demonstrate a hyperspectral mode and record images with 128 spectral channels and undersampling ratios up to 64, illustrating the potential benefits of CS acquisition for higher-dimensional signals, which typically exhibit extreme redundancy. Altogether, our results emphasize the interest of CS schemes for acquisition at a significantly reduced rate and point to some remaining challenges for CS fluorescence microscopy.
View details for DOI 10.1073/pnas.1119511109
View details for Web of Science ID 000306291400004
View details for PubMedID 22689950
-
Exact Matrix Completion via Convex Optimization
COMMUNICATIONS OF THE ACM
2012; 55 (6): 111-119
View details for DOI 10.1145/2184319.2184343
View details for Web of Science ID 000304442000030
-
A Probabilistic and RIPless Theory of Compressed Sensing
IEEE TRANSACTIONS ON INFORMATION THEORY
2011; 57 (11): 7235-7254
View details for DOI 10.1109/TIT.2011.2161794
View details for Web of Science ID 000297046100002
-
GLOBAL TESTING UNDER SPARSE ALTERNATIVES: ANOVA, MULTIPLE COMPARISONS AND THE HIGHER CRITICISM
ANNALS OF STATISTICS
2011; 39 (5): 2533-2556
View details for DOI 10.1214/11-AOS910
View details for Web of Science ID 000299186500013
-
Compressed sensing with coherent and redundant dictionaries
APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS
2011; 31 (1): 59-73
View details for DOI 10.1016/j.acha.2010.10.002
View details for Web of Science ID 000290831300004
-
Robust Principal Component Analysis?
JOURNAL OF THE ACM
2011; 58 (3)
View details for DOI 10.1145/1970392.1970395
View details for Web of Science ID 000291246000003
-
Tight Oracle Inequalities for Low-Rank Matrix Recovery From a Minimal Number of Noisy Random Measurements
IEEE TRANSACTIONS ON INFORMATION THEORY
2011; 57 (4): 2342-2359
View details for DOI 10.1109/TIT.2011.2111771
View details for Web of Science ID 000288459100038
-
DETECTION OF AN ANOMALOUS CLUSTER IN A NETWORK
ANNALS OF STATISTICS
2011; 39 (1): 278-304
View details for DOI 10.1214/10-AOS839
View details for Web of Science ID 000288183800009
-
NESTA: A Fast and Accurate First-Order Method for Sparse Recovery
SIAM JOURNAL ON IMAGING SCIENCES
2011; 4 (1): 1-39
View details for DOI 10.1137/090756855
View details for Web of Science ID 000288991200001
-
Matrix Completion With Noise
PROCEEDINGS OF THE IEEE
2010; 98 (6): 925-936
View details for DOI 10.1109/JPROC.2009.2035722
View details for Web of Science ID 000277884900005
-
The Power of Convex Relaxation: Near-Optimal Matrix Completion
IEEE TRANSACTIONS ON INFORMATION THEORY
2010; 56 (5): 2053-2080
View details for DOI 10.1109/TIT.2010.2044061
View details for Web of Science ID 000278067900001
-
Compressed Sensing With Quantized Measurements
IEEE SIGNAL PROCESSING LETTERS
2010; 17 (2): 149-152
View details for DOI 10.1109/LSP.2009.2035667
View details for Web of Science ID 000272046700001
-
The power of convex relaxation: the surprising stories of matrix completion and compressed sensing
21st Annual ACM/SIAM Symposium on Discrete Algorithms
SIAM. 2010: 1321–1321
View details for Web of Science ID 000280699900106
-
A SINGULAR VALUE THRESHOLDING ALGORITHM FOR MATRIX COMPLETION
SIAM JOURNAL ON OPTIMIZATION
2010; 20 (4): 1956-1982
View details for DOI 10.1137/080738970
View details for Web of Science ID 000277836700014
-
Exact Matrix Completion via Convex Optimization
FOUNDATIONS OF COMPUTATIONAL MATHEMATICS
2009; 9 (6): 717-772
View details for DOI 10.1007/s10208-009-9045-5
View details for Web of Science ID 000272299900003
-
NEAR-IDEAL MODEL SELECTION BY l(1) MINIMIZATION
ANNALS OF STATISTICS
2009; 37 (5A): 2145-2177
View details for DOI 10.1214/08-AOS653
View details for Web of Science ID 000268604900003
-
Accurate low-rank matrix recovery from a small number of linear measurements
47th Annual Allerton Conference on Communication, Control, and Computing
IEEE. 2009: 1223–1230
View details for Web of Science ID 000279627100167
-
A FAST BUTTERFLY ALGORITHM FOR THE COMPUTATION OF FOURIER INTEGRAL OPERATORS
MULTISCALE MODELING & SIMULATION
2009; 7 (4): 1727-1750
View details for DOI 10.1137/080734339
View details for Web of Science ID 000270192800009
-
Enhancing Sparsity by Reweighted l(1) Minimization
4th IEEE International Symposium on Biomedical Imaging
SPRINGER. 2008: 877–905
View details for DOI 10.1007/s00041-008-9045-x
View details for Web of Science ID 000261411300013
-
Gravitational wave detection using multiscale chirplets
CLASSICAL AND QUANTUM GRAVITY
2008; 25 (18)
View details for DOI 10.1088/0264-9381/25/18/184020
View details for Web of Science ID 000258916700021
-
Searching for a trail of evidence in a maze
ANNALS OF STATISTICS
2008; 36 (4): 1726-1757
View details for DOI 10.1214/07-AOS526
View details for Web of Science ID 000258243000012
-
Highly robust error correction by convex programming
IEEE TRANSACTIONS ON INFORMATION THEORY
2008; 54 (7): 2829-2840
View details for DOI 10.1109/TIT.2008.924688
View details for Web of Science ID 000257111500001
-
An introduction to compressive sampling
IEEE SIGNAL PROCESSING MAGAZINE
2008; 25 (2): 21-30
View details for Web of Science ID 000254471100005
-
Exact Low-rank Matrix Completion via Convex Optimization
2008 46TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING, VOLS 1-3
2008: 806-812
View details for Web of Science ID 000268229600114
-
Compressed Sensing and Robust Recovery of Low Rank Matrices
2008 42ND ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, VOLS 1-4
2008: 1043-?
View details for Web of Science ID 000274551000198
-
Detecting highly oscillatory signals by chirplet path pursuit
APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS
2008; 24 (1): 14-40
View details for DOI 10.1016/j.acha.2007.04.003
View details for Web of Science ID 000253341800002
-
The Dantzig selector: Statistical estimation when p is much larger than n
ANNALS OF STATISTICS
2007; 35 (6): 2313-2351
View details for DOI 10.1214/009053606000001523
View details for Web of Science ID 000253077800001
-
Errata for quantitative robust uncertainty principles and optimally sparse decompositions
FOUNDATIONS OF COMPUTATIONAL MATHEMATICS
2007; 7 (4): 529-531
View details for DOI 10.1007/s10208-007-7162-6
View details for Web of Science ID 000249873100006
-
Sparsity and incoherence in compressive sampling
INVERSE PROBLEMS
2007; 23 (3): 969-985
View details for DOI 10.1088/0266-5611/23/3/008
View details for Web of Science ID 000246789100008
-
Fast computation of Fourier integral operators
SIAM JOURNAL ON SCIENTIFIC COMPUTING
2007; 29 (6): 2464-2493
View details for Web of Science ID 000251175000011
-
Sparse signal and image recovery from Compressive Samples
2007 4TH IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING : MACRO TO NANO, VOLS 1-3
2007: 976-979
View details for Web of Science ID 000252957300245
-
The phase flow method
JOURNAL OF COMPUTATIONAL PHYSICS
2006; 220 (1): 184-215
View details for DOI 10.1016/j.jcp.2006.05.008
View details for Web of Science ID 000243079600011
-
Fast geodesics computation with the phase flow method
JOURNAL OF COMPUTATIONAL PHYSICS
2006; 220 (1): 6-18
View details for DOI 10.1016/j.jcp.2006.07.032
View details for Web of Science ID 000243079600002
-
Near-optimal signal recovery from random projections: Universal encoding strategies?
IEEE TRANSACTIONS ON INFORMATION THEORY
2006; 52 (12): 5406-5425
View details for DOI 10.1109/TIT.2006.885507
View details for Web of Science ID 000242503300015
-
Stable signal recovery from incomplete and inaccurate measurements
COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS
2006; 59 (8): 1207-1223
View details for Web of Science ID 000238380500002
-
Quantitative robust uncertainty principles and optimally sparse decompositions
2nd International Conference on Computational Harmonic Analysis
SPRINGER. 2006: 227–54
View details for DOI 10.1007/s10208-004-0162-x
View details for Web of Science ID 000238289600004
-
Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information
IEEE TRANSACTIONS ON INFORMATION THEORY
2006; 52 (2): 489-509
View details for DOI 10.1109/TIT.2005.862083
View details for Web of Science ID 000234944700009
-
Fast discrete curvelet transforms
MULTISCALE MODELING & SIMULATION
2006; 5 (3): 861-899
View details for DOI 10.1137/05064182X
View details for Web of Science ID 000242572200007
-
Robust signal recovery from incomplete observations
2006 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP 2006, PROCEEDINGS
2006: 1281-1284
View details for Web of Science ID 000245768500321
-
Encoding the l(p) ball from limited measurements
DCC 2006: DATA COMPRESSION CONFERENCE, PROCEEDINGS
2006: 33-42
View details for Web of Science ID 000236995300004
-
Decoding by linear programming
IEEE TRANSACTIONS ON INFORMATION THEORY
2005; 51 (12): 4203-4215
View details for DOI 10.1109/TIT.2005.858979
View details for Web of Science ID 000233621500010
-
The curvelet representation of wave propagators is optimally sparse
COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS
2005; 58 (11): 1472-1528
View details for Web of Science ID 000232147200002
-
Continuous Curvelet Transform - II. Discretization and frames
APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS
2005; 19 (2): 198-222
View details for DOI 10.1016/j.acha.2005.02.004
View details for Web of Science ID 000231764700003
-
Continuous Curvelet Transform - I. Resolution of the wavefront set
APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS
2005; 19 (2): 162-197
View details for DOI 10.1016/j.acha.2005.02.003
View details for Web of Science ID 000231764700002
-
Signal recovery from random projections
COMPUTATIONAL IMAGING III
2005; 5674: 76-86
View details for Web of Science ID 000228796600008
-
Error correction via linear programming
46TH ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS
2005: 295-308
View details for Web of Science ID 000234538200027
-
New tight frames of curvelets and optimal representations of objects with piecewise C^2 singularities
COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS
2004; 57 (2): 219-266
View details for Web of Science ID 000187244700002
-
Ridgelets: Estimating with ridge functions
ANNALS OF STATISTICS
2003; 31 (5): 1561-1599
View details for Web of Science ID 000186185200008
-
Gray and color image contrast enhancement by the curvelet transform
IEEE TRANSACTIONS ON IMAGE PROCESSING
2003; 12 (6): 706-717
Abstract
We present in this paper a new method for contrast enhancement based on the curvelet transform. The curvelet transform represents edges better than wavelets, and is therefore well-suited for multiscale edge enhancement. We compare this approach with enhancement based on the wavelet transform, and the Multiscale Retinex. In a range of examples, we use edge detection and segmentation, among other processing applications, to provide for quantitative comparative evaluation. Our findings are that curvelet based enhancement out-performs other enhancement methods on noisy images, but on noiseless or near noiseless images curvelet based enhancement is not remarkably better than wavelet based enhancement.
View details for DOI 10.1109/TIP.2003.813140
View details for Web of Science ID 000183824600011
View details for PubMedID 18237946
-
Curvelets and Fourier integral operators
COMPTES RENDUS MATHEMATIQUE
2003; 336 (5): 395-398
View details for DOI 10.1016/S1631-073X(03)00095-5
View details for Web of Science ID 000183061700006
-
Astronomical image representation by the curvelet transform
ASTRONOMY & ASTROPHYSICS
2003; 398 (2): 785-800
View details for DOI 10.1051/0004-6361:20021571
View details for Web of Science ID 000180525400040
-
New multiscale transforms, minimum total variation synthesis: applications to edge-preserving image reconstruction
SIGNAL PROCESSING
2002; 82 (11): 1519-1543
View details for Web of Science ID 000178707700002
-
Recovering edges in ill-posed inverse problems: Optimality of curvelet frames
ANNALS OF STATISTICS
2002; 30 (3): 784-842
View details for Web of Science ID 000177354600008
-
The curvelet transform for image denoising
IEEE TRANSACTIONS ON IMAGE PROCESSING
2002; 11 (6): 670-684
Abstract
We describe approximate digital implementations of two new mathematical transforms, namely, the ridgelet transform and the curvelet transform. Our implementations offer exact reconstruction, stability against perturbations, ease of implementation, and low computational complexity. A central tool is Fourier-domain computation of an approximate digital Radon transform. We introduce a very simple interpolation in the Fourier space which takes Cartesian samples and yields samples on a rectopolar grid, which is a pseudo-polar sampling set based on a concentric squares geometry. Despite the crudeness of our interpolation, the visual performance is surprisingly good. Our ridgelet transform applies to the Radon transform a special overcomplete wavelet pyramid whose wavelets have compact support in the frequency domain. Our curvelet transform uses our ridgelet transform as a component step, and implements curvelet subbands using a filter bank of à trous wavelet filters. Our philosophy throughout is that transforms should be overcomplete, rather than critically sampled. We apply these digital transforms to the denoising of some standard images embedded in white noise. In the tests reported here, simple thresholding of the curvelet coefficients is very competitive with "state of the art" techniques based on wavelets, including thresholding of decimated or undecimated wavelet transforms and also including tree-based Bayesian posterior mean methods. Moreover, the curvelet reconstructions exhibit higher perceptual quality than wavelet-based reconstructions, offering visually sharper images and, in particular, higher quality recovery of edges and of faint linear and curvilinear features. Existing theory for curvelet and ridgelet transforms suggests that these new approaches can outperform wavelet methods in certain image reconstruction problems. The empirical results reported here are in encouraging agreement.
View details for Web of Science ID 000176533400009
View details for PubMedID 18244665
-
Curvelets and curvilinear integrals
JOURNAL OF APPROXIMATION THEORY
2001; 113 (1): 59-90
View details for Web of Science ID 000172408200003
-
Ridgelets and the representation of mutilated Sobolev functions
SIAM JOURNAL ON MATHEMATICAL ANALYSIS
2001; 33 (2): 347-368
View details for Web of Science ID 000171175400004
-
Very high quality image restoration by combining wavelets and curvelets
WAVELETS: APPLICATIONS IN SIGNAL AND IMAGE PROCESSING IX
2001; 4478: 9-19
View details for Web of Science ID 000175161900002
-
Curvelets and reconstruction of images from noisy Radon data
Conference on Wavelet Applications in Signal and Image Processing VIII
SPIE-INT SOC OPTICAL ENGINEERING. 2000: 108–117
View details for Web of Science ID 000167102800010
-
Curvelets, multiresolution representation, and scaling laws
Conference on Wavelet Applications in Signal and Image Processing VIII
SPIE-INT SOC OPTICAL ENGINEERING. 2000: 1–12
View details for Web of Science ID 000167102800001
-
Ridgelets: a key to higher-dimensional intermittency?
PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES
1999; 357 (1760): 2495-2509
View details for Web of Science ID 000082998700007
-
Harmonic analysis of neural networks
APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS
1999; 6 (2): 197-218
View details for Web of Science ID 000078980800003