All Publications
-
Causal Inference for Genomic Data with Multiple Heterogeneous Outcomes
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
2025
Abstract
With the evolution of single-cell RNA sequencing techniques into a standard approach in genomics, it has become possible to conduct cohort-level causal inferences based on single-cell-level measurements. However, the individual gene expression levels of interest are not directly observable; instead, only repeated proxy measurements from each individual's cells are available, providing a derived outcome to estimate the underlying outcome for each of many genes. In this paper, we propose a generic semiparametric inference framework for doubly robust estimation with multiple derived outcomes, which also encompasses the usual setting of multiple outcomes when the response of each unit is available. To reliably quantify the causal effects of heterogeneous outcomes, we specialize the analysis to standardized average treatment effects and quantile treatment effects. Through this, we demonstrate the use of the semiparametric inferential results for doubly robust estimators derived from both Von Mises expansions and estimating equations. A multiple testing procedure based on Gaussian multiplier bootstrap is tailored for doubly robust estimators to control the false discovery exceedance rate. Applications in single-cell CRISPR perturbation analysis and individual-level differential expression analysis demonstrate the utility of the proposed methods and offer insights into the usage of different estimands for causal inference in genomics.
View details for DOI 10.1080/01621459.2025.2468014
View details for Web of Science ID 001504249100001
View details for PubMedID 41050632
View details for PubMedCentralID PMC12490434
-
Covariate-assisted bounds on causal effects with instrumental variables
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
2025; 87 (5): 1508-1527
Abstract
When an exposure of interest is confounded by unmeasured factors, an instrumental variable (IV) can be used to identify and estimate certain causal contrasts. Identification of the marginal average treatment effect (ATE) from IVs relies on strong untestable structural assumptions. When one is unwilling to assert such structure, IVs can nonetheless be used to construct bounds on the ATE. Famously, Alexander Balke and Judea Pearl proved tight bounds on the ATE for a binary outcome, in a randomized trial with noncompliance and no covariate information. We demonstrate how these bounds remain useful in observational settings with baseline confounders of the IV, as well as randomized trials with measured baseline covariates. The resulting bounds on the ATE are nonsmooth functionals, and thus standard nonparametric efficiency theory is not immediately applicable. To remedy this, we propose (1) under a novel margin condition, influence function-based estimators of the bounds that can attain parametric convergence rates when the nuisance functions are modelled flexibly, and (2) estimators of smooth approximations of these bounds. We propose extensions to continuous outcomes, explore finite sample properties in simulations, and illustrate the proposed estimators in an observational study targeting the effect of higher education on wages.
View details for DOI 10.1093/jrsssb/qkaf028
View details for Web of Science ID 001497018000001
View details for PubMedID 41230541
View details for PubMedCentralID PMC12602419
https://orcid.org/0000-0003-1430-6118