Professor Baiocchi is a PhD statistician in Stanford University's Epidemiology and Population Health Department. He thinks a lot about behavioral interventions and how to rigorously evaluate if and how they work. Methodologically, his work focuses on creating statistically rigorous methods for causal inference that are transparent and easy to critique. He designed, and was the principal investigator for, two large randomized studies of interventions to prevent sexual assault in the settlements of Nairobi, Kenya.

Professor Baiocchi is an interventional statistician (i.e., grounded in both the creation and evaluation of interventions). The unifying idea in his research is that he brings rigorous, quantitative approaches to bear upon messy, real-world questions to better people's lives.

Professional Education

  • PhD, University of Pennsylvania, Statistics (2011)
  • BA, Williams College, Mathematics (2003)

  • Rape prevention intervention, Stanford University
    Nairobi, Kenya
  • Anti-human trafficking
  • Cardiothoracic surgery
  • Causal inference methods
  • Foundations of machine learning
  • Anti-addiction interventions
  • Causal inference with free text

All Publications

  • Robust Designs for Prospective Randomized Trials Surveying Sensitive Topics. American journal of epidemiology Rosenman, E. T., Friedberg, R., Baiocchi, M. 2023


    We consider the problem of designing a prospective randomized trial in which the outcome data will be self-reported, and will involve sensitive topics. Our interest is in how a researcher can adequately power her study when some respondents misreport the binary outcome of interest. To correct the power calculations, we first obtain expressions for the bias and variance induced by misreporting. We model the problem by assuming each individual in our study is a member of one "reporting class": a True-reporter, False-reporter, Never-reporter, or Always-reporter. We show that the joint distribution of reporting classes and "response classes" (characterizing individuals' response to the treatment) will exactly define the error terms for our causal estimate. We propose a novel procedure for determining adequate sample sizes under the worst-case power corresponding to a given level of misreporting. Our problem is motivated by prior experience implementing a randomized controlled trial of a sexual violence prevention program among adolescent girls in Kenya.

    View details for DOI 10.1093/aje/kwad027

    View details for PubMedID 36749012

  • Combining observational and experimental datasets using shrinkage estimators. Biometrics Rosenman, E. T., Basse, G., Owen, A. B., Baiocchi, M. 2023


    We consider the problem of combining data from observational and experimental sources to draw causal conclusions. To derive combined estimators with desirable properties, we extend results from the Stein shrinkage literature. Our contributions are threefold. First, we propose a generic procedure for deriving shrinkage estimators in this setting, making use of a generalized unbiased risk estimate. Second, we develop two new estimators, prove finite sample conditions under which they have lower risk than an estimator using only experimental data, and show that each achieves a notion of asymptotic optimality. Third, we draw connections between our approach and results in sensitivity analysis, including proposing a method for evaluating the feasibility of our estimators.

    View details for DOI 10.1111/biom.13827

    View details for PubMedID 36629736

  • Propensity score methods for merging observational and experimental datasets. Statistics in medicine Rosenman, E. T., Owen, A. B., Baiocchi, M., Banack, H. R. 2021


    We consider how to merge a limited amount of data from a randomized controlled trial (RCT) into a much larger set of data from an observational data base (ODB), to estimate an average causal treatment effect. Our methods are based on stratification. The strata are defined in terms of effect moderators as well as propensity scores estimated in the ODB. Data from the RCT are placed into the strata they would have occupied, had they been in the ODB instead. We assume that treatment differences are comparable in the two data sources. Our first "spiked-in" method simply inserts the RCT data into their corresponding ODB strata. We also consider a data-driven convex combination of the ODB and RCT treatment effect estimates within each stratum. Using the delta method and simulations, we identify a bias problem with the spiked-in estimator that is ameliorated by the convex combination estimator. We apply our methods to data from the Women's Health Initiative, a study of thousands of postmenopausal women which has both observational and experimental data on hormone therapy (HT). Using half of the RCT to define a gold standard, we find that a version of the spiked-in estimator yields lower-MSE estimates of the causal impact of HT on coronary heart disease than would be achieved using either a small RCT or the observational component on its own.

    View details for DOI 10.1002/sim.9223

    View details for PubMedID 34671998

  • Cohort Profile: WELL living laboratory in China (WELL-China). International journal of epidemiology Min, Y., Zhao, X., Hsing, A. W., Zhu, S. 2021

    View details for DOI 10.1093/ije/dyaa283

    View details for PubMedID 33712826

  • Evaluation of Data Sharing After Implementation of the International Committee of Medical Journal Editors Data Sharing Statement Requirement. JAMA network open Danchev, V., Min, Y., Borghi, J., Baiocchi, M., Ioannidis, J. P. 2021; 4 (1): e2033972


    The benefits of responsible sharing of individual-participant data (IPD) from clinical studies are well recognized, but stakeholders often disagree on how to align those benefits with privacy risks, costs, and incentives for clinical trialists and sponsors. The International Committee of Medical Journal Editors (ICMJE) required a data sharing statement (DSS) from submissions reporting clinical trials effective July 1, 2018. The required DSSs provide a window into current data sharing rates, practices, and norms among trialists and sponsors. To evaluate the implementation of the ICMJE DSS requirement in 3 leading medical journals: JAMA, Lancet, and New England Journal of Medicine (NEJM). This is a cross-sectional study of clinical trial reports published as articles in JAMA, Lancet, and NEJM between July 1, 2018, and April 4, 2020. Articles not eligible for DSS, including observational studies and letters or correspondence, were excluded. A MEDLINE/PubMed search identified 487 eligible clinical trials in JAMA (112 trials), Lancet (147 trials), and NEJM (228 trials). Two reviewers evaluated each of the 487 articles independently. Publication of clinical trial reports in an ICMJE medical journal requiring a DSS. The primary outcomes of the study were declared data availability and actual data availability in repositories. Other captured outcomes were data type, access, and conditions and reasons for data availability or unavailability. Associations with funding sources were examined. A total of 334 of 487 articles (68.6%; 95% CI, 64%-73%) declared data sharing, with nonindustry NIH-funded trials exhibiting the highest rates of declared data sharing (89%; 95% CI, 80%-98%) and industry-funded trials the lowest (61%; 95% CI, 54%-68%). However, only 2 IPD sets (0.6%; 95% CI, 0.0%-1.5%) were actually deidentified and publicly available as of April 10, 2020. The remaining were supposedly accessible via request to authors (143 of 334 articles [42.8%]), repository (89 of 334 articles [26.6%]), and company (78 of 334 articles [23.4%]). Among the 89 articles declaring that IPD would be stored in repositories, only 17 (19.1%) deposited data, mostly because of embargo and regulatory approval. Embargo was set in 47.3% of data-sharing articles (158 of 334), and in half of them the period exceeded 1 year or was unspecified. Most trials published in JAMA, Lancet, and NEJM after the implementation of the ICMJE policy declared their intent to make clinical data available. However, a wide gap between declared and actual data sharing exists. To improve transparency and data reuse, journals should promote the use of unique pointers to data set location and standardized choices for embargo periods and access requirements.

    View details for DOI 10.1001/jamanetworkopen.2020.33972

    View details for PubMedID 33507256
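
The headline proportion in the abstract above (334 of 487 articles) can be checked with a short calculation. This is a minimal sketch that assumes a normal-approximation (Wald) interval; the abstract does not say which interval method the authors used, and the function name `wald_ci` is purely illustrative.

```python
import math


def wald_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) using the normal approximation.

    Assumption: a Wald interval with z = 1.96; the paper may have used
    a different interval method.
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a sample proportion
    return p, p - z * se, p + z * se


# Counts taken from the abstract: 334 of 487 articles declared data sharing.
p, lower, upper = wald_ci(334, 487)
print(f"{p:.1%} (95% CI, {lower:.0%}-{upper:.0%})")  # prints "68.6% (95% CI, 64%-73%)"
```

Under this assumption the result agrees with the reported 68.6% (95% CI, 64%-73%) after rounding.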

  • Longitudinal trends in e-cigarette devices used by Californian youth, 2014-2018. Addictive behaviors Lin, C., Baiocchi, M., Halpern-Felsher, B. 2020; 108: 106459


    The rate of adolescent and young adult (AYA) e-cigarette usage has increased in recent years, possibly due to the introduction of sleek new e-cigarette devices such as JUUL. This study analyzed data from 400 California AYA to examine trends in e-cigarette usage by device type (disposables, large-size rechargeables, vape/hookah pens, JUUL/pod-based). Participants were asked about their ever, past 30-day, and past 7-day use of e-cigarettes; their usual e-cigarette device used; and co-use of devices in seven surveys administered approximately biannually from 2014 to 2018. During this time period, total e-cigarette ever-usage in our cohort increased linearly from 14.1% to 46.2% (p-trend<0.001). JUUL/pod-based e-cigarette ever-usage increased from 14.9% to 22.5% in just six months in 2018. Furthermore, a majority of new e-cigarette users at the time of the survey endorsed using JUUL/pod-based devices (58.3% in Wave 6, 73.0% in Wave 7). With newer device options, AYA were also increasingly less likely to endorse older models such as disposables (19.1% to 6.9% from 2014 to 2018, p-trend<0.01) and rechargeables (69.1% to 26.2% from 2014 to 2018, p-trend<0.001) as their usual e-cigarette device. Participants who used JUUL/pod-based only as their usual device were more likely to endorse using only JUUL/pod-based devices during follow-up survey (70%), and none switched to a new device completely. Overall, this study provides a snapshot of how AYA's e-cigarette preferences appear to respond to new devices entering the market.

    View details for DOI 10.1016/j.addbeh.2020.106459

    View details for PubMedID 32388394

  • Did internal displacement from the 2010 earthquake in Haiti lead to long-term violence against children? A matched pairs study design. Child abuse & neglect Cerna-Turoff, I., Kane, J. C., Devries, K., Mercy, J., Massetti, G., Baiocchi, M. 2020; 102: 104393


    BACKGROUND: Empirical evidence is limited and contradictory on violence against children after internal displacement from natural disasters. Understanding how internal displacement affects violence is key in structuring effective prevention and response. OBJECTIVE: We examined the effect of internal displacement from the 2010 Haitian earthquake on long-term physical, emotional, and sexual violence against children and outlined a methodological framework to improve future evidence quality. PARTICIPANTS AND SETTING: We analyzed violence against adolescent girls and boys within the nationally representative Haiti Violence Against Children Survey. METHODS: We pre-processed data by matching on pre-earthquake characteristics for displaced and non-displaced children and applied 95% confidence intervals from McNemar's exact test, with sensitivity analyses, to evaluate differences in violence outcomes between matched pairs after the earthquake. RESULTS: Internal displacement was not associated with past 12-month physical, emotional, and sexual violence two years after the earthquake for girls and boys. Most violence outcomes were robust to potential unmeasured confounding. Odds ratios for any form of violence against girls were 0.84 (95% CI: 0.52-1.33, p = 0.500) and against boys were 1.03 (95% CI: 0.61-1.73, p = 1.000). CONCLUSIONS: Internal displacement was not a driver of long-term violence against children in Haiti. Current global protocols in disaster settings may initiate services after the optimal window of time to protect children from violence, and the post-displacement setting may be central in determining violence outcomes. The combination of specific data structures and matching methodologies is promising to increase evidence quality after rapid-onset natural disasters, especially in low-resource settings.

    View details for DOI 10.1016/j.chiabu.2020.104393

    View details for PubMedID 32062165

  • Assessment of a Real-Time Locator System to Identify Physician and Nurse Work Locations. JAMA network open Li, R. C., Marafino, B. J., Nielsen, D., Baiocchi, M., Shieh, L. 2020; 3 (2): e1920352

    View details for DOI 10.1001/jamanetworkopen.2019.20352

    View details for PubMedID 32022876

  • Predicting preventable hospital readmissions with causal machine learning. Health services research Marafino, B. J., Schuler, A., Liu, V. X., Escobar, G. J., Baiocchi, M. 2020


    To assess both the feasibility and potential impact of predicting preventable hospital readmissions using causal machine learning applied to data from the implementation of a readmissions prevention intervention (the Transitions Program). Electronic health records maintained by Kaiser Permanente Northern California (KPNC). Retrospective causal forest analysis of postdischarge outcomes among KPNC inpatients. Using data from both before and after implementation, we apply causal forests to estimate individual-level treatment effects of the Transitions Program intervention on 30-day readmission. These estimates are used to characterize treatment effect heterogeneity and to assess the notional impacts of alternative targeting strategies in terms of the number of readmissions prevented. 1 539 285 index hospitalizations meeting the inclusion criteria and occurring between June 2010 and December 2018 at 21 KPNC hospitals. There appears to be substantial heterogeneity in patients' responses to the intervention (omnibus test for heterogeneity p = 2.23 × 10^-7), particularly across levels of predicted risk. Notably, predicted treatment effects become more positive as predicted risk increases; patients at somewhat lower risk appear to have the largest predicted effects. Moreover, these estimates appear to be well calibrated, yielding the same estimate of annual readmissions prevented in the actual treatment subgroup (1246, 95% confidence interval [CI] 1110-1381) as did a formal evaluation of the Transitions Program (1210, 95% CI 990-1430). Estimates of the impacts of alternative targeting strategies suggest that as many as 4458 (95% CI 3925-4990) readmissions could be prevented annually, while decreasing the number needed to treat from 33 to 23, by targeting patients with the largest predicted effects rather than those at highest risk. Causal machine learning can be used to identify preventable hospital readmissions, if the requisite interventional data are available. Moreover, our results suggest a mismatch between risk and treatment effects.

    View details for DOI 10.1111/1475-6773.13586

    View details for PubMedID 33125706

  • Correction. Statistics in medicine Baiocchi, M., Cheng, J., Small, D. 2020

    View details for DOI 10.1002/sim.8567

    View details for PubMedID 32441377

  • A cigarette pack by any other color: Youth perceptions mostly align with tobacco industry-ascribed meanings. Preventive medicine reports McKelvey, K., Baiocchi, M., Lazaro, A., Ramamurthi, D., Halpern-Felsher, B. 2019; 14: 100830


    Youth interpret cigarette pack-colors in line with industry-intended associations. Product-packaging restrictions may be circumvented by use of colors that misrepresent product harms. 43.2% of participants attributed extra strong to the black cigarette pack. 35.6% of participants ascribed rich to gold. 31.1% of participants ascribed menthol to green.

    View details for PubMedID 30815339

  • Which Deteriorating Ward Patients Benefit From Transfer to the Intensive Care Unit?: Critically Engaging Methods in a Well-Designed Natural Experiment. JAMA network open Baiocchi, M. 2019; 2 (2): e187698

    View details for DOI 10.1001/jamanetworkopen.2018.7698

    View details for PubMedID 30768187

  • Sex-specific association between gut microbiome and fat distribution. Nature communications Min, Y., Ma, X., Sankaran, K., Ru, Y., Chen, L., Baiocchi, M., Zhu, S. 2019; 10 (1): 2408


    The gut microbiome has been linked to host obesity; however, sex-specific associations between microbiome and fat distribution are not well understood. Here we show sex-specific microbiome signatures contributing to obesity despite both sexes having similar gut microbiome characteristics, including overall abundance and diversity. Our comparisons of the taxa associated with the android fat ratio in men and women found that there is no widespread species-level overlap. We did observe overlap between the sexes at the genus and family levels in the gut microbiome, such as Holdemanella and Gemmiger; however, they had opposite correlations with fat distribution in men and women. Our findings support a role for fat distribution in sex-specific relationships with the composition of the microbiome. Our results suggest that studies of the gut microbiome and abdominal obesity-related disease outcomes should account for sex-specific differences.

    View details for DOI 10.1038/s41467-019-10440-5

    View details for PubMedID 31160598

  • Adolescents' and Young Adults' Use and Perceptions of Pod-Based Electronic Cigarettes. JAMA network open McKelvey, K., Baiocchi, M., Halpern-Felsher, B. 2018; 1 (6): e183535


    Electronic cigarettes (e-cigarettes) are the most commonly used tobacco product among adolescents and young adults, and the new pod-based e-cigarette devices may put adolescents and young adults at increased risk for polytobacco use and nicotine dependence. To build an evidence base for perceptions of risk from and use of pod-based e-cigarettes among adolescents and young adults. In a survey study, a cross-sectional analysis was performed of data collected from April 6 to June 20, 2018, from 445 California adolescents and young adults as part of an ongoing prospective cohort study designed to measure the use and perceptions of tobacco products. Use of pod-based e-cigarettes, e-cigarettes, and cigarettes. Ever use, past 7-day use, and past 30-day use and co-use of pod-based e-cigarettes, e-cigarettes, and cigarettes; use of flavors and nicotine in pod-based e-cigarettes and e-cigarettes; and associated perceptions of risks, benefits, and nicotine dependence. Among 445 adolescents and young adults (280 females, 140 males, 6 transgender individuals, and 19 missing data; mean [SD] age, 19.3 [1.7] years) who completed wave 6 of the ongoing prospective cohort study, ever use information was provided by 437 respondents, of which 68 (15.6%) reported use of pod-based e-cigarettes, 133 (30.4%) reported use of e-cigarettes, and 106 (24.3%) reported use of cigarettes. The mean (SD) number of days that pod-based e-cigarettes were used in the past 7 days was 1.5 (2.4) and in the past 30 days was 6.7 (10.0). The mean (SD) number of days that other e-cigarettes were used in the past 7 days was 0.8 (1.8) and in the past 30 days was 3.2 (7.4). The mean (SD) number of days that cigarettes were used in the past 7 days was 0.7 (1.8) and in the past 30 days was 3.0 (7.6). Among ever users of pod-based e-cigarettes, 18 (26.5%) reported their first e-liquid was flavored menthol or mint and 19 (27.9%) reported fruit (vs 13 [9.8%] and 50 [37.6%] for other e-cigarettes). The mean perceived chance of experiencing social risks and short-term and long-term health risks from the use of either pod-based e-cigarettes or other e-cigarettes was 40% and did not differ statistically by e-cigarette type. Among 34 adolescents and young adults reporting any loss of autonomy from nicotine, there was no difference in mean (SD) Hooked On Nicotine Checklist scores between those using pod-based e-cigarettes (2.59 [3.14]) and other e-cigarettes (2.32 [2.55]). Use by adolescents and young adults of newer types of e-cigarettes such as pod-based systems is increasing rapidly, and adolescents and young adults report corresponding misperceptions and lack of knowledge about these products. Rapid innovation by e-cigarette manufacturers suggests that public health and prevention efforts appear to be needed to include messages targeting components common to all current and emerging e-cigarette products to increase knowledge and decrease misperceptions, with the goal to try to ultimately reduce e-cigarette use among adolescents and young adults.

    View details for DOI 10.1001/jamanetworkopen.2018.3535

    View details for PubMedID 30646249

    View details for PubMedCentralID PMC6324423

  • An Evaluation of Clinical Order Patterns Machine-Learned from Clinician Cohorts Stratified by Patient Mortality Outcomes. Journal of biomedical informatics Wang, J. K., Hom, J., Balasubramanian, S., Schuler, A., Shah, N. H., Goldstein, M. K., Baiocchi, M. T., Chen, J. H. 2018


    OBJECTIVE: Evaluate the quality of clinical order practice patterns machine-learned from clinician cohorts stratified by patient mortality outcomes. MATERIALS AND METHODS: Inpatient electronic health records from 2010-2013 were extracted from a tertiary academic hospital. Clinicians (n=1,822) were stratified into low-mortality (21.8%, n=397) and high-mortality (6.0%, n=110) extremes using a two-sided P-value score quantifying deviation of observed vs. expected 30-day patient mortality rates. Three patient cohorts were assembled: patients seen by low-mortality clinicians, high-mortality clinicians, and an unfiltered crowd of all clinicians (n=1,046, 1,046, and 5,230 post-propensity score matching, respectively). Predicted order lists were automatically generated from recommender system algorithms trained on each patient cohort and evaluated against i) real-world practice patterns reflected in patient cases with better-than-expected mortality outcomes and ii) reference standards derived from clinical practice guidelines. RESULTS: Across six common admission diagnoses, order lists learned from the crowd demonstrated the greatest alignment with guideline references (AUROC range=0.86-0.91), performing on par or better than those learned from low-mortality clinicians (0.79-0.84, P<10^-5) or manually-authored hospital order sets (0.65-0.77, P<10^-3). The same trend was observed in evaluating model predictions against better-than-expected patient cases, with the crowd model (AUROC mean=0.91) outperforming the low-mortality model (0.87, P<10^-16) and order set benchmarks (0.78, P<10^-35). DISCUSSION: Whether machine-learning models are trained on all clinicians or a subset of experts illustrates a bias-variance tradeoff in data usage. Defining robust metrics to assess quality based on internal (e.g. practice patterns from better-than-expected patient cases) or external reference standards (e.g. clinical practice guidelines) is critical to assess decision support content. CONCLUSION: Learning relevant decision support content from all clinicians is as, if not more, robust than learning from a select subgroup of clinicians favored by patient outcomes.

    View details for PubMedID 30195660

  • Youth say ads for flavored e-liquids are for them. Addictive behaviors McKelvey, K., Baiocchi, M., Ramamurthi, D., McLaughlin, S., Halpern-Felsher, B. 2018


    INTRODUCTION: E-cigarettes are the most popular tobacco product among adolescents and young adults ("AYA") and are available in many flavors. The e-cigarette industry argues that flavors are not meant to appeal to youth, yet no study has asked youth what age group they think ads for flavored e-liquids are targeting. We asked AYA which age group they thought ads for flavored e-liquids targeted. METHODS: In 2016 as part of a larger survey, a random sample of 255 youth from across California (62.4% female, mean age = 17.5, SD = 1.7) viewed eight ads, presented in randomized order, for fruit-, dessert-, alcohol-, and coffee-flavored e-liquids and indicated the age group they thought the ads targeted: younger, same age, a little older, or much older than them. Population means and 95% confidence intervals were estimated using bootstrapping (100,000 replicate samples). RESULTS: Most participants (93.7%) indicated the cupcake man flavor ad targeted an audience of people younger than they. Over half felt ads for smoothy (68.2%), cherry (63.9%), vanilla cupcake (58%), and caramel cappuccino (50.4%) targeted their age and for no flavor ad did most feel the primary target age group was much older. CONCLUSIONS: Youth believe ads for flavored e-liquids target individuals about their age, not older adults. Findings support the need to regulate flavored e-liquids and associated ads to reduce youth appeal, which ultimately could reduce youth use of e-cigarettes.

    View details for PubMedID 30314868
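
The bootstrapped means and confidence intervals mentioned in the abstract above can be illustrated with a percentile bootstrap. This is a minimal sketch, not the authors' procedure: the percentile method, the function name `bootstrap_mean_ci`, and the 0/1 response vector are all assumptions. The vector is hypothetical data built so that roughly 68.2% of 255 respondents answer "same age", and the replicate count is cut well below the paper's 100,000 to keep the example fast.

```python
import random


def bootstrap_mean_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the mean of `values`.

    Assumption: the percentile method; the paper does not state which
    bootstrap interval was used.
    """
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each replicate.
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return sum(values) / n, lower, upper


# Hypothetical responses: 1 = "ad targets my age group", 0 = otherwise.
responses = [1] * 174 + [0] * 81  # 174 / 255 is about 68.2%
mean, lower, upper = bootstrap_mean_ci(responses)
print(f"mean = {mean:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

More replicates narrow the Monte Carlo error in the interval endpoints, which is presumably why the study used such a large number.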

  • Near-Far Matching in R: The nearfar Package. Journal of statistical software Rigdon, J., Baiocchi, M., Basu, S. 2018; 86 (CN5): 1–21

  • Preventing false discovery of heterogeneous treatment effect subgroups in randomized trials. Trials Rigdon, J., Baiocchi, M., Basu, S. 2018; 19 (1): 382


    BACKGROUND: Heterogeneous treatment effects (HTEs), or systematic differences in treatment effectiveness among participants with different observable features, may be important when applying trial results to clinical practice. Current methods suffer from a potential for false detection of HTEs due to imbalances in covariates between candidate subgroups. METHODS: We introduce a new method, matching plus classification and regression trees (mCART), that yields balance in covariates in identified HTE subgroups. We compared mCART to a classical method (logistic regression [LR] with backwards covariate selection using the Akaike information criterion) and two machine-learning approaches increasingly applied to HTE detection (random forest [RF] and gradient RF) in simulations with a binary outcome with known HTE subgroups. We considered an N=200 phase II oncology trial where there were either no HTEs (1A) or two HTE subgroups (1B) and an N=6000 phase III cardiovascular disease trial where there were either no HTEs (2A) or four HTE subgroups (2B). Additionally, we considered an N=6000 phase III cardiovascular disease trial where there was no average treatment effect but there were four HTE subgroups (2C). RESULTS: In simulations 1A and 2A (no HTEs), mCART did not identify any HTE subgroups, whereas LR found 2 and 448, RF 5 and 2, and gradient RF 5 and 24, respectively (all false positives). In simulation 1B, mCART failed to identify the two true HTE subgroups whereas LR found 4, RF 6, and gradient RF 10 (half or more of which were false positives). In simulations 2B and 2C, mCART captured the four true HTE subgroups, whereas the other methods found only false positives. All HTE subgroups identified by mCART had acceptable treated vs. control covariate balance with absolute standardized differences less than 0.2, whereas the absolute standardized differences for the other methods typically exceeded 0.2. The imbalance in covariates in identified subgroups for LR, RF, and gradient RF indicates the false HTE detection may have been due to confounding. CONCLUSIONS: Covariate imbalances may be producing false positives in subgroup analyses. mCART could be a useful tool to help prevent the false discovery of HTE subgroups in secondary analyses of randomized trial data.

    View details for PubMedID 30012181
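
The balance criterion cited in the abstract above (absolute standardized difference below 0.2) can be computed for a single covariate as follows. This is a minimal sketch using one common definition, the difference in group means divided by the square root of the average of the two group variances; the paper may define it slightly differently, and the function name and covariate values are hypothetical.

```python
import statistics


def abs_standardized_difference(treated, control):
    """Absolute standardized mean difference for one covariate.

    Assumption: pooled SD = sqrt((var_treated + var_control) / 2),
    using sample variances.
    """
    mean_diff = statistics.fmean(treated) - statistics.fmean(control)
    pooled_sd = ((statistics.variance(treated) + statistics.variance(control)) / 2) ** 0.5
    return abs(mean_diff) / pooled_sd


# Hypothetical ages within one candidate subgroup.
treated_age = [61, 58, 66, 70, 63, 59]
control_age = [60, 62, 65, 68, 64, 57]
asd = abs_standardized_difference(treated_age, control_age)
print(f"absolute standardized difference = {asd:.3f}; balanced = {asd < 0.2}")
```

In practice the statistic would be computed for every covariate in every identified subgroup, and a subgroup flagged when any value exceeds the chosen threshold.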

  • The Effect of Combining Business Training, Microfinance, and Support Group Participation on Economic Status and Intimate Partner Violence in an Unplanned Settlement of Nairobi, Kenya. Journal of interpersonal violence Sarnquist, C. C., Ouma, L., Lang'at, N., Lubanga, C., Sinclair, J., Baiocchi, M. T., Cornfield, D. N. 2018: 886260518779067


    Intimate partner violence (IPV) has myriad negative health and economic consequences for women and families. We hypothesized that empowering women through a combination of formal business training, microfinance, and IPV support groups would decrease IPV and improve women's economic status. The study included adult female survivors of severe IPV. Women living in Korogocho received the intervention and women in Dandora served as a standard of care (SOC) group, but received the intervention at the end of the follow-up period. Women in the intervention group (n = 82; SOC group, n = 81) received 8 weeks of business training, assistance creating a business plan, a small initial loan (about US$60), and weekly business and social support meetings. The two primary outcome measures included change in: (a) average daily profit margin, and (b) incidence of severe IPV. Exploratory analysis also looked at incidence of violence against children and women's self-efficacy. Average daily profit margin in the intervention group increased by 351 Kenyan Shillings (about US$3.5) daily (95% CI = [172, 485]). IPV directed against participating women decreased from a baseline of 2.1 to 0.26 incidents, a difference of 1.84 incidents (95% CI = [1.32, 2.36]). Violence against children in the household in the prior 3 months decreased from 1.1 to 0.55 incidents, a difference of 0.55 incidents (95% CI = [0.16, 1.03]). Finally, the intervention appears to have increased self-efficacy scores by 0.42 points (95% CI = [0.13, 0.71]). In a low-resource urban environment, employing three complementary interventions resulted in higher daily profit margins and lower IPV in the intervention compared with the SOC group. These data support the notion that employing multiple interventions concomitantly might possess synergistic, beneficial effects, and hold promise to address profound poverty and interrupt the devastating cycle of IPV.

    View details for PubMedID 29862883

  • Inpatient Clinical Order Patterns Machine-Learned From Teaching Versus Attending-Only Medical Services. AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science Wang, J. K., Schuler, A., Shah, N. H., Baiocchi, M. T., Chen, J. H. 2018; 2017: 226–35


    Clinical order patterns derived from data-mining electronic health records can be a valuable source of decision support content. However, the quality of crowdsourcing such patterns may be suspect depending on the population learned from. For example, it is unclear whether learning inpatient practice patterns from a university teaching service, characterized by physician-trainee teams with an emphasis on medical education, will be of variable quality versus an attending-only medical service that focuses strictly on clinical care. Machine learning clinical order patterns by association rule episode mining from teaching versus attending-only inpatient medical services illustrated some practice variability, but converged towards similar top results in either case. We further validated the automatically generated content by confirming alignment with external reference standards extracted from clinical practice guidelines.

    View details for PubMedID 29888077

  • Second Arterial Versus Venous Conduits for Multivessel Coronary Artery Bypass Surgery in California. Circulation Goldstone, A. B., Chiu, P., Baiocchi, M., Wang, H., Lingala, B., Boyd, J. H., Woo, Y. J. 2018; 137 (16): 1698–1707


    Whether a second arterial conduit improves outcomes after multivessel coronary artery bypass grafting remains unclear. Consequently, arterial conduits other than the left internal thoracic artery are seldom used in the United States. Using a state-maintained clinical registry including all 126 nonfederal hospitals in California, we compared all-cause mortality and rates of stroke, myocardial infarction, repeat revascularization, and sternal wound infection between propensity score-matched cohorts who underwent primary, isolated multivessel coronary artery bypass grafting with the left internal thoracic artery, and who received a second arterial conduit (right internal thoracic artery or radial artery, n=5866) or a venous conduit (n=53 566) between 2006 and 2011. Propensity score matching using 34 preoperative characteristics yielded 5813 matched sets. A subgroup analysis compared outcomes between propensity score-matched recipients of a right internal thoracic artery (n=1576) or a radial artery (n=4290). Second arterial conduit use decreased from 10.7% in 2006 to 9.1% in 2011 (P<0.0001). However, receipt of a second arterial conduit was associated with significantly lower mortality (13.1% versus 10.6% at 7 years; hazard ratio, 0.79; 95% confidence interval [CI], 0.72-0.87), and lower risks of myocardial infarction (hazard ratio, 0.78; 95% CI, 0.70-0.87) and repeat revascularization (hazard ratio, 0.82; 95% CI, 0.76-0.88). In comparison with radial artery grafts, right internal thoracic artery grafts were associated with similar mortality rates (right internal thoracic artery 10.3% versus radial artery 10.7% at 7 years; hazard ratio, 1.10; 95% CI, 0.89-1.37) and individual risks of cardiovascular events, but the risk of sternal wound infection was increased (risk difference, 1.07%; 95% CI, 0.15-2.07). Second arterial conduit use in California is low and declining, but arterial grafts were associated with significantly lower mortality and fewer cardiovascular events. A right internal thoracic artery graft offered no benefit over that of a radial artery, but did increase risk of sternal wound infection. These findings suggest surgeons should consider lowering their threshold for using arterial grafts, and the radial artery may be the preferred second conduit.

    View details for PubMedID 29242351

  • Impact of Discordant Views in the Management of Descending Thoracic Aortic Aneurysm SEMINARS IN THORACIC AND CARDIOVASCULAR SURGERY Chiu, P., Sailer, A., Baiocchi, M., Goldstone, A. B., Schaffer, J. M., Trojan, J., Fleischmann, D., Mitchell, R., Miller, D., Dake, M. D., Woo, Y., Lee, J. T., Fischbein, M. P. 2017; 29 (3): 283–91


    Thoracic endovascular aortic repair has a lower perceived risk than open surgical repair and has become an increasingly popular alternative. Whether general consensus exists regarding candidacy for either operation among open and endovascular specialists is unknown. A retrospective review of isolated descending thoracic aortic aneurysm at our institution between January 2005 and October 2015 was performed, excluding trauma and dissection. Two cardiac surgeons, 2 cardiovascular surgeons, 1 vascular surgeon, and 1 interventional radiologist gave their preference for open vs endovascular repair. Interobserver agreement was assessed with the kappa coefficient. k-means clustering agnostically grouped various patterns of agreement. The mean rating was predicted using least absolute shrinkage and selection operator regression. Negative binomial regression predicted the discrepancy between our panel of raters and the historical operation. Generalized estimating equation modeling was then used to evaluate the association between the extent of discrepancy and the adverse perioperative outcome. There were 77 patients with preoperative imaging studies. Pairwise interobserver agreement was only fair (median weighted kappa 0.270 [interquartile range 0.211-0.404]). Increasing age and proximal neck length predicted an increasing preference for thoracic endovascular aortic repair in our panel; larger proximal neck diameter predicted a general preference for open surgical repair. Increasing proximal neck diameter predicted a larger discrepancy between our panel and the historical operation. Greater discrepancy was associated with adverse outcome. Substantial disagreement existed among our panel, and an exploratory analysis of the effect of increasing discrepancy demonstrated an association with adverse perioperative outcome. An investigation of the effect of a thoracic aortic team with open and endovascular specialists is warranted.

    View details for PubMedID 29195571

  • Association of Playing High School Football With Cognition and Mental Health Later in Life JAMA NEUROLOGY Deshpande, S. K., Hasegawa, R. B., Rabinowitz, A. R., Whyte, J., Roan, C. L., Tabatabaei, A., Baiocchi, M., Karlawish, J. H., Master, C. L., Small, D. S. 2017; 74 (8): 909–18


    American football is the largest participation sport in US high schools and is a leading cause of concussion among adolescents. Little is known about the long-term cognitive and mental health consequences of exposure to football-related head trauma at the high school level. The objective was to estimate the association of playing high school football with cognitive impairment and depression at 65 years of age. A representative sample of male high school students who graduated from high school in Wisconsin in 1957 was studied. In this cohort study using data from the Wisconsin Longitudinal Study, football players were matched between March 1 and July 1, 2017, with controls along several baseline covariates such as adolescent IQ, family background, and educational level. For robustness, 3 versions of the control condition were considered: all controls, those who played a noncollision sport, and those who did not play any sport. The exposure was athletic participation in high school football. Cognition was measured with a composite of verbal fluency and memory and attention, constructed from results of cognitive assessments administered at 65 years of age. A modified Center for Epidemiological Studies' Depression Scale score was used to measure depression. Secondary outcomes included results of individual cognitive tests, anger, anxiety, hostility, and heavy use of alcohol. Among the 3904 men (mean [SD] age, 64.4 [0.8] years at time of primary outcome measurement) in the study, after matching and model-based covariate adjustment, compared with each control condition, there was no statistically significant harmful association of playing football with a reduced composite cognition score (-0.04 reduction in cognition vs all controls; 97.5% CI, -0.14 to 0.05) or an increased modified Center for Epidemiological Studies' Depression Scale depression score (-1.75 reduction vs all controls; 97.5% CI, -3.24 to -0.26). After adjustment for multiple testing, playing football did not have a significant adverse association with any of the secondary outcomes, such as the likelihood of heavy alcohol use at 65 years of age (odds ratio, 0.68; 95% CI, 0.32-1.43). Cognitive and depression outcomes later in life were found to be similar for high school football players and their nonplaying counterparts from mid-1950s in Wisconsin. The risks of playing football today might be different than in the 1950s, but for current athletes, this study provides information on the risk of playing sports today that have a similar risk of head trauma as high school football played in the 1950s.

    View details for DOI 10.1001/jamaneurol.2017.1317

    View details for Web of Science ID 000407688300010

    View details for PubMedID 28672325

    View details for PubMedCentralID PMC5710329

  • Evidence That Classroom-Based Behavioral Interventions Reduce Pregnancy-Related School Dropout Among Nairobi Adolescents HEALTH EDUCATION & BEHAVIOR Sarnquist, C., Sinclair, J., Mboya, B. O., Langat, N., Paiva, L., Halpern-Felsher, B., Golden, N. H., Maldonado, Y. A., Baiocchi, M. T. 2017; 44 (2): 297-303


    Purpose To evaluate the effect of behavioral, empowerment-focused interventions on the incidence of pregnancy-related school dropout among girls in Nairobi's informal settlements. Method Retrospective data on pregnancy-related school dropout from two cohorts were analyzed using a matched-pairs quasi-experimental design. The primary outcome was the change in the number of school dropouts due to pregnancy from 1 year before to 1 year after the interventions. Results Annual incidence of school dropout due to pregnancy decreased by 46% in the intervention schools (from 3.9% at baseline to 2.1% at follow-up), whereas the comparison schools remained essentially unchanged (p < .029). Sensitivity analysis shows that the findings are robust to small levels of unobserved bias. Conclusions Results suggest that these behavioral interventions significantly reduced the number of school dropouts due to pregnancy. As there are limited promising studies on behavioral interventions that decrease adolescent pregnancy in low-income settings, this intervention may be an important addition to this toolkit.

    View details for DOI 10.1177/1090198116657777

    View details for Web of Science ID 000398072000012

  • Mechanical or Biologic Prostheses for Aortic-Valve and Mitral-Valve Replacement. The New England journal of medicine Goldstone, A. B., Chiu, P., Baiocchi, M., Lingala, B., Patrick, W. L., Fischbein, M. P., Woo, Y. J. 2017; 377 (19): 1847–57


    In patients undergoing aortic-valve or mitral-valve replacement, either a mechanical or biologic prosthesis is used. Biologic prostheses have been increasingly favored despite limited evidence supporting this practice. We compared long-term mortality and rates of reoperation, stroke, and bleeding between inverse-probability-weighted cohorts of patients who underwent primary aortic-valve replacement or mitral-valve replacement with a mechanical or biologic prosthesis in California in the period from 1996 through 2013. Patients were stratified into different age groups on the basis of valve position (aortic vs. mitral valve). From 1996 through 2013, the use of biologic prostheses increased substantially for aortic-valve and mitral-valve replacement, from 11.5% to 51.6% for aortic-valve replacement and from 16.8% to 53.7% for mitral-valve replacement. Among patients who underwent aortic-valve replacement, receipt of a biologic prosthesis was associated with significantly higher 15-year mortality than receipt of a mechanical prosthesis among patients 45 to 54 years of age (30.6% vs. 26.4% at 15 years; hazard ratio, 1.23; 95% confidence interval [CI], 1.02 to 1.48; P=0.03) but not among patients 55 to 64 years of age. Among patients who underwent mitral-valve replacement, receipt of a biologic prosthesis was associated with significantly higher mortality than receipt of a mechanical prosthesis among patients 40 to 49 years of age (44.1% vs. 27.1%; hazard ratio, 1.88; 95% CI, 1.35 to 2.63; P<0.001) and among those 50 to 69 years of age (50.0% vs. 45.3%; hazard ratio, 1.16; 95% CI, 1.04 to 1.30; P=0.01). The incidence of reoperation was significantly higher among recipients of a biologic prosthesis than among recipients of a mechanical prosthesis. Patients who received mechanical valves had a higher cumulative incidence of bleeding and, in some age groups, stroke than did recipients of a biologic prosthesis. The long-term mortality benefit that was associated with a mechanical prosthesis, as compared with a biologic prosthesis, persisted until 70 years of age among patients undergoing mitral-valve replacement and until 55 years of age among those undergoing aortic-valve replacement. (Funded by the National Institutes of Health and the Agency for Healthcare Research and Quality.)

    View details for PubMedID 29117490

  • Encouraging Earthquake-Resistant Construction: A Randomized Controlled Trial in Nepal EARTHQUAKE SPECTRA Sanquini, A. M., Thapaliya, S. M., Wood, M. M., Baiocchi, M., Hilley, G. E. 2016; 32 (4): 1975-1988
  • Using ICU Congestion as a Natural Experiment. Critical care medicine Jopling, J. K., Baiocchi, M., Milstein, A. 2016; 44 (10): 1936-1937

    View details for DOI 10.1097/CCM.0000000000001932

    View details for PubMedID 27635484

  • A Behavior-Based Intervention That Prevents Sexual Assault: the Results of a Matched-Pairs, Cluster-Randomized Study in Nairobi, Kenya. Prevention science Baiocchi, M., Omondi, B., Langat, N., Boothroyd, D. B., Sinclair, J., Pavia, L., Mulinge, M., Githua, O., Golden, N. H., Sarnquist, C. 2016


    The study's design was a cluster-randomized, matched-pairs, parallel trial of a behavior-based sexual assault prevention intervention in the informal settlements. The participants were primary school girls aged 10-16. Classroom-based interventions for girls and boys were delivered by instructors from the same settlements, at the same time, over six 2-h sessions. The girls' program had components of empowerment, gender relations, and self-defense. The boys' program promoted healthy gender norms. The control arm of the study received a health and hygiene curriculum. The primary outcome was the rate of sexual assault in the prior 12 months at the cluster level (school level). Secondary outcomes included the generalized self-efficacy scale, the distribution of number of times victims were sexually assaulted in the prior period, skills used, disclosure rates, and distribution of perpetrators. Difference-in-differences estimates are reported with bootstrapped confidence intervals. Fourteen schools with 3147 girls from the intervention group and 14 schools with 2539 girls from the control group were included in the analysis. We estimate a 3.7 % decrease, p = 0.03 and 95 % CI = (0.4, 8.0), in risk of sexual assault in the intervention group due to the intervention (initially 7.3 % at baseline). We estimate an increase in mean generalized self-efficacy score of 0.19 (baseline average 3.1, on a 1-4 scale), p = 0.0004 and 95 % CI = (0.08, 0.39). This innovative intervention that combined parallel training for young adolescent girls and boys in school settings showed significant reduction in the rate of sexual assault among girls in this population.

    View details for PubMedID 27562036

  • Health-related quality of life among veterans in addictions treatment: identifying behavioral targets for future intervention QUALITY OF LIFE RESEARCH Oppezzo, M. A., Michalek, A. K., Delucchi, K., Baiocchi, M. T., Barnett, P. G., Prochaska, J. J. 2016; 25 (8): 1949-1957


    US veterans report lower health-related quality of life (HRQoL) relative to the general population. Identifying behavioral factors related to HRQoL that are malleable to change may inform interventions to improve well-being in this vulnerable group. The current study sought to characterize HRQoL in a largely male sample of veterans in addictions treatment, both in relation to US norms and in association with five recommended health behavior practices: regularly exercising, managing stress, having good sleep hygiene, consuming fruits and vegetables, and being tobacco free. We assessed HRQoL with 250 veterans in addictions treatment (96 % male, mean age 53, range 24-77) using scales from four validated measures. Data reduction methods identified two principal components reflecting physical and mental HRQoL. Model testing of HRQoL associations with health behaviors adjusted for relevant demographic and treatment-related covariates. Compared to US norms, the sample had lower HRQoL scores. Better psychological HRQoL was associated with higher subjective social standing, absence of pain or trauma, lower alcohol severity, and monotonically with the sum of health behaviors (all p < 0.05). Specifically, psychological HRQoL was associated with regular exercise, stress management, and sleep hygiene. Regular exercise also related to better physical HRQoL. The models explained >40 % of the variance in HRQoL. Exercise, sleep hygiene, and stress management are strongly associated with HRQoL among veterans in addictions treatment. Future research is needed to test the effect of interventions for improving well-being in this high-risk group.

    View details for DOI 10.1007/s11136-016-1236-3

    View details for Web of Science ID 000380005900009

    View details for PubMedID 26886926

    View details for PubMedCentralID PMC4987154

  • Evidence That Classroom-Based Behavioral Interventions Reduce Pregnancy-Related School Dropout Among Nairobi Adolescents. Health education & behavior Sarnquist, C., Sinclair, J., Omondi Mboya, B., Langat, N., Paiva, L., Halpern-Felsher, B., Golden, N. H., Maldonado, Y. A., Baiocchi, M. T. 2016


    Purpose To evaluate the effect of behavioral, empowerment-focused interventions on the incidence of pregnancy-related school dropout among girls in Nairobi's informal settlements. Method Retrospective data on pregnancy-related school dropout from two cohorts were analyzed using a matched-pairs quasi-experimental design. The primary outcome was the change in the number of school dropouts due to pregnancy from 1 year before to 1 year after the interventions. Results Annual incidence of school dropout due to pregnancy decreased by 46% in the intervention schools (from 3.9% at baseline to 2.1% at follow-up), whereas the comparison schools remained essentially unchanged (p < .029). Sensitivity analysis shows that the findings are robust to small levels of unobserved bias. Conclusions Results suggest that these behavioral interventions significantly reduced the number of school dropouts due to pregnancy. As there are limited promising studies on behavioral interventions that decrease adolescent pregnancy in low-income settings, this intervention may be an important addition to this toolkit.

    View details for PubMedID 27486178

  • Likelihood of Unemployed Smokers vs Nonsmokers Attaining Reemployment in a One-Year Observational Study JAMA INTERNAL MEDICINE Prochaska, J. J., Michalek, A. K., Brown-Johnson, C., Daza, E. J., Baiocchi, M., Anzai, N., Rogers, A., Grigg, M., Chieng, A. 2016; 176 (5): 662-670


    Studies in the United States and Europe have found higher smoking prevalence among unemployed job seekers relative to employed workers. While consistent, the extant epidemiologic investigations of smoking and work status have been cross-sectional, leaving it underdetermined whether tobacco use is a cause or effect of unemployment. The objective was to examine differences in reemployment by smoking status in a 12-month period. An observational 2-group study was conducted from September 10, 2013, to August 15, 2015, in employment service settings in the San Francisco Bay Area (California). Participants were 131 daily smokers and 120 nonsmokers, all of whom were unemployed job seekers. Owing to the study's observational design, a propensity score analysis was conducted using inverse probability weighting with trimmed observations, including covariates of time out of work, age, education, race/ethnicity, and perceived health status as predictors of smoking status. The main outcome was reemployment at 12-month follow-up. Of the 251 study participants, 165 (65.7%) were men, with a mean (SD) age of 48 (11) years; 96 participants were white (38.2%), 90 were black (35.9%), 24 were Hispanic (9.6%), 18 were Asian (7.2%), and 23 were multiracial or other race (9.2%); 78 had a college degree (31.1%), 99 were unstably housed (39.4%), 70 lacked reliable transportation (27.9%), 52 had a criminal history (20.7%), and 72 had received prior treatment for alcohol or drug use (28.7%). Smokers consumed a mean (SD) of 13.5 (8.2) cigarettes per day at baseline. At 12-month follow-up (217 participants retained [86.5%]), 60 of 108 nonsmokers (55.6%) were reemployed compared with 29 of 109 smokers (26.6%) (unadjusted risk difference, 0.29; 95% CI, 0.15-0.42). With 6% of analysis sample observations trimmed, the estimated risk difference indicated that nonsmokers were 30% (95% CI, 12%-48%) more likely on average to be reemployed at 1 year relative to smokers. Results of a sensitivity analysis with additional covariates of sex, stable housing, reliable transportation, criminal history, and prior treatment for alcohol or drug use (25.3% of observations trimmed) reduced the difference in employment attributed to smoking status to 24% (95% CI, 7%-39%), which was still a significant difference. Among those reemployed at 1 year, the average hourly wage for smokers was significantly lower (mean [SD], $15.10 [$4.68]) than for nonsmokers (mean [SD], $20.27 [$10.54]; F(1,86) = 6.50, P = .01). To our knowledge, this is the first study to prospectively track reemployment success by smoking status. Smokers had a lower likelihood of reemployment at 1 year and were paid significantly less than nonsmokers when reemployed. Treatment of tobacco use in unemployment service settings is worth testing for increasing reemployment success and financial well-being.

    View details for DOI 10.1001/jamainternmed.2016.0772

    View details for PubMedID 27065044
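
    As a worked check of the unadjusted figure reported in the abstract above, the sketch below recomputes the risk difference from the stated counts (60 of 108 nonsmokers and 29 of 109 smokers reemployed). The function name `risk_difference` is hypothetical, and the Wald-type interval is an assumption chosen for illustration; the abstract does not say how the published interval was computed, so the bounds printed here need not match it exactly.

```python
import math


def risk_difference(events_a, n_a, events_b, n_b, z=1.96):
    """Return (difference in proportions, lower bound, upper bound).

    The interval is a simple Wald interval (an illustrative assumption,
    not necessarily the method used in the published analysis).
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    rd = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return rd, rd - z * se, rd + z * se


# Counts taken from the abstract: nonsmokers first, smokers second.
rd, lower, upper = risk_difference(60, 108, 29, 109)
print(f"unadjusted risk difference = {rd:.2f}")  # 0.29
print(f"Wald 95% interval = ({lower:.3f}, {upper:.3f})")
```

    The point estimate rounds to the 0.29 reported in the abstract. The adjusted estimates (30% and 24%) come from inverse probability weighting with trimming and cannot be reproduced from the summary counts alone.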

  • Peer Assessment Enhances Student Learning: The Results of a Matched Randomized Crossover Experiment in a College Statistics Class PLOS ONE Sun, D. L., Harris, N., Walther, G., Baiocchi, M. 2015; 10 (12)


    Feedback has a powerful influence on learning, but it is also expensive to provide. In large classes it may even be impossible for instructors to provide individualized feedback. Peer assessment is one way to provide personalized feedback that scales to large classes. Besides these obvious logistical benefits, it has been conjectured that students also learn from the practice of peer assessment. However, this has never been conclusively demonstrated. Using an online educational platform that we developed, we conducted an in-class matched-set, randomized crossover experiment with high power to detect small effects. We establish that peer assessment causes a small but significant gain in student achievement. Our study also demonstrates the potential of web-based platforms to facilitate the design of high-quality experiments to identify small effects that were previously not detectable.

    View details for DOI 10.1371/journal.pone.0143177

    View details for PubMedID 26683053

    View details for PubMedCentralID PMC4684290

  • Instrumental variable methods for causal inference. Statistics in medicine Baiocchi, M., Cheng, J., Small, D. S. 2014; 33 (13): 2297-2340


    A goal of many health studies is to determine the causal effect of a treatment or intervention on health outcomes. Often, it is not ethically or practically possible to conduct a perfectly randomized experiment, and instead, an observational study must be used. A major challenge to the validity of observational studies is the possibility of unmeasured confounding (i.e., unmeasured ways in which the treatment and control groups differ before treatment administration, which also affect the outcome). Instrumental variables analysis is a method for controlling for unmeasured confounding. This type of analysis requires the measurement of a valid instrumental variable, which is a variable that (i) is independent of the unmeasured confounding; (ii) affects the treatment; and (iii) affects the outcome only indirectly through its effect on the treatment. This tutorial discusses the types of causal effects that can be estimated by instrumental variables analysis; the assumptions needed for instrumental variables analysis to provide valid estimates of causal effects and sensitivity analysis for those assumptions; methods of estimation of causal effects using instrumental variables; and sources of instrumental variables in health studies.

    View details for DOI 10.1002/sim.6128

    View details for PubMedID 24599889
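
    The three instrument conditions listed in the abstract above can be illustrated with a small simulation. The sketch below is not taken from the tutorial: the data-generating process, the coefficients, and the helper `mean_where` are all hypothetical, and the ratio (Wald) estimator shown is only one simple instrumental variables estimator, valid here because the simulated treatment effect is constant. The instrument `z` is generated independently of the unmeasured confounder `u`, shifts the treatment `d`, and enters the outcome `y` only through `d`.

```python
import random
from statistics import fmean

random.seed(0)
n = 100_000
true_effect = 2.0  # hypothetical constant treatment effect

z, d, y = [], [], []
for _ in range(n):
    u = random.gauss(0, 1)        # unmeasured confounder
    zi = random.randint(0, 1)     # instrument, independent of u
    # Treatment depends on the instrument and on the confounder.
    di = 1 if 1.5 * zi + u + random.gauss(0, 1) > 0.75 else 0
    # Outcome depends on treatment and confounder; z enters only through d.
    yi = true_effect * di + 3.0 * u + random.gauss(0, 1)
    z.append(zi)
    d.append(di)
    y.append(yi)


def mean_where(values, flags, level):
    """Mean of `values` over the units whose flag equals `level`."""
    return fmean(v for v, f in zip(values, flags) if f == level)


# Naive treated-vs-untreated contrast: biased by the confounder u.
naive = mean_where(y, d, 1) - mean_where(y, d, 0)

# Wald estimator: effect of z on y divided by effect of z on d.
wald = (mean_where(y, z, 1) - mean_where(y, z, 0)) / (
    mean_where(d, z, 1) - mean_where(d, z, 0)
)

print(f"true effect = {true_effect}")
print(f"naive difference in means = {naive:.2f}")
print(f"Wald IV estimate = {wald:.2f}")
```

    In this simulated setting the naive contrast overstates the effect because treated units have systematically higher values of the confounder, while the Wald ratio stays close to the true effect. With a weak instrument the denominator approaches zero and the estimate becomes unstable.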

  • Near/far matching: a study design approach to instrumental variables. Health services & outcomes research methodology Baiocchi, M., Small, D. S., Yang, L., Polsky, D., Groeneveld, P. W. 2012; 12 (4): 237-253


    Classic instrumental variable techniques involve the use of structural equation modeling or other forms of parameterized modeling. In this paper we use a nonparametric, matching-based instrumental variable methodology that is based on a study design approach. Similar to propensity score matching, though unlike classic instrumental variable approaches, near/far matching is capable of estimating causal effects when the outcome is not continuous. Unlike propensity score matching, though similar to instrumental variable techniques, near/far matching is also capable of estimating causal effects even when unmeasured covariates produce selection bias. We illustrate near/far matching by using Medicare data to compare the effectiveness of carotid arterial stents with cerebral protection versus carotid endarterectomy for the treatment of carotid stenosis.

    View details for PubMedID 27087781

  • The Differential Impact of Delivery Hospital on the Outcomes of Premature Infants PEDIATRICS Lorch, S. A., Baiocchi, M., Ahlberg, C. E., Small, D. S. 2012; 130 (2): 270-278


    Because greater percentages of women deliver at hospitals without high-level NICUs, there is little information on the effect of delivery hospital on the outcomes of premature infants in the past 2 decades, or how these effects differ across states with different perinatal regionalization systems. A retrospective population-based cohort study was constructed of all hospital-based deliveries in Pennsylvania and California between 1995 and 2005 and Missouri between 1995 and 2003 with a gestational age between 23 and 37 weeks (N = 1328132). The effect of delivery at a high-level NICU on in-hospital death and 5 complications of premature birth was calculated by using an instrumental variables approach to control for measured and unmeasured differences between hospitals. Infants who were delivered at a high-level NICU had significantly fewer in-hospital deaths in Pennsylvania (7.8 fewer deaths/1000 deliveries, 95% confidence interval [CI] 4.1-11.5), California (2.7 fewer deaths/1000 deliveries, 95% CI 0.9-4.5), and Missouri (12.6 fewer deaths/1000 deliveries, 95% CI 2.6-22.6). Deliveries at high-level NICUs had similar rates of most complications, with the exception of lower bronchopulmonary dysplasia rates at Missouri high-level NICUs (9.5 fewer cases/1000 deliveries, 95% CI 0.7-18.4) and higher infection rates at high-level NICUs in Pennsylvania and California. The association between delivery hospital, in-hospital mortality, and complications differed across the 3 states. There is benefit to neonatal outcomes when high-risk infants are delivered at high-level NICUs that is larger than previously reported, although the effects differ between states, which may be attributable to different methods of regionalization.

    View details for DOI 10.1542/peds.2011-2820

    View details for Web of Science ID 000307123000047

    View details for PubMedID 22778301