Bio


I obtained my PhD in genetic epidemiology at Queensland University of Technology (Australia), where my research focused on using genetic and genomic approaches to identify risk factors for endometrial cancer. During my graduate studies, I gained experience in large-scale genetic association studies and in leveraging the correlation between diseases to identify novel genetic variants associated with endometrial cancer. I also developed expertise in statistical genetic approaches for multi-omics data, including fine-mapping and colocalization analyses, to prioritize candidate causal variants and genes, and gained extensive experience in genetic causal inference analyses of the relationships between risk factors and health outcomes.

My research focus since moving to Stanford has been the identification of genetic and non-genetic determinants of cardiometabolic diseases. I am currently involved in projects including large-scale genetic association studies, multi-trait analysis with correlated traits, development and validation of polygenic risk scores, integrative analyses with multi-omics data, as well as Mendelian randomization analyses to advance our understanding of the genetic and environmental factors that contribute to cardiometabolic diseases.

Honors & Awards


  • 2021 QUT Outstanding Doctoral Thesis Award, Queensland University of Technology (Australia) (2022)
  • QIMR Berghofer PhD Top-up Scholarship, QIMR Berghofer Medical Research Institute, Australia (2018-2020)
  • Biomed Link Travel Grant, University of Melbourne, Australia (2018)
  • Australian Government Research Training Program (RTP) Scholarship, Australian Government (2017-2020)
  • QIMR Berghofer Masters (Coursework) Scholarship, QIMR Berghofer Medical Research Institute, Australia (2015)

All Publications


  • Plasma proteomics and carotid intima-media thickness in the UK biobank cohort. Frontiers in cardiovascular medicine Chen, M. L., Kho, P. F., Guarischi-Sousa, R., Zhou, J., Panyard, D. J., Azizi, Z., Gupte, T., Watson, K., Abbasi, F., Assimes, T. L. 2024; 11: 1478600

    Abstract

    Ultrasound derived carotid intima-media thickness (cIMT) is valuable for cardiovascular risk stratification. We assessed the relative importance of traditional atherosclerosis risk factors and plasma proteins in predicting cIMT measured nearly a decade later. We examined 6,136 UK Biobank participants with 1,461 proteins profiled using the proximity extension assay applied to their baseline blood draw who subsequently underwent a cIMT measurement. We implemented linear regression, stepwise Akaike Information Criterion-based, and the least absolute shrinkage and selection operator (LASSO) models to identify potential proteomic as well as non-proteomic predictors. We evaluated our model performance using the proportion variance explained (R²). The mean time from baseline assessment to cIMT measurement was 9.2 years. Age, blood pressure, and anthropometric related variables were the strongest predictors of cIMT with fat-free mass index of the truncal region being the strongest predictor among adiposity measurements. A LASSO model incorporating variables including age, assessment center, genetic risk factors, smoking, blood pressure, trunk fat-free mass index, apolipoprotein B, and Townsend deprivation index combined with 97 proteins achieved the highest R² (0.308, 95% C.I. 0.274, 0.341). In contrast, models built with proteins alone or non-proteomic variables alone explained a notably lower R² (0.261, 0.228-0.294 and 0.260, 0.226-0.293, respectively). Chromogranin b (CHGB), Cystatin-M/E (CST6), leptin (LEP), and prolargin (PRELP) were the proteins consistently selected across all models. Plasma proteins add to the clinical and genetic risk factors in predicting a cIMT measurement. Our findings implicate blood pressure and extracellular matrix-related proteins in cIMT pathophysiology.

    View details for DOI 10.3389/fcvm.2024.1478600

    View details for PubMedID 39416432

    View details for PubMedCentralID PMC11480011

  • A plasma proteomic signature for atherosclerotic cardiovascular disease risk prediction in the UK Biobank cohort. medRxiv : the preprint server for health sciences Gupte, T. P., Azizi, Z., Kho, P. F., Zhou, J., Chen, M., Panyard, D. J., Guarischi-Sousa, R., Hilliard, A. T., Sharma, D., Watson, K., Abbasi, F., Tsao, P. S., Clarke, S. L., Assimes, T. L. 2024

    Abstract

    Background: While risk stratification for atherosclerotic cardiovascular disease (ASCVD) is essential for primary prevention, current clinical risk algorithms demonstrate variability and leave room for further improvement. The plasma proteome holds promise as a future diagnostic and prognostic tool that can accurately reflect complex human traits and disease processes. We assessed the ability of plasma proteins to predict ASCVD. Method: Clinical, genetic, and high-throughput plasma proteomic data were analyzed for association with ASCVD in a cohort of 41,650 UK Biobank participants. Selected features for analysis included clinical variables such as a UK-based cardiovascular clinical risk score (QRISK3) and lipid levels, 36 polygenic risk scores (PRSs), and Olink protein expression data of 2,920 proteins. We used least absolute shrinkage and selection operator (LASSO) regression to select features and compared area under the curve (AUC) statistics between data types. Randomized LASSO regression with a stability selection algorithm identified a smaller set of more robustly associated proteins. The benefit of plasma proteins over standard clinical variables, the QRISK3 score, and PRSs was evaluated through the derivation of Δ AUC values. We also assessed the incremental gain in model performance using proteomic datasets with varying numbers of proteins. To identify potential causal proteins for ASCVD, we conducted a two-sample Mendelian randomization (MR) analysis. Result: The mean age of our cohort was 56.0 years, 60.3% were female, and 9.8% developed incident ASCVD over a median follow-up of 6.9 years. A protein-only LASSO model selected 294 proteins and returned an AUC of 0.723 (95% CI 0.708-0.737). A clinical variable and PRS-only LASSO model selected 4 clinical variables and 20 PRSs and achieved an AUC of 0.726 (95% CI 0.712-0.741). The addition of the full proteomic dataset to clinical variables and PRSs resulted in a Δ AUC of 0.010 (95% CI 0.003-0.018). Fifteen proteins selected by a stability selection algorithm offered improvement in ASCVD prediction over the QRISK3 risk score [Δ AUC: 0.013 (95% CI 0.005-0.021)]. Filtered and clustered versions of the full proteomic dataset (consisting of 600-1,500 proteins) performed comparably to the full dataset for ASCVD prediction. Using MR, we identified 11 proteins as potentially causal for ASCVD. Conclusion: A plasma proteomic signature performs well for incident ASCVD prediction but only modestly improves prediction over clinical and genetic factors. Further studies are warranted to better elucidate the clinical utility of this signature in predicting the risk of ASCVD over the standard practice of using the QRISK3 score.

    View details for DOI 10.1101/2024.09.13.24313652

    View details for PubMedID 39314942

  • Plasma proteomic signatures for type 2 diabetes mellitus and related traits in the UK Biobank cohort. medRxiv : the preprint server for health sciences Gupte, T. P., Azizi, Z., Kho, P. F., Zhou, J., Nzenkue, K., Chen, M., Panyard, D. J., Guarischi-Sousa, R., Hilliard, A. T., Sharma, D., Watson, K., Abbasi, F., Tsao, P. S., Clarke, S. L., Assimes, T. L. 2024

    Abstract

    Aims/hypothesis: The plasma proteome holds promise as a diagnostic and prognostic tool that can accurately reflect complex human traits and disease processes. We assessed the ability of plasma proteins to predict type 2 diabetes mellitus (T2DM) and related traits. Methods: Clinical, genetic, and high-throughput proteomic data from three subcohorts of UK Biobank participants were analyzed for association with dual-energy x-ray absorptiometry (DXA) derived truncal fat (in the adiposity subcohort), estimated maximum oxygen consumption (VO₂ max) (in the fitness subcohort), and incident T2DM (in the T2DM subcohort). We used least absolute shrinkage and selection operator (LASSO) regression to assess the relative ability of non-proteomic and proteomic variables to associate with each trait by comparing variance explained (R²) and area under the curve (AUC) statistics between data types. Stability selection with randomized LASSO regression identified the most robustly associated proteins for each trait. The benefit of proteomic signatures (PSs) over QDiabetes, a T2DM clinical risk score, was evaluated through the derivation of delta (Δ) AUC values. We also assessed the incremental gain in model performance metrics using proteomic datasets with varying numbers of proteins. A series of two-sample Mendelian randomization (MR) analyses were conducted to identify potentially causal proteins for adiposity, fitness, and T2DM. Results: Across all three subcohorts, the mean age was 56.7 years and 54.9% were female. In the T2DM subcohort, 5.8% developed incident T2DM over a median follow-up of 7.6 years. LASSO-derived PSs increased the R² of truncal fat and VO₂ max over clinical and genetic factors by 0.074 and 0.057, respectively. We observed a similar improvement in T2DM prediction over the QDiabetes score [Δ AUC: 0.016 (95% CI 0.008, 0.024)] when using a robust PS derived strictly from the T2DM outcome versus a model further augmented with non-overlapping proteins associated with adiposity and fitness. A small number of proteins (29 for truncal adiposity, 18 for VO₂ max, and 26 for T2DM) identified by stability selection algorithms offered most of the improvement in prediction of each outcome. Filtered and clustered versions of the full proteomic dataset supplied by the UK Biobank (ranging between 600-1,500 proteins) performed comparably to the full dataset for T2DM prediction. Using MR, we identified 4 proteins as potentially causal for adiposity, 1 as potentially causal for fitness, and 4 as potentially causal for T2DM. Conclusions/Interpretation: Plasma PSs modestly improve the prediction of incident T2DM over that possible with clinical and genetic factors. Further studies are warranted to better elucidate the clinical utility of these signatures in predicting the risk of T2DM over the standard practice of using the QDiabetes score. Candidate causally associated proteins identified through MR deserve further study as potential novel therapeutic targets for T2DM.

    View details for DOI 10.1101/2024.09.13.24313501

    View details for PubMedID 39314935

  • Associations between accurate measures of adiposity and fitness, blood proteins, and insulin sensitivity among South Asians and Europeans. medRxiv : the preprint server for health sciences Kho, P. F., Stell, L., Jimenez, S., Zanetti, D., Panyard, D. J., Watson, K. L., Sarraju, A., Chen, M. L., Lind, L., Petrie, J. R., Chan, K. N., Fonda, H., Kent, K., Myers, J. N., Palaniappan, L., Abbasi, F., Assimes, T. L. 2024

    Abstract

    South Asians (SAs) may possess a unique predisposition to insulin resistance (IR). We explored this possibility by investigating the relationship between 'gold standard' measures of adiposity, fitness, selected proteomic biomarkers, and insulin sensitivity among a cohort of SAs and Europeans (EURs). A total of 46 SAs and 41 EURs completed 'conventional' (lifestyle questionnaires, standard physical exam) as well as 'gold standard' (dual energy X-ray absorptiometry scan, cardiopulmonary exercise test, and insulin suppression test) assessments of adiposity, fitness, and insulin sensitivity. In a subset of 28 SAs and 36 EURs, we also measured the blood-levels of eleven IR-related proteins. We conducted Spearman correlation to identify correlates of steady-state plasma glucose (SSPG) derived from the insulin suppression test, followed by multivariable linear regression analyses of SSPG, adjusting for age, sex and ancestral group. Sixteen of 30 measures significantly associated with SSPG, including one conventional and eight gold standard measures of adiposity, one conventional and one gold standard measure of fitness, and five proteins. Multivariable regressions revealed that gold standard measures and plasma proteins attenuated ancestral group differences in IR, suggesting their potential utility in assessing IR, especially among SAs. Ancestral group differences in IR may be explained by accurate measures of adiposity and fitness, with specific proteins possibly serving as useful surrogates for these measures, particularly for SAs.

    View details for DOI 10.1101/2024.09.06.24313199

    View details for PubMedID 39281745

    View details for PubMedCentralID PMC11398600

  • CXCL12 regulates coronary artery dominance in diverse populations and links development to disease. medRxiv : the preprint server for health sciences Rios Coronado, P. E., Zanetti, D., Zhou, J., Naftaly, J. A., Prabala, P., Kho, P. F., Martínez Jaimes, A. M., Hilliard, A. T., Pyarajan, S., Dochtermann, D., Chang, K. M., Winn, V. D., Pașca, A. M., Plomondon, M. E., Waldo, S. W., Tsao, P. S., Clarke, S. L., Red-Horse, K., Assimes, T. L. 2023

    Abstract

    Mammalian cardiac muscle is supplied with blood by right and left coronary arteries that form branches covering both ventricles of the heart. Whether branches of the right or left coronary arteries wrap around to the inferior side of the left ventricle is variable in humans and termed right or left dominance. Coronary dominance is likely a heritable trait, but its genetic architecture has never been explored. Here, we present the first large-scale multi-ancestry genome-wide association study of dominance in 61,043 participants of the VA Million Veteran Program, including over 10,300 Africans and 4,400 Admixed Americans. Dominance was moderately heritable with ten loci reaching genome-wide significance. The most significant mapped to the chemokine CXCL12 in both Europeans and Africans. Whole-organ imaging of human fetal hearts revealed that dominance is established during development in locations where CXCL12 is expressed. In mice, dominance involved the septal coronary artery, and its patterning was altered with Cxcl12 deficiency. Finally, we linked human dominance patterns with coronary artery disease through colocalization, genome-wide genetic correlation and Mendelian randomization analyses. Together, our data support CXCL12 as a primary determinant of coronary artery dominance in humans of diverse backgrounds and suggest that developmental patterning of arteries may influence one's susceptibility to ischemic heart disease.

    View details for DOI 10.1101/2023.10.27.23297507

    View details for PubMedID 37961706

    View details for PubMedCentralID PMC10635223

  • Proteomic analysis of 92 circulating proteins and their effects in cardiometabolic diseases. Clinical proteomics Carland, C., Png, G., Malarstig, A., Kho, P. F., Gustafsson, S., Michaelsson, K., Lind, L., Tsafantakis, E., Karaleftheri, M., Dedoussis, G., Ramisch, A., Macdonald-Dunlop, E., Klaric, L., Joshi, P. K., Chen, Y., Björck, H. M., Eriksson, P., Carrasco-Zanini, J., Wheeler, E., Suhre, K., Gilly, A., Zeggini, E., Viñuela, A., Dermitzakis, E. T., Wilson, J. F., Langenberg, C., Thareja, G., Halama, A., Schmidt, F., Zanetti, D., Assimes, T. 2023; 20 (1): 31

    Abstract

    Human plasma contains a wide variety of circulating proteins. These proteins can be important clinical biomarkers in disease and also possible drug targets. Large scale genomics studies of circulating proteins can identify genetic variants that lead to relative protein abundance. We conducted a meta-analysis on genome-wide association studies of autosomal chromosomes in 22,997 individuals of primarily European ancestry across 12 cohorts to identify protein quantitative trait loci (pQTL) for 92 cardiometabolic associated plasma proteins. We identified 503 (337 cis and 166 trans) conditionally independent pQTLs, including several novel variants not reported in the literature. We conducted a sex-stratified analysis and found that 118 (23.5%) of pQTLs demonstrated heterogeneity between sexes. The direction of effect was preserved but there were differences in effect size and significance. Additionally, we annotate trans-pQTLs with nearest genes and report plausible biological relationships. Using Mendelian randomization, we identified causal associations for 18 proteins across 19 phenotypes, of which 10 have additional genetic colocalization evidence. We highlight proteins associated with a constellation of cardiometabolic traits including angiopoietin-related protein 7 (ANGPTL7) and Semaphorin 3F (SEMA3F). Through large-scale analysis of protein quantitative trait loci, we provide a comprehensive overview of common variants associated with plasma proteins. We highlight possible biological relationships which may serve as a basis for further investigation into possible causal roles in cardiometabolic diseases.

    View details for DOI 10.1186/s12014-023-09421-0

    View details for PubMedID 37550624

    View details for PubMedCentralID PMC10405520

  • Contemporary Polygenic Scores of Low-Density Lipoprotein Cholesterol and Coronary Artery Disease Predict Coronary Atherosclerosis in Adolescents and Young Adults. Circulation. Genomic and precision medicine Guarischi-Sousa, R., Salfati, E., Kho, P. F., Iyer, K. R., Hilliard, A. T., Herrington, D. M., Tsao, P. S., Clarke, S. L., Assimes, T. L. 2023: e004047

    View details for DOI 10.1161/CIRCGEN.122.004047

    View details for PubMedID 37409455

  • Genetic impact of blood C-reactive protein levels on chronic spinal & widespread pain. European spine journal : official publication of the European Spine Society, the European Spinal Deformity Society, and the European Section of the Cervical Spine Research Society Farrell, S. F., Sterling, M., Klyne, D. M., Mustafa, S., Campos, A. I., Kho, P. F., Lundberg, M., Rentería, M. E., Ngo, T. T., Cuéllar-Partida, G. 2023

    Abstract

    Causal mechanisms underlying systemic inflammation in spinal & widespread pain remain an intractable experimental challenge. Here we examined whether: (i) associations between blood C-reactive protein (CRP) and chronic back, neck/shoulder & widespread pain can be explained by shared underlying genetic variants; and (ii) higher CRP levels causally contribute to these conditions. Using genome-wide association studies (GWAS) of chronic back, neck/shoulder & widespread pain (N = 6063-79,089 cases; N = 239,125 controls) and GWAS summary statistics for blood CRP (Pan-UK Biobank N = 400,094 & PAGE consortium N = 28,520), we employed cross-trait bivariate linkage disequilibrium score regression to determine genetic correlations (rG) between these chronic pain phenotypes and CRP levels (FDR < 5%). Latent causal variable (LCV) and generalised summary data-based Mendelian randomisation (GSMR) analyses examined putative causal associations between chronic pain & CRP (FDR < 5%). Higher CRP levels were genetically correlated with chronic back, neck/shoulder & widespread pain (rG range 0.26-0.36; P ≤ 8.07E-9; 3/6 trait pairs). Although genetic causal proportions (GCP) did not explain this finding (GCP range -0.32 to 0.08; P ≥ 0.02), GSMR demonstrated putative causal effects of higher CRP levels contributing to each pain type (beta range 0.027-0.166; P ≤ 9.82E-03; 3 trait pairs) as well as neck/shoulder pain effects on CRP levels (beta [S.E.] 0.030 [0.021]; P = 6.97E-04). This genetic evidence for higher CRP levels in chronic spinal (back, neck/shoulder) & widespread pain warrants further large-scale multimodal & prospective longitudinal studies to accelerate the identification of novel translational targets and more effective therapeutic strategies.

    View details for DOI 10.1007/s00586-023-07711-7

    View details for PubMedID 37069442

    View details for PubMedCentralID 8213433

  • Discovery of genomic loci associated with sleep apnoea risk through multi-trait GWAS analysis with snoring. Sleep Campos, A. I., Ingold, N., Huang, Y., Mitchell, B. L., Kho, P. F., Han, X., García-Marín, L. M., Ong, J. S., Law, M. H., Yokoyama, J. S., Martin, N. G., Dong, X., Cuellar-Partida, G., MacGregor, S., Aslibekyan, S., Rentería, M. E. 2022

    Abstract

    Despite its association with severe health conditions, the aetiology of sleep apnoea remains understudied. This study sought to identify genetic variants robustly associated with sleep apnoea risk. We performed a genome-wide association study (GWAS) meta-analysis of sleep apnoea across five cohorts (NTotal=523,366), followed by a multi-trait analysis of GWAS (MTAG) to boost power, leveraging the high genetic correlation between sleep apnoea and snoring. We then adjusted our results for the genetic effects of body mass index (BMI) using multi-trait-based conditional & joint analysis (mtCOJO) and sought replication of lead hits in a large cohort of participants from 23andMe, Inc (NTotal=1,477,352; Ncases=175,522). We also explored genetic correlations with other complex traits and performed a phenome-wide screen for causally associated phenotypes using the latent causal variable method. Our sleep apnoea meta-analysis identified five independent variants with evidence of association beyond genome-wide significance. After adjustment for BMI, only one genome-wide significant variant was identified. MTAG analyses uncovered 49 significant independent loci associated with sleep apnoea risk. Twenty-nine variants were replicated in the 23andMe GWAS adjusting for BMI. We observed genetic correlations with several complex traits, including multisite chronic pain, diabetes, eye disorders, high blood pressure, osteoarthritis, chronic obstructive pulmonary disease, and BMI-associated conditions. Our study uncovered multiple genetic loci associated with sleep apnoea risk, thus increasing our understanding of the aetiology of this condition and its relationship with other complex traits.

    View details for DOI 10.1093/sleep/zsac308

    View details for PubMedID 36525587

  • A shared genetic signature for common chronic pain conditions and its impact on biopsychosocial traits. The journal of pain Farrell, S. F., Kho, P., Lundberg, M., Campos, A. I., Renteria, M. E., de Zoete, R. M., Sterling, M., Ngo, T. T., Cuellar-Partida, G. 2022

    Abstract

    The multiple comorbidities & dimensions of chronic pain present a formidable challenge in disentangling its aetiology. Here, we performed genome-wide association studies of eight chronic pain types using UK Biobank data (N=4,037-79,089 cases; N=239,125 controls), followed by bivariate linkage disequilibrium-score regression and latent causal variable analyses to determine (respectively) their genetic correlations and genetic causal proportion (GCP) parameters with 1,492 other complex traits. We report evidence of a shared genetic signature across chronic pain types as their genetic correlations and GCP directions were broadly consistent across an array of biopsychosocial traits. Across 5,942 significant genetic correlations, 570 trait pairs could be explained by a causal association (|GCP| > 0.6; 5% false discovery rate), including 82 traits affected by pain while 410 contributed to an increased risk of chronic pain (cf. 78 with a decreased risk) such as certain somatic pathologies (e.g., musculoskeletal), psychiatric traits (e.g., depression), socioeconomic factors (e.g., occupation) and medical comorbidities (e.g., cardiovascular disease). This data-driven phenome-wide association analysis has demonstrated a novel and efficient strategy for identifying genetically supported risk & protective traits to enhance the design of interventional trials targeting underlying causal factors and accelerate the development of more effective treatments with broader clinical utility. PERSPECTIVE: Through large-scale phenome-wide association analyses of >1,400 biopsychosocial traits, this article provides evidence for a shared genetic signature across eight common chronic pain types. It lays the foundation for further translational studies focused on identifying causal genetic variants and pathophysiological pathways to develop novel diagnostic & therapeutic technologies and strategies.

    View details for DOI 10.1016/j.jpain.2022.10.005

    View details for PubMedID 36252619

  • Dehydroepiandrosterone Sulfate and Colorectal Cancer Risk: A Mendelian Randomization Analysis. Twin research and human genetics : the official journal of the International Society for Twin Studies Jayarathna, D. K., Renteria, M. E., Kho, P. F., Batra, J., Gandhi, N. S. 2022: 1-7

    Abstract

    Colorectal cancer is the third most common and second most deadly type of cancer worldwide, with approximately 1.9 million cases and 0.9 million deaths in 2020. Previous studies have shown that estrogen and testosterone hormones are associated with colorectal cancer risk and mortality. However, the potential effect of their precursor, dehydroepiandrosterone sulfate (DHEAS), on colorectal cancer risk has not been investigated. Therefore, evaluating DHEAS's effect on colorectal cancer will expand our understanding of the hormonal contribution to colorectal cancer risk. In this study, we conducted a two-sample Mendelian randomization (MR) analysis to investigate the causal effect of DHEAS on colorectal cancer. We obtained DHEAS and colorectal cancer genome-wide association study (GWAS) summary statistics from the Leipzig Health Atlas and the GWAS catalog and conducted MR analyses using the TwoSampleMR R package. Our results suggest that higher DHEAS levels are causally associated with decreased colorectal cancer risk (odds ratio per unit increase in DHEAS levels z score = 0.70; 95% confidence interval [0.51, 0.96]), which is in line with previous observations in a case-control study of colon cancer. The outcome of this study will be beneficial in developing plasma DHEAS-based biomarkers in colorectal cancer. Further studies should be conducted to interpret the DHEAS-colorectal cancer association among different ancestries and populations.

    View details for DOI 10.1017/thg.2022.31

    View details for PubMedID 36053043