Boards, Advisory Committees, Professional Organizations


  • Reviewer, Journal of the American Medical Informatics Association (2024 - Present)
  • Editorial Board: Academic Editor, PeerJ Computer Science (2024 - Present)
  • Trusted Reviewer Board, Health and Quality of Life Outcomes (2024 - Present)
  • Reviewer, npj Digital Medicine (2024 - Present)
  • Reviewer, Circulation: Genomic and Precision Medicine (2023 - Present)
  • Reviewer, BMC Journals (2023 - Present)
  • Reviewer, Scientific Reports (2023 - Present)
  • Reviewer, Journal of Orthopaedic Surgery and Research (2023 - Present)
  • Reviewer, European Journal of Medical Research (2023 - Present)
  • Reviewer, Journal of Cancer Research and Clinical Oncology (2023 - Present)
  • Reviewer, Frontiers in Journals (2023 - Present)
  • Reviewer, Analytical Cellular Pathology (2022 - Present)
  • Reviewer, Evidence-Based Complementary and Alternative Medicine (2022 - Present)

Professional Education


  • Doctor of Philosophy, The Pennsylvania State University, Pathobiology (Bioinformatics and Human Genetics) (2023)
  • Master of Applied Statistics, The Pennsylvania State University, Applied Statistics (2022)
  • Bachelor of Science, The Pennsylvania State University, Biochemistry and Molecular Biology (2018)
  • Bachelor of Science, The Pennsylvania State University, Immunology and Infectious Diseases (2018)

All Publications


  • Large Language Models in Biomedical and Health Informatics: A Review with Bibliometric Analysis. Journal of healthcare informatics research Yu, H., Fan, L., Li, L., Zhou, J., Ma, Z., Xian, L., Hua, W., He, S., Jin, M., Zhang, Y., Gandhi, A., Ma, X. 2024; 8 (4): 658-711

    Abstract

    Large language models (LLMs) have rapidly become important tools in Biomedical and Health Informatics (BHI), potentially enabling new ways to analyze data, treat patients, and conduct research. This study aims to provide a comprehensive overview of LLM applications in BHI, highlighting their transformative potential and addressing the associated ethical and practical challenges. We reviewed 1698 research articles from January 2022 to December 2023, categorizing them by research themes and diagnostic categories. Additionally, we conducted network analysis to map scholarly collaborations and research dynamics. Our findings reveal a substantial increase in the potential applications of LLMs to a variety of BHI tasks, including clinical decision support, patient interaction, and medical document analysis. Notably, LLMs are expected to be instrumental in enhancing the accuracy of diagnostic tools and patient care protocols. The network analysis highlights dense and dynamically evolving collaborations across institutions, underscoring the interdisciplinary nature of LLM research in BHI. A significant trend was the application of LLMs in managing specific disease categories, such as mental health and neurological disorders, demonstrating their potential to influence personalized medicine and public health strategies. LLMs hold promising potential to further transform biomedical research and healthcare delivery. While promising, the ethical implications and challenges of model validation call for rigorous scrutiny to optimize their benefits in clinical settings. This survey serves as a resource for stakeholders in healthcare, including researchers, clinicians, and policymakers, to understand the current state and future potential of LLMs in BHI.

    View details for DOI 10.1007/s41666-024-00171-8

    View details for PubMedID 39463859

    View details for PubMedCentralID PMC11499577

  • Insights into Metabolic Reprogramming in Tumor Evolution and Therapy. Cancers Chiu, C. F., Guerrero, J. J., Regalado, R. R., Zamora, M. J., Zhou, J., Notarte, K. I., Lu, Y. W., Encarnacion, P. C., Carles, C. D., Octavo, E. M., Limbaroc, D. C., Saengboonmee, C., Huang, S. Y. 2024; 16 (20)

    Abstract

    Background: Cancer remains a global health challenge, characterized not just by uncontrolled cell proliferation but also by the complex metabolic reprogramming that underlies its development and progression. Objectives: This review delves into the intricate relationship between cancer and its metabolic alterations, drawing an innovative comparison with the cosmological concepts of dark matter and dark energy to highlight the pivotal yet often overlooked role of metabolic reprogramming in tumor evolution. Methods: It scrutinizes the Warburg effect and other metabolic adaptations, such as shifts in lipid synthesis, amino acid turnover, and mitochondrial function, driven by mutations in key regulatory genes. Results: This review emphasizes the significance of targeting these metabolic pathways for therapeutic intervention, outlining the potential to disrupt cancer's energy supply and signaling mechanisms. It calls for an interdisciplinary research approach to fully understand and exploit the intricacies of cancer metabolism, pointing toward metabolic reprogramming as a promising frontier for developing more effective cancer treatments. Conclusion: By equating cancer's metabolic complexity with the enigmatic nature of dark matter and energy, this review underscores the critical need for innovative strategies in oncology, highlighting the importance of unveiling and targeting the "dark energy" within cancer cells to revolutionize future therapy and research.

    View details for DOI 10.3390/cancers16203513

    View details for PubMedID 39456607

    View details for PubMedCentralID PMC11506062

  • PAGER: A novel genotype encoding strategy for modeling deviations from additivity in complex trait association studies. BioData mining Freda, P. J., Ghosh, A., Bhandary, P., Matsumoto, N., Chitre, A. S., Zhou, J., Hall, M. A., Palmer, A. A., Obafemi-Ajayi, T., Moore, J. H. 2024; 17 (1): 41

    Abstract

    The additive model of inheritance assumes that heterozygotes (Aa) are exactly intermediate in respect to homozygotes (AA and aa). While this model is commonly used in single-locus genetic association studies, significant deviations from additivity are well-documented and contribute to phenotypic variance across many traits and systems. This assumption can introduce type I and type II errors by overestimating or underestimating the effects of variants that deviate from additivity. Alternative genotype encoding strategies have been explored to account for different inheritance patterns, but they often incur significant computational or methodological costs. To address these challenges, we introduce PAGER (Phenotype Adjusted Genotype Encoding and Ranking), an efficient pre-processing method that encodes each genetic variant based on normalized mean phenotypic differences between diallelic genotype classes (AA, Aa, and aa). This approach more accurately reflects each variant's true inheritance model, improving model precision while minimizing the costs associated with alternative encoding strategies. Through extensive benchmarking on SNPs simulated with both binary and continuous phenotypes, we demonstrate that PAGER accurately represents various inheritance patterns (including additive, dominant, recessive, and heterosis), achieves levels of statistical power that meet or exceed other encoding strategies, and attains computation speeds up to 55 times faster than a similar method, EDGE. We also apply PAGER to publicly available real-world data and identify a novel, relevant putative QTL associated with body mass index in rats (Rattus norvegicus) that is not detected with the additive model. Overall, we show that PAGER is an efficient genotype encoding approach that can uncover sources of missing heritability and reveal novel insights in the study of complex traits while incurring minimal costs.

    View details for DOI 10.1186/s13040-024-00393-x

    View details for PubMedID 39394173

    View details for PubMedCentralID PMC11468469

  • Per- and poly-fluoroalkyl substances (PFAS) accelerate biological aging mediated by increased C-reactive protein. Journal of hazardous materials Zhao, Z., Zhou, J., Shi, A., Wang, J., Li, H., Yin, X., Gao, J., Wu, Y., Li, J., Sun, Y. X., Yan, H., Li, Y., Chen, G. 2024; 480: 136090

    Abstract

    Unhealthy biological aging is related to higher incidence of varied age-related diseases, even higher all-cause mortality. A previous small-sample study suggested that per- and poly-fluoroalkyl substances (PFAS) were associated with biological aging, but the evidence of exposure-response relationships, potential effect modifiers, and potential mediators was not investigated. Therefore, we conducted a cross-sectional analysis of a national study including 14,865 adults in the US from 8 survey cycles of NHANES from 2003 to 2018, to investigate the associations of PFAS compounds in body serum, including perfluorooctanoic acid (PFOA), perfluorooctane sulfonic acid (PFOS), perfluorononanoic acid (PFNA), and perfluorohexane sulfonic acid (PFHxS), with biological aging. Generalized linear models showed that higher human exposure to PFAS was associated with accelerated biological aging. Importantly, human exposure to PFOA, PFOS, PFNA, and PFHxS at detected levels (above 0.10 ng/mL) was associated with an average of 3.3 years (95% CI: 2.7, 3.9, P < 0.001), 14.9 years (95% CI: 7.2, 22.7, P < 0.001), 10.9 years (95% CI: 3.9, 17.7, P < 0.001), and 8.8 years (95% CI: 4.8, 12.9, P < 0.001) of biological aging acceleration, respectively. Cubic spline models indicated exposure-response relationships where there was no safe threshold of PFAS level regarding harms to human healthy aging. The weighted sum regression model found significant associations of the PFAS compound mixture with biological aging acceleration, and PFOA was the dominant contributor among the 4 PFAS compounds. Mediation analysis suggested that C-reactive protein, one of the inflammation biomarkers, might act as a mediator in PFAS-induced accelerated biological aging, but not the triglyceride-glucose index. In summary, our study suggests that the effects of PFAS on biological aging acceleration should be of concern and more action plans to address their negative impact on human health should be launched.

    View details for DOI 10.1016/j.jhazmat.2024.136090

    View details for PubMedID 39405719

  • Plasma proteomics and carotid intima-media thickness in the UK biobank cohort. Frontiers in cardiovascular medicine Chen, M. L., Kho, P. F., Guarischi-Sousa, R., Zhou, J., Panyard, D. J., Azizi, Z., Gupte, T., Watson, K., Abbasi, F., Assimes, T. L. 2024; 11: 1478600

    Abstract

    Ultrasound derived carotid intima-media thickness (cIMT) is valuable for cardiovascular risk stratification. We assessed the relative importance of traditional atherosclerosis risk factors and plasma proteins in predicting cIMT measured nearly a decade later. We examined 6,136 UK Biobank participants with 1,461 proteins profiled using the proximity extension assay applied to their baseline blood draw who subsequently underwent a cIMT measurement. We implemented linear regression, stepwise Akaike Information Criterion-based, and the least absolute shrinkage and selection operator (LASSO) models to identify potential proteomic as well as non-proteomic predictors. We evaluated our model performance using the proportion variance explained (R²). The mean time from baseline assessment to cIMT measurement was 9.2 years. Age, blood pressure, and anthropometric related variables were the strongest predictors of cIMT with fat-free mass index of the truncal region being the strongest predictor among adiposity measurements. A LASSO model incorporating variables including age, assessment center, genetic risk factors, smoking, blood pressure, trunk fat-free mass index, apolipoprotein B, and Townsend deprivation index combined with 97 proteins achieved the highest R² (0.308, 95% C.I. 0.274, 0.341). In contrast, models built with proteins alone or non-proteomic variables alone explained a notably lower R² (0.261, 0.228-0.294 and 0.260, 0.226-0.293, respectively). Chromogranin b (CHGB), Cystatin-M/E (CST6), leptin (LEP), and prolargin (PRELP) were the proteins consistently selected across all models. Plasma proteins add to the clinical and genetic risk factors in predicting a cIMT measurement. Our findings implicate blood pressure and extracellular matrix-related proteins in cIMT pathophysiology.

    View details for DOI 10.3389/fcvm.2024.1478600

    View details for PubMedID 39416432

    View details for PubMedCentralID PMC11480011

  • A plasma proteomic signature for atherosclerotic cardiovascular disease risk prediction in the UK Biobank cohort. medRxiv : the preprint server for health sciences Gupte, T. P., Azizi, Z., Kho, P. F., Zhou, J., Chen, M., Panyard, D. J., Guarischi-Sousa, R., Hilliard, A. T., Sharma, D., Watson, K., Abbasi, F., Tsao, P. S., Clarke, S. L., Assimes, T. L. 2024

    Abstract

    Background: While risk stratification for atherosclerotic cardiovascular disease (ASCVD) is essential for primary prevention, current clinical risk algorithms demonstrate variability and leave room for further improvement. The plasma proteome holds promise as a future diagnostic and prognostic tool that can accurately reflect complex human traits and disease processes. We assessed the ability of plasma proteins to predict ASCVD. Method: Clinical, genetic, and high-throughput plasma proteomic data were analyzed for association with ASCVD in a cohort of 41,650 UK Biobank participants. Selected features for analysis included clinical variables such as a UK-based cardiovascular clinical risk score (QRISK3) and lipid levels, 36 polygenic risk scores (PRSs), and Olink protein expression data of 2,920 proteins. We used least absolute shrinkage and selection operator (LASSO) regression to select features and compared area under the curve (AUC) statistics between data types. Randomized LASSO regression with a stability selection algorithm identified a smaller set of more robustly associated proteins. The benefit of plasma proteins over standard clinical variables, the QRISK3 score, and PRSs was evaluated through the derivation of Delta AUC values. We also assessed the incremental gain in model performance using proteomic datasets with varying numbers of proteins. To identify potential causal proteins for ASCVD, we conducted a two-sample Mendelian randomization (MR) analysis. Result: The mean age of our cohort was 56.0 years, 60.3% were female, and 9.8% developed incident ASCVD over a median follow-up of 6.9 years. A protein-only LASSO model selected 294 proteins and returned an AUC of 0.723 (95% CI 0.708-0.737). A clinical variable and PRS-only LASSO model selected 4 clinical variables and 20 PRSs and achieved an AUC of 0.726 (95% CI 0.712-0.741). The addition of the full proteomic dataset to clinical variables and PRSs resulted in a Delta AUC of 0.010 (95% CI 0.003-0.018). Fifteen proteins selected by a stability selection algorithm offered improvement in ASCVD prediction over the QRISK3 risk score [Delta AUC: 0.013 (95% CI 0.005-0.021)]. Filtered and clustered versions of the full proteomic dataset (consisting of 600-1,500 proteins) performed comparably to the full dataset for ASCVD prediction. Using MR, we identified 11 proteins as potentially causal for ASCVD. Conclusion: A plasma proteomic signature performs well for incident ASCVD prediction but only modestly improves prediction over clinical and genetic factors. Further studies are warranted to better elucidate the clinical utility of this signature in predicting the risk of ASCVD over the standard practice of using the QRISK3 score.

    View details for DOI 10.1101/2024.09.13.24313652

    View details for PubMedID 39314942

  • Plasma proteomic signatures for type 2 diabetes mellitus and related traits in the UK Biobank cohort. medRxiv : the preprint server for health sciences Gupte, T. P., Azizi, Z., Kho, P. F., Zhou, J., Nzenkue, K., Chen, M., Panyard, D. J., Guarischi-Sousa, R., Hilliard, A. T., Sharma, D., Watson, K., Abbasi, F., Tsao, P. S., Clarke, S. L., Assimes, T. L. 2024

    Abstract

    Aims/hypothesis: The plasma proteome holds promise as a diagnostic and prognostic tool that can accurately reflect complex human traits and disease processes. We assessed the ability of plasma proteins to predict type 2 diabetes mellitus (T2DM) and related traits. Methods: Clinical, genetic, and high-throughput proteomic data from three subcohorts of UK Biobank participants were analyzed for association with dual-energy x-ray absorptiometry (DXA) derived truncal fat (in the adiposity subcohort), estimated maximum oxygen consumption (VO2 max) (in the fitness subcohort), and incident T2DM (in the T2DM subcohort). We used least absolute shrinkage and selection operator (LASSO) regression to assess the relative ability of non-proteomic and proteomic variables to associate with each trait by comparing variance explained (R²) and area under the curve (AUC) statistics between data types. Stability selection with randomized LASSO regression identified the most robustly associated proteins for each trait. The benefit of proteomic signatures (PSs) over QDiabetes, a T2DM clinical risk score, was evaluated through the derivation of delta (Delta) AUC values. We also assessed the incremental gain in model performance metrics using proteomic datasets with varying numbers of proteins. A series of two-sample Mendelian randomization (MR) analyses were conducted to identify potentially causal proteins for adiposity, fitness, and T2DM. Results: Across all three subcohorts, the mean age was 56.7 years and 54.9% were female. In the T2DM subcohort, 5.8% developed incident T2DM over a median follow-up of 7.6 years. LASSO-derived PSs increased the R² of truncal fat and VO2 max over clinical and genetic factors by 0.074 and 0.057, respectively. We observed a similar improvement in T2DM prediction over the QDiabetes score [Delta AUC: 0.016 (95% CI 0.008, 0.024)] when using a robust PS derived strictly from the T2DM outcome versus a model further augmented with non-overlapping proteins associated with adiposity and fitness. A small number of proteins (29 for truncal adiposity, 18 for VO2 max, and 26 for T2DM) identified by stability selection algorithms offered most of the improvement in prediction of each outcome. Filtered and clustered versions of the full proteomic dataset supplied by the UK Biobank (ranging between 600-1,500 proteins) performed comparably to the full dataset for T2DM prediction. Using MR, we identified 4 proteins as potentially causal for adiposity, 1 as potentially causal for fitness, and 4 as potentially causal for T2DM. Conclusions/Interpretation: Plasma PSs modestly improve the prediction of incident T2DM over that possible with clinical and genetic factors. Further studies are warranted to better elucidate the clinical utility of these signatures in predicting the risk of T2DM over the standard practice of using the QDiabetes score. Candidate causally associated proteins identified through MR deserve further study as potential novel therapeutic targets for T2DM.

    View details for DOI 10.1101/2024.09.13.24313501

    View details for PubMedID 39314935

  • A novel temperature-controlled device with standardized manipulation improves chronic back pain mediated by modulating deep muscle thickness: A multicenter randomized controlled trial CLINICAL AND TRANSLATIONAL DISCOVERY Li, L., Wang, Y., Gao, Y., Liu, S., Yang, G., Lv, X., Sun, Y., Wu, Y., Li, J., Zhou, J., Chen, G. 2024; 4 (4)

    View details for DOI 10.1002/ctd2.330

    View details for Web of Science ID 001255534200001

  • The global clinical studies of long COVID. International journal of infectious diseases : IJID : official publication of the International Society for Infectious Diseases Ramonfaur, D., Ayad, N., Liu, P. H., Zhou, J., Wu, Y., Li, J., Chen, G. 2024: 107105

    Abstract

    Long COVID refers to the symptoms, signs, and conditions that persist after the initial phase of SARS-CoV-2 infection. The incidence of long COVID varies among regions (31% in North America, 44% in Europe, and 51% in Asia), which challenges healthcare systems, but there are limited guidelines for its treatment. With more nationwide projects funded by governments, such as the RECOVER initiative in the US and NIHR funding in the UK, an increasing number of ongoing clinical trials are investigating the efficacy of diverse therapies in reversing long COVID. After searching the WHO International Clinical Trial Registry Platform, 587 clinical studies were identified as long COVID studies. Among these, 312 studies (53.2%) are testing potential therapies. Most of the long COVID trials were conducted in the United States (58 trials [18.6%]), followed by India (55 trials [17.6%]) and Spain (20 trials [6.4%]). Interventions in these clinical trials include physical exercise, rehabilitation therapy, behavioral therapy, and pharmacological therapies including herbs, paxlovid, and fluvoxamine. These trials aim to address long COVID symptoms and signs including fatigue, decreased pulmonary function, reduced cognitive function, and others. To date, only 11 of these 312 studies have published their results, which unfortunately were not confirmatory. Future studies should be designed to address sleep disorders, which were seldom included in registered clinical studies. Moreover, interventions aimed at treating the underlying pathophysiology of long COVID are also necessary but currently lacking.

    View details for DOI 10.1016/j.ijid.2024.107105

    View details for PubMedID 38782355

  • Infusion reactions to adeno-associated virus (AAV)-based gene therapy: Mechanisms, diagnostics, treatment and review of the literature CLINICAL IMMUNOLOGY Catahay, J., Notarte, K., Macasaet, R., Liu, J., Velasco, J., Peligro, P., Vallo, J., Lahoti, L., Zhou, J., Henry, B. 2024; 262
  • PLASMA PROTEOMICS AND VISCERAL ADIPOSE TISSUE VOLUME: A MACHINE LEARNING ANALYSIS OF INTERACTION BETWEEN BIOMARKERS, SOCIO-BEHAVIORAL, AND FITNESS FACTORS IN UK BIOBANK Azizi, Z., Gupte, T., Kho, P., Nzenkue, K., Zhou, J., Guarischi-Sousa, R., Panyard, D., Chen, M., Abbasi, F., Clarke, S., Tsao, P., Assimes, T. L. ELSEVIER SCIENCE INC. 2024: 1699
  • Infusion reactions to adeno-associated virus (AAV)-based gene therapy: Mechanisms, diagnostics, treatment and review of the literature. Journal of medical virology Notarte, K. I., Catahay, J. A., Macasaet, R., Liu, J., Velasco, J. V., Peligro, P. J., Vallo, J., Goldrich, N., Lahoti, L., Zhou, J., Henry, B. M. 2023; 95 (12): e29305

    Abstract

    The use of adeno-associated virus (AAV) vectors in gene therapy has demonstrated great potential in treating genetic disorders. However, infusion-associated reactions (IARs) pose a significant challenge to the safety and efficacy of AAV-based gene therapy. This review provides a comprehensive summary of the current understanding of IARs to AAV therapy, including their underlying mechanisms, clinical presentation, and treatment options. Toll-like receptor activation and subsequent production of pro-inflammatory cytokines are associated with IARs, stimulating neutralizing antibodies (Nabs) and T-cell responses that interfere with gene therapy. Risk factors for IARs include high titers of pre-existing Nabs, previous exposure to AAV, and specific comorbidities. Clinical presentation ranges from mild flu-like symptoms to severe anaphylaxis and can occur during or after AAV administration. There are no established guidelines for pre- and postadministration tests for AAV therapies, and routine laboratory requests are not standardized. Treatment options include corticosteroids, plasmapheresis, and supportive medications such as antihistamines and acetaminophen, but there is no consensus on the route of administration, dosage, and duration. This review highlights the inadequacy of current treatment regimens for IARs and the need for further research to improve the safety and efficacy of AAV-based gene therapy.

    View details for DOI 10.1002/jmv.29305

    View details for PubMedID 38116715

  • CXCL12 regulates coronary artery dominance in diverse populations and links development to disease. medRxiv : the preprint server for health sciences Rios Coronado, P. E., Zanetti, D., Zhou, J., Naftaly, J. A., Prabala, P., Kho, P. F., Martínez Jaimes, A. M., Hilliard, A. T., Pyarajan, S., Dochtermann, D., Chang, K. M., Winn, V. D., Pașca, A. M., Plomondon, M. E., Waldo, S. W., Tsao, P. S., Clarke, S. L., Red-Horse, K., Assimes, T. L. 2023

    Abstract

    Mammalian cardiac muscle is supplied with blood by right and left coronary arteries that form branches covering both ventricles of the heart. Whether branches of the right or left coronary arteries wrap around to the inferior side of the left ventricle is variable in humans and termed right or left dominance. Coronary dominance is likely a heritable trait, but its genetic architecture has never been explored. Here, we present the first large-scale multi-ancestry genome-wide association study of dominance in 61,043 participants of the VA Million Veteran Program, including over 10,300 Africans and 4,400 Admixed Americans. Dominance was moderately heritable with ten loci reaching genome wide significance. The most significant mapped to the chemokine CXCL12 in both Europeans and Africans. Whole-organ imaging of human fetal hearts revealed that dominance is established during development in locations where CXCL12 is expressed. In mice, dominance involved the septal coronary artery, and its patterning was altered with Cxcl12 deficiency. Finally, we linked human dominance patterns with coronary artery disease through colocalization, genome-wide genetic correlation and Mendelian Randomization analyses. Together, our data supports CXCL12 as a primary determinant of coronary artery dominance in humans of diverse backgrounds and suggests that developmental patterning of arteries may influence one's susceptibility to ischemic heart disease.

    View details for DOI 10.1101/2023.10.27.23297507

    View details for PubMedID 37961706

    View details for PubMedCentralID PMC10635223

  • Heat-stone massage for patients with chronic musculoskeletal pain: a protocol for multicenter randomized controlled trial. Frontiers in medicine Li, L., Xi, Y., Wang, Y., Gao, Y., Lv, X., Liu, S., Yang, G., Qian, J., Yang, X., Ayad, N., Zhou, J., Sun, Y. X., Liu, J., Li, J., Chen, G. 2023; 10: 1215858

    Abstract

    Chronic musculoskeletal pain impairs the quality of life of approximately 1.71 billion people worldwide. Although pharmacological therapies play an important role in controlling chronic pain, overuse of opioids, persistent or recurrent symptoms, and pain-related disability burden still need to be addressed. Heat-stone massage uses heated stones to stimulate muscles and ligaments, followed by massage for relaxation, and can potentially treat chronic musculoskeletal pain. The efficacy and safety of heat-stone massage for patients with chronic musculoskeletal pain need to be determined. This multicenter, 2-arm, randomized, positive drug-controlled trial will include a total of 120 patients with chronic musculoskeletal pain. The intervention group will receive a 2-week heat-stone massage, 3 times per week, whereas the control group will receive the flurbiprofen plaster twice per day for 2 weeks. The primary end point is the change in Global Pain Scale from baseline to the end of the 2-week intervention. The secondary outcomes include pain severity (Numerical Rating Scale), pain acceptance (Chronic Pain Acceptance Questionnaire), self-management (Health Education Impact Questionnaire), self-efficacy (Pain Self-Efficacy Questionnaire), anxiety and depression (Hospital Anxiety and Depression Scale), and quality of life (Short Form-36). The intention-to-treat dataset will be used for analysis. Pain management remains a research topic to which patients pay close attention. This will be the first randomized clinical trial to evaluate whether heat-stone massage, a non-pharmacological therapy, is effective in chronic musculoskeletal pain management. The results will provide evidence for a new option in daily practice. World Health Organization Chinese Clinical Trial Registry [ChiCTR2200065654; https://www.chictr.org.cn/showproj.html?proj=185403]; International Traditional Medicine Clinical Trial Registry [ITMCTR2022000104; http://itmctr.ccebtcm.org.cn/en-US/Home/ProjectView?pid=51776b6f-77b8-4811-9b5a-a0fec10f2cee].

    View details for DOI 10.3389/fmed.2023.1215858

    View details for PubMedID 37654653

    View details for PubMedCentralID PMC10466406

  • Activation of GPR44 decreases severity of myeloid leukemia via specific targeting of leukemia initiating stem cells. Cell reports Qian, F., Nettleford, S. K., Zhou, J., Arner, B. E., Hall, M. A., Sharma, A., Annageldiyev, C., Rossi, R. M., Tukaramrao, D. B., Sarkar, D., Hegde, S., Gandhi, U. H., Finch, E. R., Goodfield, L., Quickel, M. D., Claxton, D. F., Paulson, R. F., Prabhu, K. S. 2023; 42 (7): 112794

    Abstract

    Relapse of acute myeloid leukemia (AML) remains a significant concern due to persistent leukemia-initiating stem cells (LICs) that are typically not targeted by most existing therapies. Using a murine AML model, human AML cell lines, and patient samples, we show that AML LICs are sensitive to endogenous and exogenous cyclopentenone prostaglandin-J (CyPG), Δ12-PGJ2, and 15d-PGJ2, which are increased upon dietary selenium supplementation via the cyclooxygenase-hematopoietic PGD synthase pathway. CyPGs are endogenous ligands for peroxisome proliferator-activated receptor gamma and GPR44 (CRTH2; PTGDR2). Deletion of GPR44 in a mouse model of AML exacerbated the disease suggesting that GPR44 activation mediates selenium-mediated apoptosis of LICs. Transcriptomic analysis of GPR44-/- LICs indicated that GPR44 activation by CyPGs suppressed KRAS-mediated MAPK and PI3K/AKT/mTOR signaling pathways, to enhance apoptosis. Our studies show the role of GPR44, providing mechanistic underpinnings of the chemopreventive and chemotherapeutic properties of selenium and CyPGs in AML.

    View details for DOI 10.1016/j.celrep.2023.112794

    View details for PubMedID 37459233

  • Dynamic assessment of the COVID-19 vaccine acceptance leveraging social media data JOURNAL OF BIOMEDICAL INFORMATICS Li, L., Zhou, J., Ma, Z., Bensi, M. T., Hall, M. A., Baecher, G. B. 2022; 129: 104054

    Abstract

    Vaccination is the most effective way to provide long-lasting immunity against viral infection; thus, rapid assessment of vaccine acceptance is a pressing challenge for health authorities. Prior studies have applied survey techniques to investigate vaccine acceptance, but these may be slow and expensive. This study investigates 29 million vaccine-related tweets from August 8, 2020 to April 19, 2021 and proposes a social media-based approach that derives a vaccine acceptance index (VAI) to quantify Twitter users' opinions on COVID-19 vaccination. This index is calculated based on opinion classifications identified with the aid of natural language processing techniques and provides a quantitative metric to indicate the level of vaccine acceptance across different geographic scales in the U.S. The VAI is easily calculated from the number of positive and negative tweets posted by specific users and groups of users; it can be compiled for regions such as counties or states to provide geospatial information; and it can be tracked over time to assess changes in vaccine acceptance as related to trends in the media and politics. At the national level, it showed that the VAI moved from negative to positive in 2020 and remained steady after January 2021. Through exploratory analysis of state- and county-level data, reliable assessments of VAI against subsequent vaccination rates could be made for counties with at least 30 users. The paper discusses information characteristics that enable consistent estimation of VAI. The findings support the use of social media to understand opinions and to offer a timely and cost-effective way to assess vaccine acceptance.

    View details for DOI 10.1016/j.jbi.2022.104054

    View details for Web of Science ID 000788753600001

    View details for PubMedID 35331966

    View details for PubMedCentralID PMC8935963

  • Novel EDGE encoding method enhances ability to identify genetic interactions PLOS GENETICS Hall, M. A., Wallace, J., Lucas, A. M., Bradford, Y., Verma, S. S., Mueller-Myhsok, B., Passero, K., Zhou, J., McGuigan, J., Jiang, B., Pendergrass, S. A., Zhang, Y., Peissig, P., Brilliant, M., Sleiman, P., Hakonarson, H., Harley, J. B., Kiryluk, K., Van Steen, K., Moore, J. H., Ritchie, M. D. 2021; 17 (6): e1009534

    Abstract

    Assumptions are made about the genetic model of single nucleotide polymorphisms (SNPs) when choosing a traditional genetic encoding: additive, dominant, and recessive. Furthermore, SNPs across the genome are unlikely to demonstrate identical genetic models. However, running SNP-SNP interaction analyses with every combination of encodings raises the multiple testing burden. Here, we present a novel and flexible encoding for genetic interactions, the elastic data-driven genetic encoding (EDGE), in which SNPs are assigned a heterozygous value based on the genetic model they demonstrate in a dataset prior to interaction testing. We assessed the power of EDGE to detect genetic interactions using 29 combinations of simulated genetic models and found it outperformed the traditional encoding methods across 10%, 30%, and 50% minor allele frequencies (MAFs). Further, EDGE maintained a low false-positive rate, while additive and dominant encodings demonstrated inflation. We evaluated EDGE and the traditional encodings with genetic data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes: age-related macular degeneration (AMD), age-related cataract, glaucoma, type 2 diabetes (T2D), and resistant hypertension. A multi-encoding genome-wide association study (GWAS) for each phenotype was performed using the traditional encodings, and the top results of the multi-encoding GWAS were considered for SNP-SNP interaction using the traditional encodings and EDGE. EDGE identified a novel SNP-SNP interaction for age-related cataract that no other method identified: rs7787286 (MAF: 0.041; intergenic region of chromosome 7)-rs4695885 (MAF: 0.34; intergenic region of chromosome 4) with a Bonferroni LRT p of 0.018. A SNP-SNP interaction was found in data from the UK Biobank within 25 kb of these SNPs using the recessive encoding: rs60374751 (MAF: 0.030) and rs6843594 (MAF: 0.34) (Bonferroni LRT p: 0.026). We recommend using EDGE to flexibly detect interactions between SNPs exhibiting diverse action.

    View details for DOI 10.1371/journal.pgen.1009534

    View details for Web of Science ID 000664356500001

    View details for PubMedID 34086673

    View details for PubMedCentralID PMC8208534

  • Phenome-wide association studies on cardiovascular health and fatty acids considering phenotype quality control practices for epidemiological data. Pacific Symposium on Biocomputing Passero, K., He, X., Zhou, J., Mueller-Myhsok, B., Kleber, M. E., Maerz, W., Hall, M. A. 2020; 25: 659-670

    Abstract

    Phenome-wide association studies (PheWAS) allow agnostic investigation of common genetic variants in relation to a variety of phenotypes, but preserving the power of PheWAS requires careful phenotypic quality control (QC) procedures. While QC of genetic data is well-defined, no established QC practices exist for multi-phenotypic data. Manually imposing sample size restrictions, identifying variable types/distributions, and locating problems such as missing data or outliers is arduous in large, multivariate datasets. In this paper, we perform two PheWAS on epidemiological data and, utilizing the novel software CLARITE (CLeaning to Analysis: Reproducibility-based Interface for Traits and Exposures), showcase a transparent and replicable phenome QC pipeline which we believe is a necessity for the field. Using data from the Ludwigshafen Risk and Cardiovascular (LURIC) Health Study, we ran two PheWAS, one on cardiac-related diseases and the other on polyunsaturated fatty acid levels. These phenotypes underwent a stringent quality control screen and were regressed on a genome-wide sample of single nucleotide polymorphisms (SNPs). Seven SNPs were significant in association with dihomo-γ-linolenic acid, of which five were within fatty acid desaturases FADS1 and FADS2. PheWAS is a useful tool to elucidate the genetic architecture of complex disease phenotypes within a single experimental framework. However, to reduce computational and multiple-comparisons burden, careful assessment of phenotype quality and removal of low-quality data is prudent. Herein we perform two PheWAS while applying a detailed phenotype QC process, for which we provide a replicable pipeline that is modifiable for application to other large datasets with heterogeneous phenotypes. As investigation of complex traits continues beyond traditional genome-wide association studies (GWAS), such QC considerations and tools such as CLARITE are crucial in the analysis of non-genetic big data such as clinical measurements, lifestyle habits, and polygenic traits.

    View details for PubMedID 31797636

  • Investigation of gene-gene interactions in cardiac traits and serum fatty acid levels in the LURIC Health Study PLOS ONE Zhou, J., Passero, K., Palmiero, N. E., Mueller-Myhsok, B., Kleber, M. E., Maerz, W., Hall, M. A. 2020; 15 (9): e0238304

    Abstract

    Epistasis analysis elucidates the effects of gene-gene interactions (G×G) between multiple loci for complex traits. However, the large computational demands and the high multiple testing burden impede their discovery. Here, we illustrate the utilization of two methods, main effect filtering based on individual GWAS results and biological knowledge-based modeling through Biofilter software, to reduce the number of interactions tested among single nucleotide polymorphisms (SNPs) for 15 cardiac-related traits and 14 fatty acids. We performed interaction analyses using the two filtering methods, adjusting for age, sex, body mass index (BMI), waist-hip ratio, and the first three principal components from genetic data, among 2,824 samples from the Ludwigshafen Risk and Cardiovascular (LURIC) Health Study. Using Biofilter, one interaction nearly met Bonferroni significance: an interaction between rs7735781 in XRCC4 and rs10804247 in XRCC5 was identified for venous thrombosis with a Bonferroni-adjusted likelihood ratio test (LRT) p: 0.0627. A total of 57 interactions were identified from main effect filtering for the cardiac traits G×G (10) and fatty acids G×G (47) at Bonferroni-adjusted LRT p < 0.05. For cardiac traits, the top interaction involved SNPs rs1383819 in SNTG1 and rs1493939 (138 kb from 5' of SAMD12) with Bonferroni-adjusted LRT p: 0.0228, which was significantly associated with history of arterial hypertension. For fatty acids, the top interaction between rs4839193 in KCND3 and rs10829717 in LOC107984002 with Bonferroni-adjusted LRT p: 2.28×10⁻⁵ was associated with 9-trans 12-trans octadecanoic acid, an omega-6 trans fatty acid. The model inflation factor for the interactions under different filtering methods was evaluated from the standard median and the linear regression approach. Here, we applied filtering approaches to identify numerous genetic interactions related to cardiac-related outcomes as potential targets for therapy. The approaches described offer ways to detect epistasis in complex traits and to improve precision medicine capability.

    View details for DOI 10.1371/journal.pone.0238304

    View details for Web of Science ID 000571887500145

    View details for PubMedID 32915819

    View details for PubMedCentralID PMC7485803

  • Long Non-coding RNA TDRKH-AS1 Promotes Colorectal Cancer Cell Proliferation and Invasion Through the beta-Catenin Activated Wnt Signaling Pathway FRONTIERS IN ONCOLOGY Jiao, Y., Zhou, J., Jin, Y., Yang, Y., Song, M., Zhang, L., Zhou, J., Zhang, J. 2020; 10: 639

    Abstract

    Colorectal cancer (CRC) is a common cancer worldwide, with a low 5-year survival rate. Recently, long non-coding RNAs (lncRNAs) have been well-studied as oncogenes or tumor suppressors in multiple malignancies, including CRC. However, their biological functions and potential mechanisms in human cancer remain unclear. Here, we evaluated the expression of TDRKH-AS1 in CRC tissues and identified its potential targets. We found that TDRKH-AS1 is upregulated in the majority of CRC patients, and that its upregulation is significantly correlated with their malignant characteristics and dismal prognoses. Based on in vitro experiments, high expression of TDRKH-AS1 can substantially promote cancer cell proliferation and invasion. We also found that TDRKH-AS1 targets β-catenin in the Wnt signaling pathway to exert its carcinogenic activity. TDRKH-AS1 could serve as a promising prognostic predictor and a potential therapeutic target for further early diagnoses and treatments via a non-invasive method.

    View details for DOI 10.3389/fonc.2020.00639

    View details for Web of Science ID 000538517900001

    View details for PubMedID 32670860

    View details for PubMedCentralID PMC7326065

  • CLARITE Facilitates the Quality Control and Analysis Process for EWAS of Metabolic-Related Traits FRONTIERS IN GENETICS Lucas, A. M., Palmiero, N. E., McGuigan, J., Passero, K., Zhou, J., Orie, D., Ritchie, M. D., Hall, M. A. 2019; 10: 1240

    Abstract

    While genome-wide association studies are an established method of identifying genetic variants associated with disease, environment-wide association studies (EWAS) highlight the contribution of nongenetic components to complex phenotypes. However, the lack of high-throughput quality control (QC) pipelines for EWAS data lends itself to analysis plans where the data are cleaned after a first-pass analysis, which can lead to bias, or are cleaned manually, which is arduous and susceptible to user error. We offer novel software, CLeaning to Analysis: Reproducibility-based Interface for Traits and Exposures (CLARITE), as a tool to efficiently clean environmental data, perform regression analysis, and visualize results on a single platform through user-guided automation. It exists as both an R package and a Python package. Though CLARITE focuses on EWAS, it is intended to also improve the QC process for phenotypes and clinical lab measures for a variety of downstream analyses, including phenome-wide association studies and gene-environment interaction studies. With the goal of demonstrating the utility of CLARITE, we performed a novel EWAS in the National Health and Nutrition Examination Survey (NHANES) (N overall Discovery=9063, N overall Replication=9874) for body mass index (BMI) and over 300 environment variables post-QC, adjusting for sex, age, race, socioeconomic status, and survey year. The analysis used survey weights along with cluster and strata information in order to account for the complex survey design. Sixteen BMI results replicated at a Bonferroni-corrected p < 0.05. The top replicating results were serum levels of γ-tocopherol (vitamin E) (Discovery Bonferroni p: 8.67×10⁻¹², Replication Bonferroni p: 2.70×10⁻⁹) and iron (Discovery Bonferroni p: 1.09×10⁻⁸, Replication Bonferroni p: 1.73×10⁻¹⁰). Results of this EWAS are important to consider for metabolic trait analysis, as BMI is tightly associated with these phenotypes. As such, exposures predictive of BMI may be useful for covariate and/or interaction assessment of metabolic-related traits. CLARITE allows improved data quality for EWAS, gene-environment interactions, and phenome-wide association studies by establishing a high-throughput quality control infrastructure. Thus, CLARITE is recommended for studying the environmental factors underlying complex disease.

    View details for DOI 10.3389/fgene.2019.01240

    View details for Web of Science ID 000504982600001

    View details for PubMedID 31921293

    View details for PubMedCentralID PMC6930237