Bio


* Part-time adult; lover of hiking, photography, jazz, surfing, and pool
* AI4Health
* How can humans make better AI? How can AI make better humans?
* I want to make: anticancer drugs, male contraceptives, artificial wombs, weight-loss pills

Don't create opium; create a forest, create air and water.

Honors & Awards


  • Siyuan Leadership Scholarship, Siyuan Foundation, Tsinghua University (2021)
  • National Second Prize in 36th CPhO, The Chinese Physical Society (2019)
  • National Scholarship of China, Ministry of Education of China (2022)

Education & Certifications


  • Bachelor of Elec Engineering, Tsinghua University, Electronic Engineering (2024)
  • Visiting Undergraduate, Harvard College, Liberal Arts and Sciences (2023)
  • Bachelor of Engineering, Tsinghua University, Electrical Engineering (2024)

Work Experience


  • Visiting Scholar, HCIE, CSAIL, Massachusetts Institute of Technology (6/1/2023 - 7/22/2023)

    Location

    32 Vassar Street, Cambridge

  • Research Assistant, Harvard Medical School, DBMI (7/1/2022 - 2/28/2024)

    Location

    10 Shattuck St, Boston

  • Investing Analyst, MiraclePlus (formerly YC China) (7/23/2023 - 9/30/2023)

    Location

    150 Chengfu Street, Beijing

  • Assistant to Chief Executive Officer, MindOS (6/24/2024 - Present)

    Location

    EFC City, Hangzhou

All Publications


  • LATTE: Label-efficient incident phenotyping from longitudinal electronic health records. Patterns (New York, N.Y.) Wen, J., Hou, J., Bonzel, C. L., Zhao, Y., Castro, V. M., Gainer, V. S., Weisenfeld, D., Cai, T., Ho, Y. L., Panickan, V. A., Costa, L., Hong, C., Gaziano, J. M., Liao, K. P., Lu, J., Cho, K., Cai, T. 2024; 5 (1): 100906

    Abstract

    Electronic health record (EHR) data are increasingly used to support real-world evidence studies but are limited by the lack of precise timings of clinical events. Here, we propose a label-efficient incident phenotyping (LATTE) algorithm to accurately annotate the timing of clinical events from longitudinal EHR data. By leveraging the pre-trained semantic embeddings, LATTE selects predictive features and compresses their information into longitudinal visit embeddings through visit attention learning. LATTE models the sequential dependency between the target event and visit embeddings to derive the timings. To improve label efficiency, LATTE constructs longitudinal silver-standard labels from unlabeled patients to perform semi-supervised training. LATTE is evaluated on the onset of type 2 diabetes, heart failure, and relapses of multiple sclerosis. LATTE consistently achieves substantial improvements over benchmark methods while providing high prediction interpretability. The event timings are shown to help discover risk factors of heart failure among patients with rheumatoid arthritis.

    DOI: 10.1016/j.patter.2023.100906

    PubMed ID: 38264714

    PubMed Central ID: PMC10801250