Bio


I am pursuing an M.S. in Statistics: Data Science at Stanford University, with coursework in Statistics, Computer Science, and Computational and Mathematical Engineering. Before Stanford, I graduated summa cum laude with a B.S. in Statistics from the University of California, Los Angeles (UCLA). I am currently a Data Science Intern with Bridg, where I work on data querying, natural language processing (NLP), and machine learning in Python, SQL, and Snowflake. The work spans terabytes of data (over 100 billion observations) and aims to improve insights on product descriptions and to standardize features across sources.

During my senior year at UCLA, I was a Data Analyst Intern with SCAN Health Plan, where I applied NLP and unsupervised learning (agglomerative clustering) in Python to analyze call center data and built Tableau dashboards. I was also a Data Science Consultant with UCLA’s Data Science Center, handling ad hoc and long-term requests from clients in varied fields. Additionally, I was the president of Bruin Sports Analytics, where I guided our data journalism, research, and consulting teams in producing deliverables for sports fans and UCLA’s intercollegiate teams.

I am always happy to discuss potential work opportunities, or to talk about my path with prospective undergraduate and graduate students and data science enthusiasts. Feel free to contact me via email, LinkedIn, or my personal website.

Honors & Awards


  • Summa Cum Laude, University of California, Los Angeles (UCLA) (June 2022)
  • Best Visualization (Honorable Mention), UCLA DataFest (April 2021)
  • Employee of the Quarter, Lifetime Activities (October 2020)
  • National AP Scholar, College Board (July 2019)
  • U.S. Presidential Scholars Program Candidate, U.S. Department of Education (February 2019)

Education & Certifications


  • Master of Science, Stanford University, Statistics: Data Science (2024)
  • Bachelor of Science, University of California, Los Angeles (UCLA), Statistics (2022)
  • High School Diploma, Amador Valley High School (2019)

Work Experience


  • Data Science Intern, Bridg / Cardlytics (June 2022 - Present)

    • Construct machine learning models with AWS SageMaker (in Python) and Snowflake (SQL) to analyze terabytes of data (over 100 billion observations)
    • Develop natural language processing (NLP) algorithms to standardize and categorize product descriptions for enhanced business analytics
    • Deploy machine learning models to identify potential store location openings and closings
    • Support product recommendation and churn identification via machine learning models in SageMaker and Snowpark (Snowflake Python API)

    Location

    San Francisco

  • Data Analyst Intern, SCAN Health Plan (November 2021 - June 2022)

    • Generated insights about SCAN membership experience challenges by completing end-to-end projects analyzing “Voice of the Consumer” (VOC) data using Python, R, SQL, and Tableau
    • Analyzed member call data (over 1 million calls) using natural language processing (NLP) and unsupervised clustering in Python to create FAQ sections for benefit categories and disenrollment groups
    • Queried disenrollment data with SQL to create visualizations in Tableau and R (ggplot), then analyzed which demographics had high disenrollment rates via chi-square and post hoc analyses
    • Analyzed, in Python and Tableau, the relationship between disenrollments and frequent member inquiries and grievances among disenrollees during OEP in recent years
    • Presented findings, insights, and learnings to the leadership team and in all-teams meetings

    Location

    Los Angeles

  • Data Science Consultant, UCLA Library Data Science Center (November 2021 - June 2022)

    • Consulted to curate, transform, analyze, and visualize researcher data using Python, R, Tableau, and SPSS
    • Implemented and refined machine learning models including K-Nearest Neighbors (KNN), decision trees, and neural networks
    • Performed statistical tests including chi-square analyses and Peacock tests (a multidimensional Kolmogorov-Smirnov test)
    • Reverse geocoded latitude and longitude coordinates into addresses using Google’s API
    • Implemented regular expressions to parse text for analysis and extracted highlighted text from Word documents with Python
    • Extracted data from .doc, .xls, and .xlsx files for cleaning and parsing with R
    • Conducted and interpreted linear and logistic regression in SPSS
    • Member of the DataSquad, a team within the UCLA Library Data Science Center

    Location

    Los Angeles

  • Student Researcher, UCLA Department of Statistics (January 2022 - June 2022)

    • Reviewed and clarified homework instructions for upper-division statistics courses in statistical programming and simulation in R
    • Provided students with additional guidance by redesigning instruction and template files
    • Research conducted within the UCLA Department of Statistics (under Dr. Miles Chen)

    Location

    Los Angeles

  • Student Assistant, UCLA Luskin School of Public Affairs (June 2021 - March 2022)

    • Web-scraped firm-level data with Selenium WebDriver in R and analyzed the political connections of various enterprises
    • Imputed missing data for over 30 million firms
    • Used the Google Translate API to translate job titles and board positions from 5+ languages into English
    • Research conducted under Dr. Darin Christensen (UCLA) and with Dr. Jonah Rexer (Princeton)

    Location

    Los Angeles

  • InStep Intern (Data Analytics and AI), Infosys (June 2021 - August 2021)

    • Defined and developed machine learning models in Python to detect patterns of play, along with their success and failure rates, at Roland Garros (the French Open) and the Australian Open, improving the user experience for media, players, and coaches
    • Supported the creation of AI commentary through natural language generation in Python, providing engaging point-by-point descriptions of Roland Garros and Australian Open matches to improve the user interfaces for tennis fans

    Location

    San Francisco

  • Student Researcher, UCLA Department of Statistics (January 2020 - June 2020)

    • Developed automated, interactive lecture notes for undergraduate statistics students
    • Used the learnr package (interactive R code, automated review questions in R) under Dr. Michael Tsiang

    Location

    Los Angeles