Portfolio

I now work in data science and machine learning, where I apply the same curiosity and knack for problem-solving that brought me success as a researcher during my math PhD and postdoc, together with the skills I acquired in the intensive Springboard data science program. I am experienced with the full cycle of a data science project, including wrangling/cleaning data, exploratory analysis, reporting, and visualization using Python and SQL. In addition to being a curious investigator, I am a confident programmer and relentless troubleshooter. I am also adept at presenting highly technical findings to both experts and non-experts, having presented my research at 10+ conferences and seminars and taught a number of undergraduate calculus courses throughout my career in academia.

Below is a selection of the data science and programming projects I have completed in the past couple of years.

Predicting NHL Player Performance

Slides | Github

Data science project creating a predictive model for NHL player stats based on recent performance. The final regression model reduces error by >10% compared to the best baseline.

Lean Conjecture Generator

Slides | Github

Generative machine learning model to augment LeanDojo’s ReProver, an AI tool for automated proof search in the Lean theorem prover.

Personalized Los Angeles Movie Screening Tracker

Github

Python tool that identifies movie screenings at independent movie theaters in the LA area from user-provided lists of movies.

Verification of an Election-based Consensus Algorithm

Github

Overhaul of an existing model of a consensus algorithm in a distributed system, changing from a first-come, first-served algorithm to a voting algorithm. Formally verified that the algorithm behaves as desired.