flynn.gg

Christopher Flynn

Machine Learning Systems Architect, PhD Mathematician


Projects


Contents

Professional

Personal


Professional Projects

FanDuel Play Action

The machine learning and data platform we built at SimpleBet powers FanDuel Play Action, a contest-driven free-to-play product that lets players bet on the outcomes of plays and drives in live NFL games in real time.

Real-time Machine Learning Systems for Sports Betting

At SimpleBet, I lead an engineering team building the software and infrastructure that enable real-time, automated pricing of sports betting micro-markets using machine learning. Our machine learning platform is built in Python and deployed and orchestrated with Kubernetes.

Python PostgreSQL MLflow Airflow Kubernetes

The platform makes heavy use of Apache Airflow, PostgreSQL, and MLflow to manage data and machine learning model lifecycles.
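As a rough illustration of how these pieces can fit together, a minimal sketch of a daily retraining DAG that logs its model to MLflow might look like the following. This is not the production code: the DAG id, schedule, tracking URI, and toy model are placeholder assumptions, written against the Airflow 2 TaskFlow API.

    # Sketch: a daily Airflow DAG that trains a toy model and records it in MLflow.
    # DAG id, schedule, tracking URI, and model details are illustrative placeholders.
    from datetime import datetime

    import mlflow
    import mlflow.sklearn
    from airflow.decorators import dag, task
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression


    @dag(schedule="@daily", start_date=datetime(2023, 1, 1), catchup=False)
    def model_retrain():
        @task
        def train_and_log():
            # Placeholder data; a real pipeline would read features from PostgreSQL.
            X, y = make_classification(n_samples=500, n_features=10, random_state=0)
            model = LogisticRegression(max_iter=1000).fit(X, y)

            mlflow.set_tracking_uri("http://mlflow.internal:5000")  # assumed tracking server
            with mlflow.start_run(run_name="daily-retrain"):
                mlflow.log_metric("train_accuracy", model.score(X, y))
                mlflow.sklearn.log_model(model, artifact_path="model")

        train_and_log()


    model_retrain()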

Mobile Gaming Analytics Platform

Prometheus Grafana Locust Jenkins Elasticsearch

At Tilting Point my focus was migrating from a serverless, manually managed infrastructure to an automated live server environment. The main goals were to decrease cost (serverless becomes expensive at scale), automate the entire build/test/deploy process, and improve visibility through upstream monitoring with detailed metric instrumentation and visualization.

The toolset I chose consisted of Prometheus for live monitoring, Grafana for monitoring visualization, Locust for load testing, and Jenkins for CI/CD automation. All of these were deployed with Docker on AWS. Logging was handled by the ELK (Elasticsearch, Logstash, Kibana) stack.

For ephemeral job metrics, a Prometheus Pushgateway provided observability into our Spark Structured Streaming analytics jobs on Databricks, as well as any batch processing tasks launched by either Databricks or Apache Airflow. Grafana was also hooked directly into the Airflow metastore database to provide more detailed overviews of task execution status.
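The Pushgateway pattern for a short-lived batch job looks roughly like the sketch below, using the prometheus_client library. The gateway address, metric names, and the run_batch_job stub are illustrative assumptions, not the actual instrumentation.

    # Sketch: pushing metrics for a short-lived batch job to a Prometheus Pushgateway.
    # Gateway address, metric names, and the job stub are illustrative placeholders.
    from prometheus_client import CollectorRegistry, Gauge, push_to_gateway


    def run_batch_job() -> int:
        """Placeholder for the real batch workload; returns rows processed."""
        return 42_000


    registry = CollectorRegistry()

    duration = Gauge(
        "batch_job_duration_seconds", "Wall-clock duration of the batch job", registry=registry
    )
    rows = Gauge("batch_job_rows_processed", "Rows processed by the batch job", registry=registry)
    last_success = Gauge(
        "batch_job_last_success_unixtime", "Time the batch job last succeeded", registry=registry
    )

    with duration.time():          # sets the gauge to the elapsed time on exit
        rows.set(run_batch_job())

    last_success.set_to_current_time()
    push_to_gateway("pushgateway.internal:9091", job="nightly_aggregation", registry=registry)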

Mobile Gaming Big Data Systems

Golang Airflow Spark Databricks Delta Lake

At Tilting Point I rebuilt the data analytics platform. I migrated the entire data pipeline from Jenkins to an Apache Airflow cluster with a number of custom management plugins for the webserver. All of the daily data ingestion tasks were handled by PySpark jobs on Databricks. The mobile game analytics tables were moved from standard Parquet tables ingested in batch to Delta Lake tables ingested with Spark Structured Streaming. I also migrated the analytics ingress from a serverless API Gateway to Golang servers, saving thousands of dollars in monthly operational costs.
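A stripped-down version of the streaming ingestion step is sketched below: JSON events are read as a stream and appended to a Delta Lake table. The S3 paths and the event schema are placeholders, not the production job.

    # Sketch: streaming JSON game events into a Delta Lake table.
    # Paths and schema are illustrative placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StringType, StructField, StructType, TimestampType

    spark = SparkSession.builder.appName("event-ingestion").getOrCreate()

    event_schema = StructType([
        StructField("event_name", StringType()),
        StructField("user_id", StringType()),
        StructField("event_time", TimestampType()),
    ])

    events = (
        spark.readStream
        .schema(event_schema)
        .json("s3://example-bucket/raw-events/")   # assumed landing path for raw events
    )

    query = (
        events.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
        .start("s3://example-bucket/delta/events/")
    )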

Magic Meadow Analytics Platform

Magic Meadow

At Mitosis Games, I built the analytics backend while the team built their first title, Magic Meadow, a story-driven fantasy-themed match-3 adventure.

Within AWS, I architected a data analytics pipeline that used Kinesis Firehose to consume events fired from clients to our API, with Redshift as the data warehouse. A separate PostgreSQL database on RDS acted as the datastore for aggregate analytics data and backed an analytics dashboard I built with Python Flask and plotly.js, deployed on Elastic Beanstalk. Both databases were managed with the Alembic Python package. Automated tasks and other cron operations were performed with Python and Celery deployed to EC2, using a Redis ElastiCache broker.
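The scheduled-task half of that setup follows a familiar Celery pattern, sketched below. The broker URL, task name, and rollup body are illustrative assumptions rather than the actual project code.

    # Sketch: scheduled analytics rollups with Celery and a Redis broker.
    # Broker URL, task names, and the rollup body are illustrative placeholders.
    from celery import Celery
    from celery.schedules import crontab

    app = Celery("analytics", broker="redis://analytics-cache.internal:6379/0")

    app.conf.beat_schedule = {
        "hourly-kpi-rollup": {
            "task": "tasks.rollup_kpis",
            "schedule": crontab(minute=0),  # top of every hour
        },
    }


    @app.task(name="tasks.rollup_kpis")
    def rollup_kpis():
        # In the real system this would aggregate raw events from Redshift
        # into the PostgreSQL reporting database behind the dashboard.
        print("rolling up hourly KPIs")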

Black Diamond Casino

Black Diamond Casino

At Zynga I worked as a Data Scientist on the Rising Tide Games team. Our studio produced a slot machine mobile app called Black Diamond Casino. At its peak in mid-2016, the app reached roughly 250,000 daily active users and ranked #15 top-grossing casino app (#53 top-grossing overall) on iOS and #17 top-grossing casino app (#66 top-grossing overall) on Android.

My responsibilities were mainly product analytics, including generating automated reports and visualizations on product data and experiments using Amazon Redshift and Python. I developed mathematical models for retention, daily cohorted lifetime value forecasting of monetizers, late conversion, and churn. I created several metrics to measure aspects of player behavior, game interest, and monetization habits.
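As a simplified illustration of the kind of retention modeling that feeds LTV forecasts, one might fit a decay curve to cohort data as sketched below. The power-law form and the sample numbers are placeholders for illustration, not the models used in production.

    # Sketch: fitting a power-law retention curve to cohort data.
    # The functional form and sample data are illustrative, not the production model.
    import numpy as np
    from scipy.optimize import curve_fit


    def retention(day, a, b):
        """Simple power-law decay: fraction of a cohort still active on a given day."""
        return a * np.power(day, -b)


    days = np.array([1, 3, 7, 14, 30])
    observed = np.array([0.42, 0.30, 0.22, 0.17, 0.12])  # placeholder cohort retention

    (a, b), _ = curve_fit(retention, days, observed, p0=(0.5, 0.3))

    # Extrapolate to estimate day-90 retention, a common input to LTV forecasts.
    print(f"fit: a={a:.3f}, b={b:.3f}, projected day-90 retention={retention(90, a, b):.3f}")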

I collaborated with engineers and artists to develop and optimize in-game features, such as rewarded advertisements and achievements. I also performed live operations on the app such as content releases, A/B testing, and sale/promo execution.

Verizon Customer Churn Model

Ask Verizon

As a Data Science Intern at Verizon Wireless, I developed a subscriber churn prediction model based on online support chat conversation contents using natural language processing and machine learning. The model used several machine learning techniques including Naive Bayes, Logistic Regression, and Random Forests. Feature selection was performed using Information Gain. The models were evaluated using F-score with an emphasis on recall performance.
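The general shape of such a text-based churn classifier is sketched below with scikit-learn. The toy chat transcripts and labels are placeholders, and mutual information stands in for the information-gain feature selection described above; this is not the model built at Verizon.

    # Sketch: a text-based churn classifier in the spirit described above.
    # Sample data is a placeholder; mutual information stands in for information gain.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score, recall_score
    from sklearn.pipeline import Pipeline

    chats = [
        "I want to cancel my plan, the bill is too high",
        "thanks, the new phone works great",
        "switching carriers unless you can match this offer",
        "can you explain the charges on my last statement",
    ]
    churned = [1, 0, 1, 0]  # placeholder labels

    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
        ("select", SelectKBest(mutual_info_classif, k=20)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])

    pipeline.fit(chats, churned)
    preds = pipeline.predict(chats)

    # Training-set scores only, since the data above is a toy placeholder;
    # the real evaluation used held-out data with an emphasis on recall.
    print("F1:", f1_score(churned, preds), "recall:", recall_score(churned, preds))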

Copy Correlation Signal Detection

MIT Lincoln Lab

MIT Lincoln Lab hosted me as an intern in their Advanced Sensor Techniques group for three summers while I was working on my PhD. I worked with engineers and mathematicians to evaluate the performance of an estimation algorithm they created. The estimator is known as Copy Correlation, and its purpose is to approximate the direction of arrival of an incoming signal on an antenna array.

I completed derivations to identify the theoretical mean squared error performance of the estimator, using linear algebra and complex-valued probability distributions. The derivations were verified by Monte Carlo simulations coded in MATLAB.
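The verification pattern itself is simple: compare an empirical Monte Carlo mean squared error against the closed-form prediction. The toy example below illustrates that pattern with a complex sample-mean estimator (whose MSE is sigma^2/N), not the actual Copy Correlation derivation.

    # Toy illustration of the verification pattern: compare Monte Carlo MSE
    # against a closed-form prediction. Uses a simple complex-mean estimator,
    # not the Copy Correlation estimator itself.
    import numpy as np

    rng = np.random.default_rng(0)

    n_samples = 64        # snapshots per trial
    n_trials = 20_000     # Monte Carlo trials
    sigma2 = 0.5          # per-sample complex noise variance
    theta = 1.0 + 0.5j    # true (complex) parameter

    # Circularly symmetric complex Gaussian noise with total variance sigma2 per sample.
    noise = np.sqrt(sigma2 / 2) * (
        rng.standard_normal((n_trials, n_samples))
        + 1j * rng.standard_normal((n_trials, n_samples))
    )
    estimates = (theta + noise).mean(axis=1)

    empirical_mse = np.mean(np.abs(estimates - theta) ** 2)
    theoretical_mse = sigma2 / n_samples

    print(f"empirical MSE:   {empirical_mse:.5f}")
    print(f"theoretical MSE: {theoretical_mse:.5f}")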

The results are published in conference papers here and here.


Personal Projects

PyPIStats

pypistats.org

A simple dashboard website and JSON API providing aggregate download stats for Python packages from the Python Package Index (PyPI). The website is built in Python using Flask, with Celery executing ingestion tasks, Celery Beat as a task scheduler, and Redis as a message queue/task broker. Visualizations use the plotly.js library. The site supports GitHub OAuth so users can personally track packages. The website is deployed to a Kubernetes cluster on Google Cloud's GKE.
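A stripped-down version of such a JSON endpoint is sketched below with Flask and Flask-SQLAlchemy. The route, database URI, and table schema are placeholder assumptions rather than the actual pypistats.org code.

    # Sketch of a JSON endpoint in the style of the dashboard's API.
    # Route, database URI, and table schema are illustrative placeholders.
    from flask import Flask, jsonify
    from flask_sqlalchemy import SQLAlchemy

    app = Flask(__name__)
    app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql://localhost/pypistats"
    db = SQLAlchemy(app)


    class RecentDownloads(db.Model):
        __tablename__ = "recent_downloads"
        package = db.Column(db.Text, primary_key=True)
        category = db.Column(db.Text, primary_key=True)  # e.g. "day", "week", "month"
        downloads = db.Column(db.BigInteger)


    @app.route("/api/packages/<package>/recent")
    def recent(package):
        rows = RecentDownloads.query.filter_by(package=package).all()
        return jsonify({
            "package": package,
            "data": {row.category: row.downloads for row in rows},
        })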

OTPSpectate

(Inactive) A fully automated twitch.tv stream and Twitch chat bot showcasing the highest-ranked "one-trick pony" League of Legends players from around the world. The stream controller is written in Python 3, using the aiohttp library for Riot Games API calls, the asyncio library to interact with Twitch IRC chat, and the pywinauto library to control the League of Legends client and OBS Studio. Check here and here for more info.
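The Riot API half of that controller follows a standard aiohttp pattern, sketched below. The endpoint path, region, API key, and summoner name are placeholder assumptions, not the project's actual calls.

    # Sketch: calling the Riot Games API with aiohttp.
    # Endpoint path, region, API key, and summoner name are illustrative assumptions.
    import asyncio

    import aiohttp

    RIOT_API_KEY = "RGAPI-..."  # placeholder; a real key comes from developer.riotgames.com


    async def fetch_summoner(session: aiohttp.ClientSession, region: str, name: str) -> dict:
        """Look up basic summoner info (assumed summoner-v4 endpoint)."""
        url = f"https://{region}.api.riotgames.com/lol/summoner/v4/summoners/by-name/{name}"
        async with session.get(url, headers={"X-Riot-Token": RIOT_API_KEY}) as resp:
            resp.raise_for_status()
            return await resp.json()


    async def main():
        async with aiohttp.ClientSession() as session:
            summoner = await fetch_summoner(session, "na1", "SomeOneTrick")
            print(summoner.get("name"), summoner.get("summonerLevel"))


    if __name__ == "__main__":
        asyncio.run(main())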