One of the main engineering priorities at Tilting Point has been to migrate from a serverless, manually controlled infrastructure to an automated live server environment. The main goals have been to decrease cost (serverless becomes expensive at scale), automate the entire build/test/deploy process, and provide more visibility to upstream monitoring with detailed metric instrumentation and visualization.
Prior to migrating to live servers, it was my responsibility to R&D solutions, and subsequently build out our internal operations environment. The set of tools consisted of Prometheus for live monitoring, Grafana for monitoring visualization, Locust for load testing, and Jenkins for CI/CD automation. All of these solutions are deployed using Docker containers in AWS. The logging solution is the ELK (Elasticsearch, Logstash, Kibana) stack.
In addition, a Prometheus Pushgateway is utilized for maintaining metrics for our Spark Structured Streaming analytics jobs on Databricks, and any ephemeral batch processing tasks launched by either Databricks or Apache Airflow. Grafana also hooks directly into the Airflow metastore database to provide more detailed overviews of task execution status.
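As a rough illustration of how a short-lived batch task can report to a Pushgateway before it exits, here is a minimal sketch using only the Python standard library. The gateway URL, job name, and metric names are hypothetical placeholders, not values from the actual deployment, and in practice the official `prometheus_client` package would be the more usual choice.

```python
import urllib.parse
import urllib.request


def build_payload(metrics):
    """Render {name: value} as gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(value)}")
    return "\n".join(lines) + "\n"


def push_metrics(gateway_url, job, metrics, timeout=5):
    """PUT the metrics to the Pushgateway, replacing the whole group for this job."""
    job_path = urllib.parse.quote(job, safe="")
    request = urllib.request.Request(
        url=f"{gateway_url.rstrip('/')}/metrics/job/{job_path}",
        data=build_payload(metrics).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical gateway address and metric names, for illustration only.
    status = push_metrics(
        "http://pushgateway.example.internal:9091",
        "nightly_batch",
        {"rows_written": 1234, "duration_seconds": 42.5},
    )
    print(status)
```

Because the task pushes once at the end of its run, Prometheus can scrape the gateway on its normal schedule even though the task itself is no longer alive.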
When I joined Tilting Point, their data ingestion pipeline was not very reliable. Most jobs were scheduled and executed on a Jenkins instance. None of the jobs had error handling, retry logic, or any kind of logging. Job failures were common and nearly impossible to diagnose, and the vast majority of their data tables were missing days or weeks of data from previously failed runs. The data science and user acquisition teams relied heavily on accurate and timely data and required a more robust system.
Tasked with alleviating these issues, I migrated the entire data pipeline to an Apache Airflow cluster with Celery workers. Most of the job code needed a complete rewrite due to the aforementioned problems. I also customized the Airflow deployment to log all activity to S3 and force SSL connections, added authentication through GitHub OAuth (Airflow only supports GitHub Enterprise out of the box), and added a custom Slack integration for posting execution errors to our
Since deploying to production, all missing data has been backfilled programmatically with Airflow, and the only failures reported in job execution have been due to third-party systems experiencing downtime or delays in data delivery. In addition, many of the jobs are dynamically defined, making it trivial to integrate new apps into the existing pipeline.
I acted as the Data Engineer/Scientist (mostly engineer) for Mitosis Games while the team built their first title, Magic Meadow (previously Forest Escape), a story-driven fantasy-themed match-3 adventure.
I spent most of my time working on the data pipeline and setting up analytics infrastructure. Within AWS, I architected a data pipeline using Kinesis Firehose to consume analytics events fired from clients to our API, with Redshift as a data warehouse. There is also a separate PostgreSQL database using RDS that acts as a datastore for aggregate analytics data. This database also serves as a backend for an analytics dashboard I wrote using Python Flask and plotly.js which is deployed on Elastic Beanstalk. Both databases are managed using the alembic Python package.
Analytics tasks and other cron operations were written in Python and run on a Celery distributed task queue deployed to EC2 with a Redis ElastiCache broker. The analytics codebase also contains a large amount of custom library code for interfacing with other AWS components and third-party integrations.
I also built a Slack application called Synthesis using Python Tornado. Synthesis’ primary role is aggregate data retrieval for analytics, but it also handles simple management responsibilities and basic CRM tasks.
I worked as a Data Scientist with the Rising Tide Games team at Zynga. Our studio produced a slot machine mobile app called Black Diamond Casino. The app performed well, with peak metrics of roughly 250,000 daily active users, #15 top-grossing casino (#53 top-grossing overall) on iOS, and #17 top-grossing casino (#66 top-grossing overall) on Android in mid-2016.
I spent much of my time doing product analytics, with responsibilities that included creating automated reports and visualizations on product data and experiments using Amazon Redshift and Python. I developed mathematical models for retention, daily cohorted lifetime value forecasting of monetizers, late conversion, and churn. I created several metrics to measure aspects of player behavior, game interest, and monetization habits.
I collaborated with engineers and artists to develop in-game features, such as rewarded advertisements and achievements. I also performed live operations on the app such as content releases, A/B testing, and sale/promo execution.
As a Data Science Intern at Verizon Wireless, I developed a churn prediction model based on online support chat conversation contents. The model used several machine learning techniques including Naive Bayes, Logistic Regression and Random Forests. Feature selection was performed using Information Gain. The models were evaluated using F-score with an emphasis on recall performance.
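For readers unfamiliar with recall-weighted F-scores, the sketch below shows one common way to express that emphasis: the F-beta score with beta greater than 1. The choice of beta = 2 and the confusion-matrix counts are assumptions made for illustration; the weighting actually used in the project is not stated here.

```python
def f_beta(true_positives, false_positives, false_negatives, beta=2.0):
    """F-beta score from confusion-matrix counts; beta > 1 weights recall more heavily."""
    predicted_positives = true_positives + false_positives
    actual_positives = true_positives + false_negatives
    precision = true_positives / predicted_positives if predicted_positives else 0.0
    recall = true_positives / actual_positives if actual_positives else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    beta_squared = beta ** 2
    return (1 + beta_squared) * precision * recall / (beta_squared * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 8 churners caught, 8 false alarms, 2 churners missed.
    print(f_beta(8, 8, 2, beta=1.0))  # balanced F1
    print(f_beta(8, 8, 2, beta=2.0))  # recall-weighted F2
```

With these counts recall (0.8) exceeds precision (0.5), so the recall-weighted score comes out higher than the balanced F1, which is the behavior one wants when missing a churner costs more than a false alarm.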
MIT Lincoln Lab hosted me as an intern in their Advanced Sensor Techniques group for three summers while I was working on my PhD.
I worked with engineers and mathematicians to evaluate the performance of an estimation algorithm they created. The estimator is known as Copy Correlation, and its purpose is to approximate the direction of arrival of an incoming signal on an antenna array.
I completed derivations to identify the theoretical mean squared error performance of the estimator, using a lot of linear algebra and complex-valued probability distributions. The results were verified by Monte Carlo simulations coded in MATLAB.
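The general pattern of checking a derived mean squared error against simulation can be sketched as follows. This is not the Copy Correlation estimator and not the original MATLAB code: it swaps in a deliberately simple stand-in, the sample mean of real-valued Gaussian observations, whose theoretical MSE is known to be sigma^2 / n. All function names and parameter values are illustrative assumptions.

```python
import random


def squared_error_of_sample_mean(rng, true_value, sigma, n_samples):
    """Run one trial: estimate true_value from noisy samples, return the squared error."""
    observations = [rng.gauss(true_value, sigma) for _ in range(n_samples)]
    estimate = sum(observations) / n_samples
    return (estimate - true_value) ** 2


def monte_carlo_mse(true_value, sigma, n_samples, n_trials, seed=0):
    """Average the squared error over many independent trials."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = sum(
        squared_error_of_sample_mean(rng, true_value, sigma, n_samples)
        for _ in range(n_trials)
    )
    return total / n_trials


def theoretical_mse(sigma, n_samples):
    """MSE of the sample mean of n i.i.d. samples with variance sigma^2."""
    return sigma ** 2 / n_samples


if __name__ == "__main__":
    empirical = monte_carlo_mse(true_value=1.5, sigma=2.0, n_samples=10, n_trials=20000)
    print(empirical, theoretical_mse(sigma=2.0, n_samples=10))
```

If the derivation is right, the empirical value settles near the theoretical one as the number of trials grows; a persistent gap points to an error in either the algebra or the simulation.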
A simple dashboard website and JSON API for providing aggregate download stats for Python packages from the Python Package Index (PyPI). The website is built in Python using Flask, with Celery executing ingestion tasks, Celery beat as a task scheduler, and Redis as a message queue/task broker. Visualizations use the plotly.js library. The site supports GitHub OAuth to allow users to personally track packages. The website is deployed to AWS Elastic Beanstalk using Docker, and processes are managed using supervisor.
A fully automated twitch.tv stream and Twitch chat bot showcasing the highest-ranked ‘one-trick pony’ League of Legends players from all over the world. The stream controller is written in Python 3 using the aiohttp library for Riot Games API calls, the asyncio library to interact with Twitch IRC chat, and the pywinauto library to control the League of Legends client and OBS Studio. Check here and here for some more info.