Summary

Software / Data / Machine Learning Engineer with 10+ years of experience building business-critical applications that rely heavily on data and machine learning.

Ideal Job

I am looking for opportunities in the San Diego area (or remote) where I can significantly impact the business, work closely with researchers / data scientists to develop, productionize, and scale models, and continually learn and grow as an engineer.

Key Skills

Platforms

Amazon Web Services (AWS), Google Cloud, On-Premises Clusters

Programming Languages

Ruby (Rails), Python (Flask), Go, Kotlin, SQL, HTML, JS (Ember)

Databases

Snowflake, Presto, Athena, Redshift, Postgres, Redis, DynamoDB

Data Tools & Frameworks

Airflow, Luigi, Glue, Hadoop, Hive, Spark, Kafka / Kinesis

Other Technologies

Docker, Git, Kubernetes, DataDog, Tableau, Looker, Lambda, SQS

Career

Senior Software Engineer

2020 - Present
Brex, Remote

Building an ML platform that helps Data Scientists build and deploy models more easily.

Senior Software Engineer

2019 - 2020
ScaleFactor, Remote

Automating small-business accounting with full-stack web development and ML pipelines.

Senior Software Engineer

2017 - 2019
Spiceworks, Austin

Architected and developed petabyte-scale data pipelines feeding ML models.

Software Engineer

2014 - 2017
Spiceworks, Austin

Wrote business-critical, high-volume, low-latency systems, mostly in Go.

Software Development Intern

2010 - 2014
Spiceworks, Austin

Learned how to work in agile teams and translate business requirements to code.

Experience

Config-Driven Model Development Standardized ML pipeline solution for all of Data Science
  • Led a team of 2 other engineers and a Data Scientist
  • Interviewed users, designed a generalized solution, and engineered it
  • Reduced the time to properly train a new model from months to hours
Feature Store A service for computing and caching model feature data
  • Designed and rolled out changes to make it easy and intuitive to define features
  • Implemented logic to dynamically check if computations had been run before
  • Extensively documented the capabilities
  • Provided dedicated support to ensure teams were able to use the service effectively
Member of Python Core Team
  • Provided Python environment support for ~50 Data Science and Analytics teammates
  • Upgraded 20+ packages to ensure M1 Macs were viable for Python development
  • Transitioned development environments to run inside Docker
  • Taught an onboarding course for setting up development environments
ML Platform Support
  • Primary point-of-contact for answering questions about the data platform on rotation
  • Rigorously reviewed design documents and provided in-depth code reviews
  • Mentored junior developers and onboarded senior members to the team
  • Handled several time-sensitive incidents while on call for ML Platform services
Underwriting The process of setting monthly credit limits for credit cards
  • Maintained a Feature Store to make implementing sophisticated credit policies easier
  • Worked with Data Science team and policy experts to implement credit policies
  • Helped implement several ML models to be used in credit policies
Fraud Detection Identifying bad actors to eliminate them from the platform
  • Maintained a 100ms p99 latency SLA for identifying fraudulent card transactions
  • Worked with Data Scientists and fraud experts
  • Helped implement several ML models to be used for fraud detection
Summary Statistics Display continuously updated summary statistics of datasets
  • Paired with junior developer to implement
  • Wrote frontend and backend code
  • Made it easier for the data team to quickly sift through thousands of tables
ML for Bookkeeping An ML workflow for automatically classifying bank transactions
  • Reduced need for humans to classify transactions manually
  • Refined frontend, backend, and modeling code for production use
  • Analyzed model performance
Accounting File Report Card Self-service health summary dashboard of business accounting books
  • Integrated with 3rd parties to collect and parse accounting reports
  • Ran analysis to determine healthy accounting patterns
  • Created portal for calculating and viewing insights
Real Time Bidder Set of services for intelligently bidding on ad inventory
  • Wrote services that respond in under 10ms (including network latency)
  • Horizontally scaled the web server to 1 billion requests per day
  • Designed micro-service architecture comprising 10+ services
  • Productionized machine learning model to compute bids
  • Helped scale product from $0 to ~$10M in annual revenue
  • Rigorously tested the services and wrote monitoring for them
Business Identity Graph Database of companies that visited Spiceworks
  • Generated graph from millions of data points from several sources
  • Designed data model where graph was iteratively updated in-place
  • Resolved business attributes from a connected component of the graph
Account Intelligence Dashboard showing businesses and their interests
  • Wrote data pipeline to calculate purchase intent of businesses
  • Integrated Tensorflow model from data science team into pipeline
  • Architected pipeline to leverage business identity graph
Pixel Tracking Web-tracking service integrated with hundreds of companies
  • Led a rewrite of the service
  • Integrated open-source browser fingerprinting
Segmentation Service Puts people into segments based on their actions
  • Wrote jobs to send billions of events into the system every month
  • Integrated with 3rd parties using cookie matching
  • Debugged / resolved problems around user-identity mappings
  • Collaborated with Data Science to incorporate intent modeling

Hackathons

Horoscope Generator
  • Used an RNN model
  • Trained it on a GPU
Anonymous email between IT buyers and vendors
  • Wrote email proxy service that obfuscated the sender’s address
  • Won the hackathon
Latent engagement grouping in the Spiceworks community forums
  • Generated graph of ~8M posts and their relationships
  • Used page rank and clustering algorithms to calculate latent clusters
  • Visualized using t-SNE
Hierarchical text clustering
  • Used LDA to cluster a few hundred texts from a personal feed
  • Visualized text clusters using pyLDAvis