Summary
Software / Data / Machine Learning Engineer with 10+ years of experience building business-critical applications that rely heavily on data and machine learning.
Ideal Job
I am looking for opportunities in the San Diego area (or remote) where I can significantly impact the business, work closely with researchers / data scientists to develop, productionize, and scale models, and continually learn and grow as an engineer.
Key Skills
Platforms
Amazon Web Services (AWS), Google Cloud, On-Premise Cluster
Programming Languages
Ruby (Rails), Python (Flask), Go, Kotlin, SQL, HTML, JS (Ember)
Databases
Snowflake, Presto, Athena, Redshift, Postgres, Redis, Dynamo
Data Tools & Frameworks
Airflow, Luigi, Glue, Hadoop, Hive, Spark, Kafka / Kinesis
Other Technologies
Docker, Git, Kubernetes, DataDog, Tableau, Looker, Lambda, SQS
Career
Building an ML platform that makes it easier for Data Scientists to build and deploy models.
Automating small-business accounting with full-stack web development and ML pipelining.
Architected and developed petabyte-scale data pipelines feeding ML models.
Wrote business-critical, high-volume, low-latency systems, mostly in Go.
Learned how to work in agile teams and translate business requirements to code.
Experience
- Led a team of 2 other engineers and a Data Scientist
- Interviewed users, designed a generalized solution, and engineered it
- Reduced the time to properly train a new model from months to hours
- Designed and rolled out changes to make it easy and intuitive to define features
- Implemented logic to dynamically check if computations had been run before
- Extensively documented the capabilities
- Provided dedicated support to ensure teams were able to use the service effectively
- Provided Python environment support for ~50 Data Science and Analytics teammates
- Upgraded 20+ packages to ensure M1 Macs were viable for Python development
- Transitioned development environments to run inside Docker
- Taught an onboarding course for setting up development environments
- Primary point-of-contact for answering questions about the data platform on rotation
- Rigorously reviewed design documents and provided in-depth code reviews
- Mentored junior developers and onboarded senior members to the team
- Handled several time-sensitive incidents while on call for ML Platform services
- Maintained a Feature Store to make implementing sophisticated credit policies easier
- Worked with Data Science team and policy experts to implement credit policies
- Helped implement several ML models to be used in credit policies
- Maintained a 100ms p99 latency SLA for identifying fraudulent card transactions
- Worked with Data Scientists and fraud experts
- Helped implement several ML models to be used for fraud detection
- Paired with a junior developer on the implementation
- Wrote frontend and backend code
- Made it easier for the data team to quickly sift through thousands of tables
- Reduced need for humans to classify transactions manually
- Refined frontend, backend, and modeling code for production use
- Analyzed model performance
- Integrated with 3rd parties to collect and parse accounting reports
- Ran analysis to determine healthy accounting patterns
- Created portal for calculating and viewing insights
- Wrote services that respond in under 10ms (including network latency)
- Horizontally scaled a web server to handle up to 1 billion requests per day
- Designed micro-service architecture comprising 10+ services
- Productionized machine learning model to compute bids
- Helped scale product from $0 to ~$10M in annual revenue
- Rigorously tested the system and wrote monitoring
- Generated graph from millions of data points from several sources
- Designed data model where graph was iteratively updated in-place
- Resolved business attributes from a connected component of the graph
- Wrote data pipeline to calculate purchase intent of businesses
- Integrated Tensorflow model from data science team into pipeline
- Architected pipeline to leverage business identity graph
- Led a rewrite of the service
- Integrated open-source browser fingerprinting
- Wrote jobs to send billions of events into the system every month
- Integrated with 3rd parties using cookie matching
- Debugged / resolved problems around user-identity mappings
- Collaborated with Data Science to incorporate intent modeling
Hackathons
- Used an RNN model
- Trained using a GPU
- Wrote email proxy service that obfuscated the sender’s address
- Won the hackathon
- Generated graph of ~8M posts and their relationships
- Used page rank and clustering algorithms to calculate latent clusters
- Visualized clusters using t-SNE
- Used LDA to cluster a few hundred texts from a personal feed
- Visualized text clusters using pyLDAvis