Things I've Built
A mix of professional work projects and personal learning experiments. Some saved money, some were just for fun, all taught me something useful.
Snowflake Performance Optimization Framework
Built an automated testing framework in Python and Snowpark to analyze query performance and optimize warehouse sizing. This wasn't just theory; it actually reduced our Snowflake compute costs by 38% while improving query response times.
38% cost reduction
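Roughly the kind of check the framework automated, as a minimal sketch against Snowflake's ACCOUNT_USAGE.QUERY_HISTORY view via Snowpark. The connection parameters, thresholds, and warehouse names here are placeholders, not the production logic:

```python
# Sketch: flag warehouses whose queries queue heavily (undersized) or
# spill to remote storage (memory pressure). Thresholds are illustrative.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import avg, col, count, sum as sum_

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
}).create()

report = (
    session.table("snowflake.account_usage.query_history")
    .filter(col("warehouse_name").is_not_null())
    .group_by("warehouse_name", "warehouse_size")
    .agg(
        count("query_id").alias("queries"),
        avg("total_elapsed_time").alias("avg_elapsed_ms"),
        avg("queued_overload_time").alias("avg_queued_ms"),
        sum_("bytes_spilled_to_remote_storage").alias("remote_spill_bytes"),
    )
)

for row in report.collect():
    # Heavy queueing suggests upsizing; remote spillage means the workload
    # needs more memory than the current warehouse size provides.
    if (row["AVG_QUEUED_MS"] or 0) > 1000 or (row["REMOTE_SPILL_BYTES"] or 0) > 0:
        print(f"{row['WAREHOUSE_NAME']} ({row['WAREHOUSE_SIZE']}): review sizing")
```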
ERP Migration Data Infrastructure
Core data engineer on Sub-Zero's migration from Infor XA to SAP, building the dimensional models, ETL pipelines, and analytics infrastructure that'll support the new system. Using dbt for transformations, Fivetran for ingestion, and Terraform for infrastructure as code.
Enterprise-wide migration
HIPAA-Compliant Security Framework
Designed and implemented row-level and column-level security patterns in Snowflake to handle PHI data. This was for HITRUST certification, so it had to be airtight. Created reusable patterns that other teams could implement without reinventing the wheel.
HITRUST certification
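The actual policies are Sub-Zero-specific, but the reusable Snowflake pattern looks like this sketch, run here through Snowpark. All table, column, and role names are hypothetical:

```python
# Sketch of the pattern: a masking policy for PHI columns plus a row
# access policy driven by an entitlements table. Names are hypothetical.
from snowflake.snowpark import Session

CONN = {"account": "<account>", "user": "<user>", "password": "<password>"}
session = Session.builder.configs(CONN).create()

statements = [
    # Column-level: mask PHI unless the role is explicitly allowed.
    """CREATE OR REPLACE MASKING POLICY phi_mask AS (val STRING) RETURNS STRING ->
         CASE WHEN CURRENT_ROLE() IN ('PHI_FULL_READER') THEN val
              ELSE '*** MASKED ***' END""",
    "ALTER TABLE patients MODIFY COLUMN ssn SET MASKING POLICY phi_mask",
    # Row-level: only return rows for facilities mapped to the current role.
    """CREATE OR REPLACE ROW ACCESS POLICY facility_filter AS (facility_id STRING)
         RETURNS BOOLEAN ->
         EXISTS (SELECT 1 FROM role_facility_map m
                 WHERE m.role_name = CURRENT_ROLE()
                   AND m.facility_id = facility_id)""",
    "ALTER TABLE patients ADD ROW ACCESS POLICY facility_filter ON (facility_id)",
]
for ddl in statements:
    session.sql(ddl).collect()
```

The point of routing everything through policies attached to tables, rather than per-query filters, is that downstream teams inherit the protection automatically.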
Teradata to Snowflake Migration
Migrated legacy DataStage/Teradata/Unix pipelines to Snowflake, rebuilding the ETL in Python and Azure Data Factory and cutting batch failures to improve reliability.
21% faster pipelines, 81% less manual work (Snowflake optimization)
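One generic reliability pattern that cuts down on manual reruns, sketched here with stdlib only; `load_batch` is a hypothetical step, and whether the production code looked exactly like this is a simplification:

```python
# Sketch: retry a flaky batch step with exponential backoff instead of
# failing the entire run and paging someone to rerun it by hand.
import logging
import time

def run_with_retries(step, *, attempts=3, base_delay_s=30):
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            logging.exception("attempt %d/%d failed", attempt, attempts)
            if attempt == attempts:
                raise  # retries exhausted; surface the failure
            time.sleep(base_delay_s * 2 ** (attempt - 1))

def load_batch():
    ...  # hypothetical: trigger an Azure Data Factory pipeline run and poll it

run_with_retries(load_batch)
```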
NBA Data Warehouse
Final project for my Data Warehousing class at BU. Built a full star schema warehouse to analyze NBA game statistics. Fact tables for scoring, dimension tables for players, teams, coaches, and a custom date dimension for seasons. This is where I learned dimensional modeling before doing it professionally.
Academic capstone project
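A condensed sketch of the star shape, using SQLite for illustration; the real warehouse had more columns and dimensions than shown here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_player (player_key INTEGER PRIMARY KEY, name TEXT, position TEXT);
CREATE TABLE dim_team   (team_key   INTEGER PRIMARY KEY, name TEXT, conference TEXT);
CREATE TABLE dim_coach  (coach_key  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date   (date_key   INTEGER PRIMARY KEY, game_date TEXT, season TEXT);

-- The grain: one row per player per game.
CREATE TABLE fact_scoring (
    player_key INTEGER REFERENCES dim_player(player_key),
    team_key   INTEGER REFERENCES dim_team(team_key),
    date_key   INTEGER REFERENCES dim_date(date_key),
    points INTEGER, rebounds INTEGER, assists INTEGER
);
""")

# The classic star-schema query shape: join the fact to dimensions, aggregate.
conn.execute("""
SELECT d.season, p.name, SUM(f.points)
FROM fact_scoring f
JOIN dim_date d   ON d.date_key = f.date_key
JOIN dim_player p ON p.player_key = f.player_key
GROUP BY d.season, p.name
""")
```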
Job Application Tracking Database
When I was applying to jobs, I built my own database to track applications, communications, assessments, and offers. Properly normalized tables, stored procedures, the whole deal. Seemed easier than using a spreadsheet, and it gave me practice with SQL during my job search.
Personal productivity tool
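A trimmed-down sketch of the normalized shape (SQLite here for portability; the original also had stored procedures, which SQLite doesn't support). Table and column names are simplified:

```python
import sqlite3

conn = sqlite3.connect("job_search.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS companies (
    company_id INTEGER PRIMARY KEY, name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS applications (
    application_id INTEGER PRIMARY KEY,
    company_id INTEGER NOT NULL REFERENCES companies(company_id),
    role TEXT NOT NULL, applied_on TEXT NOT NULL, status TEXT NOT NULL
);
-- One application, many touchpoints: the one-to-many relationship a
-- flat spreadsheet can't model cleanly.
CREATE TABLE IF NOT EXISTS communications (
    comm_id INTEGER PRIMARY KEY,
    application_id INTEGER NOT NULL REFERENCES applications(application_id),
    occurred_on TEXT NOT NULL, channel TEXT, notes TEXT
);
""")

# The payoff: queries a spreadsheet makes painful, e.g. stale applications.
stale = conn.execute("""
SELECT c.name, a.role, a.applied_on
FROM applications a JOIN companies c USING (company_id)
WHERE a.status = 'applied'
  AND a.applied_on < date('now', '-14 days')
""").fetchall()
```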
Beijing Air Quality Analysis
For my master's final project, I analyzed Beijing air quality data and weather patterns to predict pollution levels. Used R and machine learning models to identify patterns. Living in China gave me firsthand experience with Beijing's air quality issues; this project let me actually dig into the data.
Academic research project
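The original analysis was in R; as a sketch of the modeling step, here's the Python equivalent, assuming the column layout of the public UCI Beijing PM2.5 dataset (DEWP, TEMP, PRES, Iws, plus the pm2.5 target). The file path and model choice are illustrative:

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Column names follow the public UCI Beijing PM2.5 dataset; the CSV path
# is hypothetical.
df = pd.read_csv("beijing_pm25.csv").dropna(subset=["pm2.5"])
features = ["DEWP", "TEMP", "PRES", "Iws", "month", "hour"]  # weather + time

# A random split leaks time-series structure; a date-based split is
# stricter, but this keeps the sketch short.
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["pm2.5"], test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print("MAE (ug/m^3):", mean_absolute_error(y_test, model.predict(X_test)))
```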
Data Quality Monitoring Platform
Set up comprehensive data quality monitoring using dbt tests and Monte Carlo. Configured automated alerts, data lineage tracking, and validation rules. Now when something breaks, we actually know about it before the business does.
Proactive data reliability
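The production checks live in dbt's YAML tests and Monte Carlo's monitors, not hand-rolled Python; but the underlying pattern is "run a violation-count query, alert on nonzero," sketched here standalone with hypothetical table names and Slack webhook:

```python
# Each check is a SQL query returning a count of violations (or 1/0 for
# freshness). In production the same rules are dbt not_null/freshness
# tests plus Monte Carlo monitors; everything here is illustrative.
import requests
from snowflake.snowpark import Session

CHECKS = {
    "orders: no null keys":
        "SELECT COUNT(*) FROM analytics.orders WHERE order_id IS NULL",
    "orders: loaded within 24h":
        """SELECT CASE WHEN MAX(loaded_at) <
               DATEADD('hour', -24, CURRENT_TIMESTAMP()) THEN 1 ELSE 0 END
           FROM analytics.orders""",
}

def run_checks(session: Session, webhook_url: str) -> None:
    for name, sql in CHECKS.items():
        violations = session.sql(sql).collect()[0][0]
        if violations:
            requests.post(webhook_url, json={"text": f"DQ check failed: {name}"})
```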