Bhavya Gada

Data Engineer

2+ years designing secure, high-throughput data pipelines in AWS-based enterprise environments

About Me

🛠️ Behind the Data & AI 🧑‍💻

Hello! I'm Bhavya, a passionate Data Engineer with 2+ years of experience designing, building, and operating secure, high-throughput data pipelines in AWS-based, regulated enterprise environments. I have a track record of owning production systems end-to-end and of optimizing CDC ingestion and distributed processing to scale data volumes while meeting strict accuracy, latency, and SLA requirements. My background spans SQL and Python, cloud-native data platforms, and SDLC-driven delivery.

🚀 Current: Data Engineer @ UPS (Contract), Jul 2024 - Present

Own and operate mission-critical payroll and financial reporting datasets supporting downstream analytics, audits, and compliance workflows with strict accuracy, timeliness, and SLA guarantees. Design, develop, and maintain scalable CDC-based ingestion pipelines on AWS using S3 as the system of record, Glue and EMR for distributed processing, and Redshift for analytical storage. Re-architected CDC ingestion workflows enabling reliable processing of 10× higher data volumes while remaining within SLA.

AWS S3, Glue & EMR, Redshift, CDC Pipelines

🔒 Privacy Data Engineer @ Ardent Privacy, Jul 2023 - Jun 2024

Designed and operated compliance-centric data pipelines powering enterprise privacy operations (DSAR, consent, audit, retention) across healthcare, finance, and government clients. Built event-driven ingestion pipelines using Kafka and AWS-native messaging patterns with exactly-once semantics and end-to-end auditability. Embedded privacy controls including data classification, masking, tokenization, and pseudonymization aligned with GDPR and HIPAA requirements.

🎓 Software Developer & Graduate Assistant @ UMBC, Sep 2022 - Jun 2023

Engineered and maintained production web systems for university-wide platforms supporting academic and administrative workflows. Led accessibility remediation initiatives to achieve WCAG 2.1 and WCAG 2.2 compliance, aligning with Section 508 and ADA requirements. Developed backend integrations and data-backed services using Python and SQL, supporting reporting and operational use cases.

💻 Full Stack Software Developer @ Virtuals Design, Apr 2020 - Aug 2022

Designed and developed backend services for multi-tenant SaaS platforms supporting thousands of daily users and data-driven business workflows. Built and maintained Java-based backend components for transactional processing, data validation, and integration with downstream systems. Developed RESTful APIs using Java, Python, and SQL to support data ingestion, transformation, and reporting use cases.

Tech Expertise

I deliver production-ready data engineering solutions with expertise in AWS data platforms, distributed processing, and CDC-based ingestion pipelines. My focus is on building reliable, scalable data systems that support analytics, reporting, and compliance workloads.

Python, SQL, AWS, Airflow, Kafka, Glue & EMR, Redshift, Docker & K8s

🏆 Achievements & Certifications

  • 🥇 IIT Bombay eYRC Finalist - Autonomous quadcopter rescue system
  • 🏆 Smart India Hackathon National Finalist - Analytics dashboard for Adani Ports
  • ☁️ Google Cloud Professional Data Engineer & Cloud Architect
  • 🧱 Databricks Certified Professional Data Engineer
  • ☁️ AWS Solutions Architect Professional
  • ⚙️ Certified Kubernetes Administrator (CKA)
  • 🔒 IAPP Certified Information Privacy Technologist (CIPT)

💬 Let's Connect!

I love connecting with engineers, students, and innovators! I share insights and provide advice on cloud architecture, data engineering, AI/ML, studying abroad, and career growth.

When I'm not coding or mentoring, you'll find me exploring new AI frameworks and staying at the cutting edge of responsible AI and MLOps.

  • Data Engineering: 2+ years of data engineering experience
  • Performance: 10× scale, CDC throughput increase
  • Cloud Platforms: AWS native (S3, Glue, EMR, Redshift)
  • Compliance: HIPAA/GDPR, privacy and compliance

Technical Skills

Data Engineering

Python
SQL
Git
CI/CD
ANSI SQL
Window Functions
CTEs

BigQuery & Data Platforms

BigQuery
Partitioning
Clustering
Query Optimization
Dimensional Modeling
Data Vault

Batch & Event Pipelines

Airflow
Cloud Pub/Sub
Kafka
Batch Processing
GCP
Cloud Storage
Dataflow

Data Quality & Compliance

Great Expectations
Data Validation
Monitoring & Alerting
Root Cause Analysis
HIPAA
SOC 2
GDPR

Analytics & BI

Looker
LookML
Dashboarding
Self-service BI
Grafana
Tableau

AI Data Systems & On-Call

Vector Embeddings
Knowledge Graphs
FAISS
Dataset Versioning
PagerDuty
Runbooks
SLA Management

Key Achievements

Delivering impactful solutions at scale

CDC Pipeline Architecture

Re-architected CDC ingestion workflows enabling reliable processing of 10× higher data volumes while remaining within SLA using AWS S3, Glue, EMR, and Redshift

10× throughput increase
SLA compliance

On-Call & Incident Response

Primary on-call engineer for data pipelines, performing deep root-cause analysis on production failures and implementing durable fixes to prevent recurrence

SLA adherence
Root-cause analysis

Privacy & Compliance Engineering

Designed compliance-centric pipelines powering DSAR, consent, audit, and retention operations aligned with HIPAA, GDPR, SOC 2, and PCI DSS

Sub-72-hour DSAR SLA
Zero audit exceptions

Query & Cost Optimization

Tuned Redshift workloads through distribution styles, sort keys, and incremental load strategies to improve query performance and stabilize reporting

Reduced costs
Performance tuning

Data Quality & Validation

Implemented automated data quality validations, freshness SLIs, and reconciliation checks at multiple pipeline stages to prevent silent data corruption

Pre-release validation
Automated monitoring

CI/CD & Infrastructure

Supported CI/CD-driven pipeline deployments using infrastructure-as-code and SDLC best practices across development, UAT, and production environments

Safe deployments
Environment parity

Education

Master of Science in Information Systems

University of Maryland Baltimore County

Get In Touch

Let's collaborate on your next big project

🚀 Ready to Build Something Amazing?

I'm always excited to discuss new opportunities, innovative projects, and ways to leverage data and AI to solve complex challenges.
