I'm a Data & AI Engineer passionate about designing and building end-to-end data solutions, from raw ingestion to AI-powered insights. I specialize in cloud-native data architectures, real-time streaming pipelines, and machine learning integrations using modern data stack technologies.
- Building production-grade ETL/ELT pipelines on Azure, AWS & GCP
- Developing AI/ML-powered data workflows & RAG (Retrieval-Augmented Generation) systems
- Engineering real-time streaming solutions with Apache Spark, Kafka & Flink
- Designing dimensional data models (Star/Snowflake schema) for analytics
- Optimizing data platforms for scalability, reliability, and performance
- Microsoft Azure Data Engineer Associate (DP-203): in progress
| Project | Description | Tech Stack |
|---|---|---|
| Apache Spark Portfolio | End-to-end Spark data engineering solutions with local vs. global sort optimizations | PySpark, Scala |
| Azure Data Engineer (DP-203) | Azure-based ETL/ELT pipelines built in preparation for the DP-203 certification | Azure Data Factory, Synapse, ADLS |
| AI Chat RAG Workflow | Retrieval-Augmented Generation pipeline for intelligent document Q&A | Python, LangChain, OpenAI |
| News Trend Data Pipeline | Real-time news trend ingestion and analytics pipeline | Python, Airflow, Kafka |
| Dimensional Modeling - NBA | Star schema dimensional model for NBA analytics | SQL, PostgreSQL |
| Kubernetes Data Engineer | Containerized data pipeline deployment with Kubernetes | Kubernetes, Docker, Python |
| SQL Deep Dive | Advanced SQL techniques: window functions, CTEs, optimization | SQL, Jupyter Notebook |
I'm always open to discussing data engineering, AI/ML projects, cloud architecture, or opportunities in consulting and technology.
"Turning raw data into actionable intelligence, one pipeline at a time."
