  • ING Bank
  • Eindhoven, Netherlands

devjadhav/README.md

Hi there, I'm Dev Jadhav 👋

🚀 Production AI/ML Engineer | Building Enterprise Multi-Agent Systems | AI Safety Advocate

I'm obsessed with the intersection of reliability engineering and autonomous intelligence: building agentic AI systems that don't just "chat" but act, diagnosing infrastructure, mitigating incidents, and reasoning over complex telemetry with minimal human-in-the-loop latency.

Currently: Lead Machine Learning Engineer @ ING Nederland

🎯 Core Technical Focus

🤖 Agentic Architectures
Designing stateful, multi-agent systems and custom orchestration layers to solve non-deterministic problems in SRE and DevOps.

βš™οΈ LLMOps & Evals
Engineering rigorous evaluation harnesses and CI/CD pipelines for non-deterministic software, ensuring safety and alignment in enterprise deployments.

⚡ ML System Optimization
Accelerating inference (vLLM, TGI, quantization) and training (3D parallelism: DP/TP/PP) for open-weights models (Mistral, Gemma, Llama) to achieve cost-effective scale.

🔬 Building in Public

🧠 DeepSeek Implementation Series
Built educational implementations of DeepSeek-R1 and DeepSeek-V3.2 from scratch: every tensor operation, every attention mechanism, every expert routing decision.

Three implementations:

  • Rust + Candle (Metal/GPU): inference-optimized for Apple Silicon
  • PyTorch (CUDA/MPS/CPU): distributed training with Flash Attention
  • MLX: native Apple Silicon development

Key architectures implemented:

  • Multi-Head Latent Attention (MLA): 93% KV cache reduction
  • DeepSeek Sparse Attention (DSA): hybrid local + dilated patterns
  • 256-expert MoE with hierarchical routing
  • Multi-Token Prediction for improved sample efficiency
  • 5D Parallelism (Tensor, Pipeline, Data, Expert, Sequence)

πŸ” Current Research

  • Model Context Protocol (MCP) for standardizing agent-observability integrations
  • GNN-based anomaly detection for 3D parallelism in distributed training systems
  • DualPipe implementation for advanced pipeline parallelism

πŸ› οΈ Production AI Engineer

My background spans the full stack of computational hardness, from embedded C++ optimization on ARM microcontrollers to distributed training pipelines on AWS/GCP. This "bits to billions" perspective allows me to build AI systems that are not only intelligent but fundamentally performant and secure.

Tech Stack:

  • Languages: Python, Rust, Go, C++
  • ML/AI: PyTorch, JAX, CUDA, MLX, Candle, TensorFlow
  • LLM Infra: vLLM, TGI, Ray, Modal, DeepSpeed, FSDP, Flash Attention
  • MLOps: Kubernetes, Docker, Airflow, MLflow, ZeRO, 3D Parallelism
  • Cloud: AWS (GenAI SME), GCP, Azure

📚 Sharing Knowledge

✍️ Technical Writing
17+ articles on MLwithDev (Medium) covering:

  • Production MLOps and the messy realities of production AI
  • Multimodal AI and advanced architectures
  • Security testing of ML systems
  • How to make models reliable and handle failures gracefully

πŸ† Career Highlights

  • πŸ… Employee of the Year 2022 @ Lox Solution
  • πŸ… AWS Subject Matter Expert β€” Generative AI Certification
  • πŸ“ˆ 15% accuracy improvement in breast cancer detection using GANs @ ScreenPoint Medical
  • πŸš€ 40% faster insights, 45% reduced delivery time in multi-cloud AI/ML solutions
  • ⚑ 30% performance gains in production Data&ML systems

💡 What Drives Me

I bring the rare ability to understand both cutting-edge AI architectures and the operational realities of serving them at scale, which is exactly what's needed to bridge research and production.

My work focuses on the messy realities: How to make models reliable. How to handle failures gracefully. How to build systems that scale beyond proof-of-concept.

🌐 Connect & Collaborate

I'm passionate about production AI systems, multi-agent architectures, and AI safety.


💡 Open to: Research collaborations, speaking opportunities at AI/ML conferences, and impactful roles bridging cutting-edge AI research with production systems.

⭐ Currently working on: Building DeepSeek implementations on three different backends (Rust, PyTorch, MLX)

Pinned

  1. Practical-Data-Science-on-the-AWS-Cloud-Specialization (Public, Jupyter Notebook)
  2. Udacity-SQL-Nanodegree (Public)
  3. AI-Programming-With-Python (Public, HTML)
  4. Microsoft-Azure-ML-Udacity (Public, Jupyter Notebook): Udacity Nanodegree projects for Microsoft Azure ML
  5. Udacity-Agile-Software-Development (Public)
  6. Udacity-Data-Engineering-Nanodegree (Public, Python)