Dev Jadhav DevJadhav

Hi there, I'm Dev Jadhav 👋

🚀 Production AI/ML Engineer | Building Enterprise Multi-Agent Systems | AI Safety Advocate

I'm obsessed with the intersection of reliability engineering and autonomous intelligence—building Agentic AI systems that don't just "chat" but act: diagnosing infrastructure, mitigating incidents, and reasoning over complex telemetry with minimal human-in-the-loop latency.

Currently: Lead Machine Learning Engineer @ ING Nederland

🎯 Core Technical Focus

🤖 Agentic Architectures
Designing stateful, multi-agent systems and custom orchestration layers to solve non-deterministic problems in SRE and DevOps.

⚙️ LLMOps & Evals
Engineering rigorous evaluation harnesses and CI/CD pipelines for non-deterministic software, ensuring safety and alignment in enterprise deployments.

⚡ ML System Optimization
Accelerating inference (vLLM, TGI, quantization) and training (3D parallelism: DP/TP/PP) for open-weights models (Mistral, Gemma, Llama) to achieve cost-effective scale.
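
As a rough illustration of how DP/TP/PP compose, the sketch below maps a flat worker rank to its coordinates in a 3D parallel grid. The function name `rank_to_coords` and the ordering (tensor-parallel innermost, then data-parallel, then pipeline-parallel) are assumptions made for this example; real frameworks let you configure the ordering, and this is not taken from any project mentioned here.

```python
def rank_to_coords(rank, dp, tp, pp):
    """Map a flat rank to (dp, tp, pp) coordinates.

    Assumed layout for this sketch: tensor-parallel index varies fastest,
    then data-parallel, then pipeline-parallel.
    """
    world_size = dp * tp * pp  # every worker holds exactly one grid cell
    if not 0 <= rank < world_size:
        raise ValueError(f"rank must be in [0, {world_size})")
    tp_rank = rank % tp                # position inside the tensor-parallel group
    dp_rank = (rank // tp) % dp        # which data-parallel replica
    pp_rank = rank // (tp * dp)        # which pipeline stage
    return {"dp": dp_rank, "tp": tp_rank, "pp": pp_rank}


# 2 replicas x 4 tensor shards x 2 pipeline stages = 16 workers
coords = rank_to_coords(13, dp=2, tp=4, pp=2)
```

Keeping the tensor-parallel index innermost places each tensor-parallel group on adjacent ranks, which usually share a node, so the most communication-heavy collective stays on the fastest interconnect.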

🔬 Building in Public

🧠 DeepSeek Implementation Series
Built educational implementations of DeepSeek-R1 and DeepSeek-V3.2 from scratch—every tensor operation, every attention mechanism, every expert routing decision.

Three implementations:

Rust + Candle (Metal/GPU) — Inference-optimized for Apple Silicon
PyTorch (CUDA/MPS/CPU) — Distributed training with Flash Attention
MLX — Native Apple Silicon development

Key architectures implemented:

Multi-Head Latent Attention (MLA) — 93% KV cache reduction
DeepSeek Sparse Attention (DSA) — Hybrid local + dilated patterns
256-expert MoE with hierarchical routing
Multi-Token Prediction for improved sample efficiency
5D Parallelism (Tensor, Pipeline, Data, Expert, Sequence)
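
To give a feel for what an expert routing decision involves, here is a minimal sketch of generic top-k routing in plain Python. It is not the hierarchical routing listed above, nor code from these implementations: the function name `top_k_route`, the softmax over the selected logits, and the toy logits are all assumptions chosen for illustration.

```python
import math


def top_k_route(logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    if not 0 < k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest router logits, highest first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected experts.
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Toy router output for one token over four hypothetical experts.
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
```

With these toy logits the token is sent to experts 1 and 3, and expert 1 gets the larger weight. A real MoE layer would apply the same idea per token across many more experts and typically adds a load-balancing mechanism.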

🔍 Current Research

Model Context Protocol (MCP) for standardizing agent-observability integrations
GNN-based anomaly detection for 3D parallelism in distributed training systems
DualPipe implementation for advanced pipeline parallelism

🛠️ Production AI Engineer

My background spans the full computing stack, from embedded C++ optimization on ARM microcontrollers to distributed training pipelines on AWS/GCP. This "bits to billions" perspective lets me build AI systems that are not only intelligent but also performant and secure.

Tech Stack:

Languages: Python, Rust, Go, C++
ML/AI: PyTorch, JAX, CUDA, MLX, Candle, TensorFlow
LLM Infra: vLLM, TGI, Ray, Modal, DeepSpeed, FSDP, Flash Attention
MLOps: Kubernetes, Docker, Airflow, MLflow, ZeRO, 3D Parallelism
Cloud: AWS (GenAI SME), GCP, Azure

📚 Sharing Knowledge

✍️ Technical Writing
17+ articles on MLwithDev (Medium) covering:

Production MLOps and the messy realities of production AI
Multimodal AI and advanced architectures
Security testing of ML systems
How to make models reliable and handle failures gracefully

🏆 Career Highlights

🏅 Employee of the Year 2022 @ Lox Solution
🏅 AWS Subject Matter Expert — Generative AI Certification
📈 15% accuracy improvement in breast cancer detection using GANs @ ScreenPoint Medical
🚀 40% faster insights, 45% reduced delivery time in multi-cloud AI/ML solutions
⚡ 30% performance gains in production data and ML systems

💡 What Drives Me

I bring the rare ability to understand both cutting-edge AI architectures and the operational realities of serving them at scale—exactly what's needed to bridge research and production.

My work focuses on the messy realities: How to make models reliable. How to handle failures gracefully. How to build systems that scale beyond proof-of-concept.

🌐 Connect & Collaborate

I'm passionate about production AI systems, multi-agent architectures, and AI safety.

💼 LinkedIn
✉️ devj7594@gmail.com
📝 Medium - MLwithDev
🚀 deviahc.com - SRE Agent Services

💡 Open to: Research collaborations, speaking opportunities at AI/ML conferences, and impactful roles bridging cutting-edge AI research with production systems.

⭐ Currently working on: DeepSeek implementations on three different backends (Rust, PyTorch, MLX)
