💭 Building the machinery beneath mind.
  • Varanasi

Organizations

@ivehement

jvoltci/README.md

Jai Prakashsingh: LLM Inference & AI Systems Engineer

Going deep on the layer below the model: LLM serving engines, KV-cache and attention internals, and GPU kernels, all built from scratch.

  • 🌐 jvoltci.github.io: the climb, and the log
  • πŸ“š Mosaic: my open course on AI systems, ML compilers, and inference (7 tracks)
  • πŸ”— LinkedIn
  • πŸ›  Currently building: a from-scratch LLM inference engine (mini-vLLM). Benchmarks soon.

Pinned

  1. stream-md (Public)

    Streaming markdown for LLMs. 300x fewer chars parsed per token.

    TypeScript 1

  2. naina (Public)

    An embeddable computer-vision runtime for face & person understanding. C++ core, plug-and-play bindings, runs everywhere, from Pi to phone to GPU server.

    C++

  3. ivehement/saf (Public)

    Flutter plugin that uses the Storage Access Framework (SAF) API to access files and folders and perform operations on them.

    Kotlin 25 39