Pinned

  1. llm-d-incubation/ig-wva Public

    Workload Variant Autoscaler is a service to compute the cost-optimal provisioning of heterogeneous accelerators for inference workloads with varying request latency objectives

    Jupyter Notebook · 2 stars · 2 forks

  2. gateway-api-inference-extension Public

    Forked from kubernetes-sigs/gateway-api-inference-extension

    Gateway API Inference Extension

    Go 2

  3. llm-d Public

    Forked from llm-d/llm-d

    llm-d is a Kubernetes-native high-performance distributed LLM inference framework

    Shell

  4. kubernetes-sigs/gateway-api-inference-extension Public

    Gateway API Inference Extension

    Go · 704 stars · 293 forks