Releases: kubernetes-sigs/agent-sandbox
v0.5.0
🚀 Announcing Agent Sandbox v0.5.0!
We're excited to announce the release of Agent Sandbox v0.5.0! This release marks a significant milestone with the official graduation of our APIs to v1beta1, bringing enhanced stability, critical security hardening, and a wealth of new features and improvements across the platform, client SDKs, and examples. Dive in to experience a more robust and developer-friendly Agent Sandbox.
⚠️ Breaking Changes / Action Required
- API Group Upgrade and Deprecation (v1alpha1 to v1beta1):
  - The core and extension APIs (agents.x-k8s.io and extensions.agents.x-k8s.io) have been officially graduated from v1alpha1 to v1beta1.
  - v1alpha1 APIs are now deprecated. While multi-version CRD support is introduced with a conversion webhook, users are strongly encouraged to migrate their v1alpha1 resources to v1beta1.
  - Action Required: Update your manifests and API interactions to use apiVersion: agents.x-k8s.io/v1beta1 and apiVersion: extensions.agents.x-k8s.io/v1beta1. Refer to the API Migration Guide for detailed steps.
- Sandbox spec.replicas Removed, spec.operatingMode Introduced:
  - The spec.replicas field has been removed from the Sandbox API and replaced with spec.operatingMode (with values Running and Suspended).
  - This is a breaking change for any automation or tools that relied on spec.replicas for scaling (e.g., kubectl scale, HorizontalPodAutoscalers, PodDisruptionBudgets).
  - Action Required: Update your Sandbox manifests and any scaling logic to use spec.operatingMode for managing Sandbox lifecycle.
- SandboxClaim spec.templateRef Replaced by spec.warmpoolRef:
  - The SandboxClaim API no longer uses spec.templateRef or the warmpool policy field. Instead, claims must explicitly point to a SandboxWarmPool using spec.warmpoolRef.
  - To achieve a cold start without pre-warming, cluster administrators should create a SandboxWarmPool with replicas: 0 for users to reference.
  - Action Required: Update SandboxClaim manifests to reference spec.warmpoolRef pointing to an existing SandboxWarmPool resource.
- NetworkPolicy Namespace Restriction for sandbox-router:
  - The default NetworkPolicy generated by the SandboxTemplate controller now strictly scopes ingress rules to the agent-sandbox-system namespace for the sandbox-router.
  - Action Required: If your deployments are running the sandbox-router in a namespace other than agent-sandbox-system, you must migrate and deploy the sandbox-router inside agent-sandbox-system prior to or in tandem with upgrading the controller to avoid service interruption.
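The apiVersion part of the migration lends itself to scripting. The following is a minimal sketch, not the project's migration tooling: the function name migrate_api_version and the sample manifest are made up for illustration, and the script only rewrites unquoted apiVersion lines. Field-level changes (spec.replicas, spec.templateRef) still need manual edits, and the API Migration Guide remains the authoritative procedure.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Rewrite v1alpha1 apiVersions to v1beta1 in a manifest read from stdin.
# Only unquoted apiVersion lines for agents.x-k8s.io and
# extensions.agents.x-k8s.io are touched; every other line passes through.
# (migrate_api_version is a hypothetical helper name.)
migrate_api_version() {
  sed -E 's#^([[:space:]]*apiVersion:[[:space:]]*(extensions\.)?agents\.x-k8s\.io/)v1alpha1[[:space:]]*$#\1v1beta1#'
}

# Example input: a hypothetical, deliberately incomplete Sandbox manifest.
migrated="$(printf 'apiVersion: agents.x-k8s.io/v1alpha1\nkind: Sandbox\nmetadata:\n  name: demo\n' | migrate_api_version)"
printf '%s\n' "$migrated"
```

To convert a file, redirect it through the function, for example migrate_api_version < sandbox.yaml > sandbox.v1beta1.yaml (both file names are placeholders), and review the result before applying it.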
Key Highlights
Core API & Platform Stability
- API Graduation & Multi-Version Support: Official graduation of core and extension APIs to v1beta1, including multi-version CRD support with conversion webhooks for v1alpha1 compatibility during migration (#817, #993).
- Sandbox Lifecycle Management: Replaced spec.replicas with spec.operatingMode for more explicit control over Sandbox suspension and resume behavior (#801).
- SandboxClaim Enhancements: SandboxClaim now uses spec.warmpoolRef for clearer warm pool association and gained printer columns for improved kubectl get visibility (#899, #984).
- Optimized Warm Pool Operations: Enabled parallel creation and deletion of sandboxes within SandboxWarmPool controller, significantly speeding up scale operations (#798).
- Improved Warm Pool Selection Strategy: Implemented a smart warm pool selection strategy that prioritizes ready sandboxes, spreads workloads across nodes, and optimizes for in-memory processing, reducing API overhead (#878, #939).
- Resource Adoption & Persistence: Fixed orphan adoption for Sandbox child resources and introduced explicit authorization for unowned resources to prevent hijacking (#944, #784).
- Performance Improvements: Switched SandboxClaim status updates to patching (.Patch()) to reduce conflicts at scale, improving overall system performance (#508).
- Helm Chart Enhancements: Added support for podSecurityContext, containerSecurityContext, podAnnotations, and podLabels in the controller Helm chart for better Kubernetes policy compliance and custom metadata injection (#753, #750).
- Storage Configuration via SandboxClaim: Introduced support for volume claim templates within SandboxClaims, enabling customized persistent volumes with policy-driven merging (#960).
- Warmpool Label Propagation: Enhanced warmpool label propagation from sandbox to pod, ensuring consistent identification across resources (#927).
- Preserve Zero Replica Counts: Fixed an issue where zero replica counts in warmpool status were not preserved during server-side apply operations (#807).
- Assigned Sandbox Name Storage: Switched to storing assigned Sandbox names in annotations instead of labels to bypass Kubernetes length constraints (#771).
Security & Hardening
- SSRF Protections: Disabled automatic HTTP redirects in both Go and Python SDKs to prevent Server-Side Request Forgery (SSRF) vulnerabilities from untrusted sandbox workloads (#874, #816).
- Router Security: Addressed an unauthenticated internal proxy vulnerability in the sandbox router with strict input validation and optional bearer token authentication (#755).
- Network Policy Enhancements: Default NetworkPolicy now blocks IPv6 link-local traffic and strictly scopes ingress to the agent-sandbox-system namespace for the sandbox-router for enhanced isolation (#827, #881).
- Build-time Injection Prevention: Sanitized git-derived version strings to prevent build-time command injection vulnerabilities (#946).
- Denial of Service (ReDoS) Fix: Replaced a vulnerable regex matching function with an iterative dynamic programming approach to resolve a ReDoS vulnerability (#935).
- Pod Metadata Protection: Protected system-reserved Pod labels and annotations from tenant override to prevent traffic hijacking or tracking label forging (#894).
- Warm Pool Poisoning Prevention: isAdoptable function now explicitly rejects unowned sandboxes to prevent warm pool poisoning (#875).
- OpenTelemetry Trace Sanitization: Sanitized sandbox.command attribute in OpenTelemetry traces to prevent sensitive data exposure (#895).
- CLI Tool Hardening: Fixed concurrency race conditions and stale PID cleanup issues in resourcectl CLI utility, preventing data loss and arbitrary process termination (#934, #902).
Client SDK & Developer Experience
- Dynamic Timeout Propagation: SDKs now support dynamic timeout propagation to the sandbox router, ensuring long-running operations are not prematurely terminated (#857).
- Python Async Client Cleanup: Added cleanup=True support to AsyncSandboxClient for automatic resource cleanup on program termination (#859).
- Python additionalPodMetadata Exposure: Exposed additionalPodMetadata in the Python client for direct control over Sandbox Pod labels and annotations (#979).
- Go Client PodIP Routing: Enabled PodIP routing in the Go client to resolve connection issues when Kubernetes DNS is unavailable for sandbox services (#910).
- Sandbox Client Improvements: Hardened filesystem path sanitization, improved label selectors, and enabled template-verified reattachment in the Python SDK (#695).
- PSS SDK Enhancements: Enabled restoration from dedicated snapshots and filtering by creation timestamp for the Python Snapshot SDK (#799, #732).
- CI/CD & Tooling: Optimized CI staging builds, increased promotion timeouts, and updated pyyaml dependency for CRD sorting during release publish (#1021). Improved AI code review configuration and guidelines for Copilot and CodeRabbit (#938, #936, #947, #866).
Examples & Documentation
- RL & Evals Example: Introduced agent-sandbox-rl, a complete Python package for multi-cluster warm-pool orchestration of RL and Evals workloads (#1000).
- Anthropic Agents Example: Added an example for running Anthropic Managed Agents self-hosted sandboxes on GKE Agent Sandbox (#950).
- Sandboxed Tools Enhancements: Improved sandboxed-tools examples to persist sessions and filesystem state across multiple tool calls and refactored tools into their own package (#888, #887, #877, #886).
- MCP Server Example: Provided an example for running an MCP server inside a sandbox with persistent storage (#937).
- AKS Kata Container Example: Added an AKS example demonstrating Kata Containers with sandbox warm pools (#839).
- Ray Integration: Documented an example on how to run a RayJob with Agent Sandbox via direct PodIP (#868, #742).
- Comprehensive Troubleshooting Guide: Added a detailed troubleshooting guide for debugging SDK, custom image, and cluster-level issues (#660).
- API & NetworkPolicy Documentation: Updated documentation to reflect v1beta1 API changes, clarified NodeLocal DNS walkthrough, and expanded NetworkPolicy guidance (#867, #823, #815).
- Issue Templates: Added structured GitHub issue templates for bug reports, feature requests, and epics, and improved their ordering (#880, #891).
Installation
Core & Extensions
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0/extensions.yaml
To upgrade from v0.4.6 to v0.5.0, please follow the detailed steps in [API Migration Guide...
v0.5.0rc1
🚀 Announcing Agent Sandbox v0.5.0rc1!
We're excited to announce the release candidate of Agent Sandbox v0.5.0! This pre-release introduces major API advancements with the v1beta1 upgrade, enhanced warm pool management, critical security hardening, and expanded developer tooling.
⚠️ Pre-Release Notice
This is a Release Candidate (RC) intended for early testing, validation, and feedback by maintainers and early adopters. It is not recommended for production environments.
Warning
Upgrading existing v1alpha1 API objects to v1beta1 is not yet supported (coming soon); users must install this version in a clean environment (no pre-existing v1alpha1 CRDs or CRs).
⚠️ Breaking Changes / Action Required
- API Group Upgrade (v1beta1) (#867): The core and extension APIs have been upgraded from v1alpha1 to v1beta1. All example manifests and documentation now reflect v1beta1.
- SandboxClaim Specification Overhaul (#899): The spec.templateRef field in SandboxClaim has been replaced with spec.warmpoolRef to better reflect warm pool architectural integration.
- System-Reserved Metadata Protection (#894): System-reserved Pod labels and annotations are now protected from tenant overrides to prevent privilege escalation and sandbox hijacking.
Key Highlights
- API Evolution & Stability
  - API Graduation to v1beta1: The core Agent Sandbox API has been graduated from v1alpha1 to v1beta1, marking a significant step towards maturity and stability. This involves dropping legacy alpha schemas and updating controllers for parity.
  - Sandbox Lifecycle Management: Replaced spec.replicas with a new spec.operatingMode field (supporting Running and Suspended) to provide more explicit and granular control over Sandbox suspension and resumption. This is a breaking change.
  - SandboxClaim API Refinement: The SandboxClaim API now uses a spec.warmPoolRef instead of spec.templateRef, simplifying how claims interact with warm pools and enhancing clarity. This is an action-required breaking change.
  - Granular Sandbox Suspend Condition: Introduced an explicit Suspended condition in the Sandbox status for more accurate tracking of sandbox states, supporting future features like process freezing.
  - Sandbox Template Ref Hash Propagation: The sandbox-template-ref-hash label is now consistently propagated to SandboxTemplate resources and adopted/cold-path Sandboxes, enabling easier client-side resolution of template-to-sandbox relationships.
  - Warm Pool Eviction: Implemented warm pool eviction using Cluster Autoscaler annotations, allowing idle, un-adopted Sandboxes to be marked as safe to evict.
  - Sandbox Name Annotation: The assigned Sandbox name is now stored in an annotation instead of a label to bypass Kubernetes' 63-character length constraint.
- Security Enhancements
  - Sandbox Router Hardening: Addressed vulnerabilities related to unauthenticated internal proxying by enforcing strict sandbox_id validation, implementing optional Bearer token authentication, and tightening NetworkPolicy scoping to the agent-sandbox-system namespace.
  - Pod Metadata Protection: Prevented tenants from overriding system-reserved Pod labels and annotations (agents.x-k8s.io/, extensions.agents.x-k8s.io/), mitigating potential traffic hijacking and spoofing.
  - Resource Hijacking Prevention: Introduced explicit label authorization (agents.x-k8s.io/adoptable: "true") before Sandboxes can adopt unowned Pods, Services, and PVCs. Previously owned objects can still be adopted without this label.
  - Python SDK Security: Disabled automatic HTTP redirects in SandboxConnector to prevent Server-Side Request Forgery (SSRF) attacks and sanitized OpenTelemetry trace attributes to prevent sensitive data exposure.
  - CI/Build Security: Fixed a Python module shadowing vulnerability in CI presubmits that could lead to Remote Code Execution (RCE) and added validation for KATA_VERSION to prevent path traversal.
  - IPv6 NetworkPolicy Hardening: The default NetworkPolicy now explicitly blocks IPv6 link-local traffic (fe80::/10), preventing untrusted code from accessing local services or cloud metadata endpoints.
  - Resourcectl PID Cleanup: Fixed a logic issue in resourcectl cleanup that could lead to arbitrary process termination due to stale heartbeat PIDs.
  - Analytics Tool Hardening: Patched a security vulnerability in examples/analytics-tool that allowed bypassing command execution allow-lists.
- Performance & Scalability
  - Parallel Warm Pool Operations: Enabled parallel creation and deletion of sandboxes in the Warm Pool controller, significantly reducing reconciliation times (up to 4.26x faster).
  - Warm Pool Selection Optimization: Optimized the NodeSpread sandbox selection strategy to run purely in-memory, drastically reducing API server overhead and improving P99 concurrent claim latency by up to 4x.
  - Claim Status Update Optimization: Switched to patching for SandboxClaim status updates to reduce conflicts and improve scalability.
  - Memory Leak Reduction: Implemented measures to catch memory leaks and reduce per-scrape allocations across controllers and clients.
- Python & Go SDK Improvements
  - Python SDK Client Enhancements: Added support for label selectors, hardened file upload path validation, enabled template-verified reattachment, and introduced shutdown_after_seconds for ephemeral sandboxes.
  - Python SDK Snapshot Restoration: Enabled restoration from dedicated snapshots, allowing sandboxes to be reverted to specific previous states.
  - Go SDK PodIP Routing: Implemented PodIP routing to fix connection issues with local sandbox-router gateways when cluster DNS is not available.
- Enhanced Developer Experience & Tooling
  - Standardized GitHub Issue Templates: Added structured YAML templates for bug reports, feature requests, and maintainer epics, along with a config.yml for clearer contact links.
  - AI Code Review Integration: Configured CodeRabbit for automated PR summaries and walkthroughs, and optimized Copilot instructions to align with project toolchain, linting, and review scope policies.
  - Helm Chart Flexibility: Added podAnnotations, podLabels, podSecurityContext, and containerSecurityContext options to the controller Helm chart for greater customization and compliance with cluster security policies.
  - Build System Updates: Bumped Go versions across the repository and updated GitHub Actions dependencies. The PyPI publish process was also updated to allow release candidate versions.
- Examples & Documentation
  - Sandboxed Tools Enhancements: Refactored tools into their own package, added functionality for persisting sessions across invocations, and enabled sandboxes to stay alive over multiple tool calls for faster execution.
  - New Example Workloads: Introduced a self-contained example for running an MCP server inside a sandbox with storage persistence, an AKS example using Kata Containers with sandbox warm pools, and a RayJob integration example.
  - Comprehensive Documentation Updates: All examples and documentation have been upgraded to reflect the v1beta1 API. New guides include detailed explanations of NetworkPolicy management, NodeLocal DNS with NetworkPolicy, and utilizing Dataplane-v2 for setup.
Installation
Core & Extensions
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0rc1/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0rc1/extensions.yaml
Python SDK
pip install k8s-agent-sandbox==0.5.0rc1
Contributors
We extend our sincere thanks to all contributors to this release:
@aditya-shantanu, @AlexBulankou, @armistcxy, @arpitjain099, @chw120, @dependabot[bot], @hrsh1209, @ianchakeres, @janetkuo, @justinsb, @lauragalbraith, @moficodes, @mvanhorn, @patcrombie, @rainwoodman, @rmalani-nv, @ryanzhang-oss, @shaikenov, @shelwinnn, @SHRUTI6991, @shrutiyam-glitch, @tom1299, @tomergee, @vicentefb
👋 New Contributors
- @AlexBulankou made their first contribution in #866
- @armistcxy made their first contribution in #885
- @arpitjain099 made their first contribution in #796
- @hrsh1209 made their first contribution in #753
- @ianchakeres made their first contribution in #906
- @lauragalbraith made their first contribution in #763
- @mvanhorn made their first contribution in #864
- @patcrombie made their first contribution in #803
- @rainwoodman made their first contribution in #711
- @rmalani-nv made their first contribution in #750
- @ryanzhang-oss made their first contribution in #839
- @shaikenov made their first contribution in #798
- @shelwinnn made their first contribution in #805
- @tom1299 made their first contribution in https://g...
v0.4.6
🚀 Announcing Agent Sandbox v0.4.6!
We're excited to announce the release of Agent Sandbox v0.4.6! This release introduces major scalability enhancements through opt-in Service management, robust developer guidance with AI agent skills, expanded API and Network Policy documentation, and new stateful AI agent examples.
⚠️ Breaking Changes / Action Required
- Service Creation Opt-In (#775, #800): The Sandbox controller no longer creates a headless Service by default for new Sandboxes. This architectural change significantly improves cluster scalability by eliminating kube-proxy and Kubernetes DNS overhead when scaling to thousands of pods. Existing Sandboxes with an auto-provisioned Service are preserved automatically.
  - Action Required: For new Sandboxes that require an auto-provisioned headless Service, explicitly set spec.service: true. To explicitly remove an existing Service, set spec.service: false.
  - New service field: Sandbox spec and SandboxTemplate spec now support the service boolean field to control the headless Service creation (default false). If omitted, existing services of Sandboxes will not be removed, to avoid disruption.
  - Python SDK & Router Integration: The Python SDK and sandbox-router have been updated to support direct Pod IP routing via the X-Sandbox-Pod-IP header, bypassing Service routing overhead. The SDK gracefully recovers from API server timeouts and disables Pod IP routing if permissions are lacking (falling back to Service routing).
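For an existing Sandbox, the spec.service opt-in can be expressed as a JSON merge patch. This is a hedged sketch: service_patch is a hypothetical helper, "my-sandbox" is a placeholder name, and the kubectl resource name "sandbox" assumes the Sandbox CRD is installed under that name. The cluster call is left as a comment so the script runs without cluster access.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a JSON merge patch that sets spec.service on a Sandbox.
# Accepts only the literal strings "true" or "false".
# (service_patch is a hypothetical helper name.)
service_patch() {
  case "${1:-}" in
    true|false) printf '{"spec":{"service":%s}}\n' "$1" ;;
    *) echo "usage: service_patch true|false" >&2; return 1 ;;
  esac
}

# Print the patch that opts a Sandbox in to a headless Service.
service_patch true

# Applying it requires cluster access, so it is shown as a comment only.
# "my-sandbox" is a placeholder and "sandbox" is an assumed resource name:
#   kubectl patch sandbox my-sandbox --type merge -p "$(service_patch true)"
```

Passing false instead produces the patch that explicitly removes an existing Service, matching the behaviour described above.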
Key Highlights
- Core Stability and Lifecycle Management
  Fixed an issue where the sandbox name hash (selector label) was unavailable when a sandbox was scaled down to zero replicas during suspension (#754). status.labelselector is no longer unset when replicas is 0. If the hash cannot be resolved, suspension fails gracefully with a clear error reason. Added integration tests for suspend/resume on new client instances.
- AI Agent Skills & Developer Guidelines
  Introduced specialized AI agent skills in .agents/skills/ (k8s-api-conventions and dev-rules) to guide AI coding assistants contributing to the repository (#766). Added AGENTS.md at the repo root covering project layout, build/test/lint flows, codegen rules, and GitHub Copilot/CLA guidelines (#707). Updated .github/copilot-instructions.md with Kubernetes API conventions and CLA reminders (#768).
- Enhanced Documentation and Examples
  Added comprehensive core API documentation in docs/api.md (#247) and detailed Network Policy management documentation explaining the capabilities and limitations of networkPolicyManagement in SandboxTemplate (#743). Added a new example demonstrating how to deploy the Hermes Agent (hermes-agent.nousresearch.com) inside the Kubernetes Agent Sandbox with persistent storage (volumeClaimTemplates) and custom skill injection via ConfigMaps (#774). Updated the OpenClaw sandbox example to demonstrate usage with the gVisor runtime class on GKE for enhanced sandbox isolation (#475). Added a release automation guide and updated the PR template for release notes (#748, #790).
- CI/CD and Release Automation
  Enabled an automated weekly release schedule (Thursdays at 9:00 AM UTC) using GitHub Actions workflows (#783). Migrated Gemini release note generation from static API keys to secure Vertex AI with short-lived Google Cloud IAM credentials (#783). Updated GitHub Actions dependencies (#788).
Installation
Core & Extensions
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.6/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.6/extensions.yaml
Python SDK
pip install k8s-agent-sandbox==0.4.6
Contributors
We extend our sincere thanks to all contributors to this release:
@aleks-stefanovic, @dependabot[bot], @drogovozDP, @fedebongio, @flpanbin, @janetkuo, @shrutiyam-glitch, @vicentefb, @volatilemolotov
👋 New Contributors
• @fedebongio made their first contribution in #774
Full Changelog: v0.4.5...v0.4.6
v0.4.5
🚀 Announcing Agent Sandbox v0.4.5!
We're excited to announce the release of Agent Sandbox v0.4.5! This release brings significant improvements across release automation, Python SDK capabilities, core stability, and extensive documentation, making Agent Sandbox more robust and user-friendly.
⚠️ Breaking Changes
- Python SDK Update: Upgraded GKE PodSnapshot API from v1alpha1 to v1, which requires adding the agents.x-k8s.io/sandbox-name-hash label to your PodSnapshotPolicy grouping rules. Removed support for restoring sandboxes by creating new claims from previous snapshot templates.
Key Highlights
- CI/CD and Release Automation
  A major overhaul of the release workflow introduces fully automated tagging (including graceful handling of release candidates and transitions to stable versions), enhanced release note generation with AI-powered summaries and accurate contributor listing, and robust image promotion to the k8s.io registry with PR polling. Workflow permissions have been refined, and all GitHub Actions dependencies updated for better reliability.
- Python SDK Improvements
  The Python SDK now supports the stable v1 version of the PodSnapshot API, ensuring better compatibility and introducing sandboxNameHash for snapshot grouping. A new Prometheus metric, sandbox_client_discovery_latency_ms, has been added to monitor client connection latency across different connection strategies. The SDK client now correctly accepts warmpool parameters for sandbox claim creation, resolving cross-namespace adoption issues, and the sandbox router efficiently streams large request bodies instead of buffering them.
- Enhanced Documentation and Examples
  Documentation has been significantly expanded with new guides for volumeClaimTemplates and a quickstart for the Golang client. New examples showcase dynamic scaling of SandboxWarmPool with Horizontal Pod Autoscaler (HPA) and integration with Kueue for admission control and quota management (updated to v1beta2 API). Documentation pages have been reordered and cleaned up for improved navigation, and a new PR template has been added to streamline contributions.
- Core Stability and Benchmarking
  Memory leaks in the extensions/controllers package (including SandboxClaimReconciler and SimpleSandboxQueue) have been identified and fixed through the integration of go.uber.org/goleak for robust goroutine leak detection, enhancing long-term stability. Benchmarking capabilities are improved with CSV output for easier analysis and better Boskos resource tracking ensuring consistent GCR.io image pushes.
- Policy and Security Examples
  New Kyverno policy examples have been added and hardened to prevent RBAC privilege escalation for Sandbox workloads, improving the security posture of your deployments.
Installation
Core & Extensions
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.5/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.5/extensions.yaml
Python SDK
pip install k8s-agent-sandbox==0.4.5
Contributors
We extend our sincere thanks to all contributors to this release:
@ArthurKamalov, @CodesbyUnnati, @alimx07, @dependabot, @chw120, @drogovozDP, @janetkuo, @justinsb, @moficodes, @pandaji, @realshuting, @shrutiyam-glitch, @sohanpatil, @vicentefb, @volatilemolotov
👋 New Contributors
@CodesbyUnnati made their first contribution in #684
@pandaji made their first contribution in #618
@sohanpatil made their first contribution in #715
@realshuting made their first contribution in #682
Full Changelog: v0.4.3...v0.4.5
v0.4.3
🚀 Announcing Agent Sandbox v0.4.3!
We are excited to announce the release of Agent Sandbox v0.4.3!
This release expands documentation significantly, with new guides covering filesystems, volumes, lifecycle, snapshots, metrics, custom environments, and a Python SDK quickstart — plus a reorganized Use Cases section and Go sandbox client docs. It introduces new lifecycle APIs, including a Finished condition on Sandbox and SandboxClaim and a ttlSecondsAfterFinished field for automatic cleanup of finished claims, alongside volumeClaimTemplates support for persistent storage. Warm pool correctness improves with fixes for duplicate Sandbox adoption during informer cache lag, retain-policy deletion, and cross-namespace adoption protection. Observability gains a --version flag, an agent_sandbox_build_info Prometheus metric, and more accurate startup-latency tracking. Finally, the Python SDK adds a new SandboxInClusterConnectionConfig so in-cluster clients can bypass the router — using stable cluster DNS by default, with an opt-in low-latency pod-IP mode.
Key Highlights
- Documentation Expansion: Comprehensive new docs covering filesystems, volumes, lifecycle/TTL, snapshots, metrics, custom environments, and use cases. Added a Python SDK quickstart, Go sandbox client docs, and a homepage Use Cases grid. The docs/examples/ section has been reorganized under docs/use-cases/.
- Lifecycle & Cleanup: New Finished condition on Sandbox (mirrored to SandboxClaim) reporting PodSucceeded/PodFailed. New ttlSecondsAfterFinished field on SandboxClaim.spec.lifecycle for automatic cleanup of finished claims, honoring the existing shutdownPolicy (Retain, Delete, DeleteForeground).
- Storage: Added volumeClaimTemplates support to SandboxTemplate, propagated through SandboxClaim and SandboxWarmPool to the underlying Sandbox. PVC-backed volumes use StatefulSet-style merge semantics with the pod template.
- Warm Pool Correctness: Fixed duplicate Sandbox adoption during informer cache lag by recording the adopted Sandbox name on the claim via the agents.x-k8s.io/sandbox-name label. Fixed warm-pool sandbox deletion when shutdownPolicy: Retain is set. Added cross-namespace adoption protection.
- Python SDK Enhancements: New SandboxInClusterConnectionConfig for in-cluster clients to bypass the router; it defaults to stable cluster DNS, with an opt-in use_pod_ip=True mode for low-latency direct pod connections (with cache invalidation on errors). Sandbox.status.podIPs is now exposed end-to-end.
- Observability: Added a --version flag and agent_sandbox_build_info Prometheus metric (with git version, SHA, build date, Go version, platform). Improved ControllerStartupLatency accuracy by keying observed-time entries on both name and UID, so recreated claims with the same name no longer reuse stale timestamps.
- Stability: SandboxClaim now requeues quietly (instead of erroring) when its template is missing and recovers automatically when the template is created. Sandbox controller now safely handles AlreadyExists on pod creation, refusing to adopt pods owned by other controllers.
- Testing: Go unit tests now run with -race enabled by default; new make test-e2e-race target. Added a kOps-on-GCP benchmark scenario and a resourcectl CLI for Boskos resource management.
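The ttlSecondsAfterFinished field lives under SandboxClaim.spec.lifecycle, so a JSON merge patch that sets it has the shape sketched below. Treat this as an illustration under assumptions: ttl_patch is a hypothetical helper, "my-claim" is a placeholder, the kubectl resource name "sandboxclaim" is assumed, and whether the field can be changed on an existing claim (rather than only set at creation) is not stated in these notes.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a JSON merge patch that sets spec.lifecycle.ttlSecondsAfterFinished.
# Accepts a non-negative integer without leading zeros, which keeps the
# emitted JSON number valid. (ttl_patch is a hypothetical helper name.)
ttl_patch() {
  case "${1:-}" in
    ''|*[!0-9]*|0?*)
      echo "usage: ttl_patch <seconds>" >&2
      return 1
      ;;
    *)
      printf '{"spec":{"lifecycle":{"ttlSecondsAfterFinished":%s}}}\n' "$1"
      ;;
  esac
}

# Print a patch asking for cleanup 300 seconds after the claim finishes.
ttl_patch 300

# Applying it requires cluster access, so it is shown as a comment only.
# "my-claim" is a placeholder and "sandboxclaim" is an assumed resource name:
#   kubectl patch sandboxclaim my-claim --type merge -p "$(ttl_patch 300)"
```

The same JSON shape can be written as YAML in a SandboxClaim manifest when the TTL is set at creation time.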
Installation
Core & Extensions
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.3/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.3/extensions.yaml
Python SDK
pip install k8s-agent-sandbox==0.4.3
Contributors
A huge thank you to all the contributors who made this release possible!
@angristan, @noeljackson, @vgunapati, @igooch, @chw120, @yashasvimisra2798, @bittermandel, @Oneimu, @alimx07, @aleks-stefanovic, @justinsb, @janetkuo, @dongjiang1989, @rayowang, @vicentefb, @volatilemolotov
👋 New Contributors
@vgunapati made their first contribution in #489
@alimx07 made their first contribution in #645
@bittermandel made their first contribution in #583
@angristan made their first contribution in #240
Full Changelog: v0.4.2...v0.4.3
v0.4.2
🚀 Announcing Agent Sandbox v0.4.2!
We are excited to announce the release of Agent Sandbox v0.4.2!
This release introduces significant enhancements to the Python SDK with asynchronous operations, improves controller stability through optimized warm pool management, and strengthens observability with precision latency tracking and debug endpoints. Additionally, the Python SDK expands lifecycle management capabilities with native support for suspend and resume using GKE Pod snapshots, alongside a wealth of new documentation and examples to accelerate AI agent development.
Key Highlights
- Python SDK Advancements: Introduced a new asynchronous Python client to enable non-blocking sandbox operations. The SDK now also supports specifying spec.lifecycle in SandboxClaim at creation time and provides new methods for status retrieval and environment variable injection.
- Enhanced Lifecycle & Snapshots: Implemented suspend and resume functionality in Python SDK leveraging GKE Pod snapshots, allowing for stateful sandbox persistence. Users can also configure auto-cleanup based on boolean flags for better resource management.
- Core Stability & Warm Pool Optimization: The SandboxWarmPool now automatically recreates sandboxes upon template updates. Retrieval performance is significantly improved with a new queue-based mechanism for warm sandbox adoption.
- Precision Observability & Debugging: Improved the precision of controller startup latency metrics using event predicates. A new fgprof debug endpoint has been added for advanced Off-CPU time analysis to assist in performance tuning.
- Expanded Documentation & Examples: Launched a new quickstart example and a dedicated examples section on the website. Architecture diagrams have been updated to highlight extension points, and new guides for OTel Collector deployment are now available.
Installation
Core & Extensions
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.2/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.2/extensions.yaml
Python SDK
pip install k8s-agent-sandbox==0.4.2
Contributors
A huge thank you to all the contributors who made this release possible!
@AndyJCai, @igooch, @rayowang, @xiaoj655, @vicentefb, @shrutiyam-glitch, @SHRUTI6991, @aleks-stefanovic, @dongjiang1989, @aditya-shantanu, @brandonroyal, @volatilemolotov, @Iceber, @framsouza, @Pepper-rice, @chw120, @justinsb, @janetkuo, @kincoy, @dependabot, @Oneimu, @wllbo
New Contributors
- @AndyJCai made their first contribution in #510
- @rayowang made their first contribution in #535
- @xiaoj655 made their first contribution in #371
- @framsouza made their first contribution in #477
- @Pepper-rice made their first contribution in #469
- @dependabot[bot] made their first contribution in #625
Full Changelog: v0.3.10...v0.4.2
v0.3.10
⚠️ Breaking Changes
- SandboxWarmPool now creates Sandbox CRs instead of bare Pods. Existing warm pool pods from previous versions will be orphaned and should be manually deleted (kubectl delete pods --all-namespaces -l "agents.x-k8s.io/pool,!agents.x-k8s.io/sandbox-name-hash"). This also means the Sandbox created from the SandboxWarmPool will have a random suffix in its name, instead of matching SandboxClaim.
- SandboxClaim status.Name is renamed to status.name to follow Kubernetes naming conventions.
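Before deleting the orphaned warm pool pods, it can help to list what the selector matches. The following is a minimal sketch, not an official tool: the helper name, its list/delete modes, and the KUBECTL override variable are assumptions made for illustration. Only the label selector is taken from the note above.

```shell
#!/bin/sh
# Hypothetical helper; only the label selector comes from the release note.
KUBECTL="${KUBECTL:-kubectl}"
ORPHAN_SELECTOR='agents.x-k8s.io/pool,!agents.x-k8s.io/sandbox-name-hash'

# orphaned_warm_pool_pods list   -> show matching pods without changing anything
# orphaned_warm_pool_pods delete -> delete the matching pods in all namespaces
orphaned_warm_pool_pods() {
  case "$1" in
    list)   "$KUBECTL" get pods --all-namespaces -l "$ORPHAN_SELECTOR" ;;
    delete) "$KUBECTL" delete pods --all-namespaces -l "$ORPHAN_SELECTOR" ;;
    *)      echo "usage: orphaned_warm_pool_pods list|delete" >&2; return 2 ;;
  esac
}
```

Running the list mode first and the delete mode second keeps the destructive step behind a visible check of the affected pods.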
What's Changed
- Langchain updates - Issue #344 by @aleks-stefanovic in #355
- Add a load test script file which will be called by the periodic prow job by @SHRUTI6991 in #402
- add more people as approvers to sub folders by @aditya-shantanu in #406
- perf(warmpool): implement server-side apply status patch by @vicentefb in #387
- Improving docs by @janetkuo in #423
- Added Architecture by @yashasvimisra2798 in #380
- [Part 1] Create a new Persistent Sandbox Handle by @SHRUTI6991 in #382
- Add selector field to status / scale sub-resources. by @juli4n in #417
- Add Barni as an overall approver. by @aditya-shantanu in #429
- Improve sandboxclaim controller management of multiple worker threads by @igooch in #391
- Concurrency tests: Turn up concurrency on the controllers and do E2E tests. by @aditya-shantanu in #398
- Implement is_restored_from_snapshot method in PodSnapshotSandboxClient by @shrutiyam-glitch in #415
- Document prow process managed outside of repo by @janetkuo in #434
- feat: add kube-api-linter checker by @dongjiang1989 in #421
- feat: implement agent_sandbox_creation_latency_ms metric by @chw120 in #425
- refactor: warm pool creates full Sandbox CRs instead of bare pods by @noeljackson in #395
- fix: update comment for Selector field in SandboxWarmPoolStatus by @yongruilin in #438
- Adds new logo and reference in README by @brandonroyal in #447
- fixed broken icon links on README and site by @brandonroyal in #450
- Re-Implement worker thread collision avoidance for warm pool adoption by @igooch in #437
- feat|fix: add fix-gomod presubmit to verify go modules are tidy by @yongruilin in #442
- Fixes issues with mermaid website rendering by @brandonroyal in #451
- fix: fix broken links and mermaid diagram on doc website by @yongruilin in #452
- feat: implement agent_sandboxes point-in-time metrics by @chw120 in #410
- Load test recipes with cluster loader 2. by @SHRUTI6991 in #267
- Update OpenClaw example by @janetkuo in #471
- tests: fix timing flake in TestSandboxShutdownTime by @igooch in #473
- feat(test): Add ClusterLoader2 performance test by @igooch in #468
- Change(crd): rename 'Name' to 'name' in sandboxclaim_types CRD by @dongjiang1989 in #440
- Implementation 2: Integrate Sandbox handle with SandboxClient. by @SHRUTI6991 in #467
- Add DeleteForeground shutdown policy for SandboxClaim by @yongruilin in #480
- Add documentation to run prow job locally for testing. by @SHRUTI6991 in #472
- Add aditya-shantanu as an additional approver for test + site folder by @aditya-shantanu in #495
- add log to show concurrency settings by @aditya-shantanu in #496
- add specific exceptions from RuntimeException in SandboxClient by @nadolskit in #427
- Add debug logging to SandboxClaim controller by @igooch in #498
- Fix documentation to reflect the new sandbox client update. by @SHRUTI6991 in #499
- Website Redesign by @moficodes in #200
- Website reorg + fixing all the links by @aleks-stefanovic in #257
- Minor fixes for website overview page bug by @aleks-stefanovic in #506
- feat: add claim_labels support to SandboxClient by @jabra in #492
- versionbump: google.golang.org/grpc v1.79.3 by @Oneimu in #441
- Change sort to slices package by @dongjiang1989 in #503
- Allow sandbox claim to specify whether to get pod from sandbox warm pool by @PersistentJZH in #212
- Update Chrome sandbox README with e2e usage and Sandbox CRD context by @yashasvimisra2798 in #476
- add support for AKS Sandbox via vscode-sandbox example by @devigned in #236
- Add None handling to sandbox/gateway/snapshot wait by @nadolskit in #414
- add analytics guide by @drogovozDP in #184
- feat: Add an example of using Agent Sandbox and Kata on GKE cluster by @maqiuyujoyce in #230
- feat: add KEP-0174 for label and annotation propagation to sandbox pods by @chw120 in #439
- fix: add pvc annotations/labels from volume claim tpl by @dhenkel92 in #217
- Add replicas default value in sandbox crd by @dongjiang1989 in #456
- Add a micro benchmark for using SandboxClaims with Chrome Sandbox. by @aditya-shantanu in #353
- Unify SandboxPodNameAnnotation setting by @hzxuzhonghu in #272
- tests: print object state after waiting for predicate by @justinsb in #520
- Migrate to controller-runtime v0.23+ Event Recorder API by @dongjiang1989 in #458
- feat: add PodIPs to Sandbox and SandboxClaim status by @noeljackson in #518
- feat: add Go sandbox client SDK by @wllbo in #424
- fix: verify pod ownership before operating on annotated pods by @ArmandoHerra in #419
- optimization: replace .Update() with .Patch() for sandbox updateStatus by @vicentefb in #509
- Implement list and delete snapshot functionality in Python SDK by @shrutiyam-glitch in #448
- fix(python): add backward compatibility for SandboxClaim status.sandbox.Name by @kincoy in #515
- Add ControllerStartupLatency metric for SandboxClaims by @igooch in #522
New Contributors
- @yashasvimisra2798 made their first contribution in #380
- @juli4n made their first contribution in #417
- @chw120 made their first contribution in #425
- @noeljackson made their first contribution in #395
- @brandonroyal made their first contribution in #447
- @nadolskit made their first contribution in #427
- @moficodes made their first contribution in #200
- @jabra made their first contribution in #492
- @PersistentJZH made their first contribution in #212
- @devigned made their first contribution in #236
- @drogovozDP made their first contribution in #18...
v0.2.1
🚀 Announcing Agent Sandbox v0.2.1!
We are excited to announce the release of Agent Sandbox v0.2.1!
This release introduces a major shift to a "Secure by Default" networking architecture, enforcing strict isolation for AI agents while providing a highly scalable shared policy model. Alongside these security and architectural advancements, this version strengthens observability with new telemetry metrics, enhances controller stability through a migration to the Deployment model, and expands the Python SDK capabilities with Pod Snapshots and native Kubernetes client support.
⚠️ Breaking Changes
- Controller Migration (StatefulSet to Deployment): The core controller has been migrated from a StatefulSet to a Deployment, and leader election is now enabled by default. Action Required: You must delete the existing StatefulSet before deploying the new version to avoid conflicts by running kubectl delete statefulset agent-sandbox-controller -n agent-sandbox-system (#191).
- Metrics Service Port Update: The metrics Service port has been changed from 80 to 8080 to align with standard practices and avoid traffic conflicts. Action Required: Update any custom ServiceMonitor resources or Prometheus scraping configurations to target port 8080 (#366).
- Secure-by-Default Network Isolation: SandboxTemplates that do not explicitly define a network policy now default to a strict isolation posture. This blocks access to internal cluster IPs, VPC subnets, and the node metadata server. Action Required: If your agents require access to internal services, you must explicitly define these rules in your SandboxTemplate or opt out by setting the SandboxTemplate's spec.networkPolicyManagement field to Unmanaged (#287).
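The upgrade order described above can be sketched as a small script. This is an illustration under stated assumptions, not an official upgrade procedure: the function names and the KUBECTL override variable are hypothetical, the --ignore-not-found flag is added here so a rerun does not fail, and the patch helper assumes that sandboxtemplate resolves as a resource name in your cluster. The delete command, manifest URLs, and the spec.networkPolicyManagement field come from this release's notes.

```shell
#!/bin/sh
# Sketch of the v0.2.1 upgrade order; names below are hypothetical.
KUBECTL="${KUBECTL:-kubectl}"
RELEASE_URL="https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.2.1"

upgrade_to_v0_2_1() {
  # 1. Remove the old StatefulSet-based controller before applying the Deployment.
  #    --ignore-not-found is an addition so the step is safe to repeat.
  "$KUBECTL" delete statefulset agent-sandbox-controller -n agent-sandbox-system --ignore-not-found || return 1
  # 2. Apply the core components, then the extensions.
  "$KUBECTL" apply -f "$RELEASE_URL/manifest.yaml" || return 1
  "$KUBECTL" apply -f "$RELEASE_URL/extensions.yaml"
}

# Opt a single template out of the default network isolation.
# $1 = SandboxTemplate name, $2 = namespace
unmanage_network_policy() {
  "$KUBECTL" patch sandboxtemplate "$1" -n "$2" --type merge \
    -p '{"spec":{"networkPolicyManagement":"Unmanaged"}}'
}
```

Opting out removes the default isolation for every sandbox created from that template, so defining explicit rules in the SandboxTemplate is usually the narrower choice.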
Key Highlights
- Secure by Default Networking & Scalability: Implemented a strict security baseline for all sandboxes. If no policy is specified, the controller automatically blocks access to internal cluster IPs, VPC subnets, and the node metadata server. To ensure scalability, a single shared NetworkPolicy is now managed per SandboxTemplate rather than per individual sandbox, enabling instant fleet-wide updates with minimal API overhead.
- Multi-Language SDK Advancements:
- Typed Go Client: Introduced a native Kubernetes Go client generated via client-gen, allowing Go developers to interact with Agent Sandbox resources using standard, type-safe Kubernetes patterns.
- Python SDK Advancements: Added support for GKE Pod Snapshots, enabling users to capture the state of running sandboxes. The SDK now features native Kubernetes client generation and new file management methods (list and exists).
- Improved Observability & Metrics: Introduced new metrics to track sandbox lifecycles, including agent_sandbox_claim_startup_latency_ms and agent_sandbox_claim_creation_total. Metrics and healthz container ports are now explicitly defined for better networking transparency.
- Controller Stability & Scaling: The core controller has been migrated from a StatefulSet to a Deployment for better lifecycle management. It now supports controller concurrency, configurable router timeouts, and enhanced leader election settings.
- Robust Testing Infrastructure: The test suite now uses a watch-based mechanism instead of polling for more accurate results and captures detailed logs (including kubelet and containerd) into artifacts for easier debugging. A new load test using clusterloader2 has been added to simulate high-density sandbox environments.
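To spot-check the two new claim metrics, one option is to filter the controller's Prometheus text output. The filter below is a hypothetical sketch: the function name is invented here, and the usage comment assumes the metrics endpoint is reachable locally (for example through a port-forward) at the conventional /metrics path, neither of which is confirmed by these notes. Only the two metric names come from the highlights above.

```shell
#!/bin/sh
# Hypothetical filter: reads Prometheus text-format metrics on stdin and keeps
# only series whose names start with the two claim metrics named in the notes
# (this also keeps suffixed series such as _bucket, _sum, and _count).
claim_metrics() {
  grep -E '^agent_sandbox_claim_(startup_latency_ms|creation_total)'
}

# Assumed usage, with the endpoint address and path as assumptions:
#   curl -s http://localhost:8080/metrics | claim_metrics
```

Because the pattern is anchored at the start of the line, HELP and TYPE comment lines are dropped and only sample lines remain.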
Installation
Core & Extensions
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.2.1/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.2.1/extensions.yaml
Python SDK
pip install k8s-agent-sandbox==0.2.1
Contributors
A huge thank you to all the contributors who made this release possible!
@antonipp, @mastersingh24, @SHRUTI6991, @igooch, @shrutiyam-glitch, @jkallogjeri, @justinsb, @runzhliu, @janetkuo, @vicentefb, @acsoto, @Oneimu, @sabre1041, @e-minguez, @Aliexe-code, @tp953704, @aditya-shantanu, @dongjiang1989, @tomergee, @shreyas-badiger, @esposem, @yongruilin
👋 New Contributors
- @e-minguez made their first contribution in #302
- @sabre1041 made their first contribution in #301
- @runzhliu made their first contribution in #281
- @jkallogjeri made their first contribution in #259
- @Aliexe-code made their first contribution in #332
- @tp953704 made their first contribution in #333
- @Oneimu made their first contribution in #298
- @dongjiang1989 made their first contribution in #364
- @mastersingh24 made their first contribution in #233
- @esposem made their first contribution in #377
- @shreyas-badiger made their first contribution in #374
- @yongruilin made their first contribution in #389
Full Changelog: v0.1.1...v0.2.1
v0.1.1
🚀 Announcing Agent Sandbox v0.1.1!
We are excited to announce the release of Agent Sandbox v0.1.1!
This release brings significant improvements to documentation, observability, extensibility, and stability, along with several new examples to help you get started.
Key Highlights
- New Documentation Site: We have launched a dedicated documentation site at https://agent-sandbox.sigs.k8s.io/ to make it easier to find guides and references.
- OpenTelemetry Support: Added optional OpenTelemetry tracing to both the Python client and the Controllers, improving observability for your agentic workloads.
- Enhanced Capabilities:
- Shutdown Policy: Support for configurable Sandbox/SandboxClaim shutdown policies and shutdown times.
- Extensions: Better management for extension deployments, including automount and NetworkPolicy support.
- Critical Fixes & Stability:
- gVisor Support in Python SDK: Major Python client refactor enabling full gVisor (runsc) compatibility.
- WarmPool Reliability: Fixed pod adoption logic, metadata propagation, and prioritization of "Ready" pods.
- Lifecycle Management: Resolved repeated expiry cleanup loops.
- New Examples: Explore new examples including Gemini Computer Use, ADK Agent, and a Moltbot example.
Installation
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.1/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.1/extensions.yaml
Contributors
A huge thank you to all the contributors who made this release possible!
@janetkuo, @volatilemolotov, @igooch, @antonipp, @mlgarchery, @shrutiyam-glitch, @lizzzcai, @barney-s, @sdowell, @vicentefb, @Iceber, @acsoto, @ArthurKamalov, @tomergee, @peterzhongyi, @hzxuzhonghu, @aditya-shantanu, @SHRUTI6991, @alex-akv, @bilalshaikh42, @justinsb
👋 New Contributors
- @igooch made their first contribution in #159
- @antonipp made their first contribution in #157
- @mlgarchery made their first contribution in #179
- @shrutiyam-glitch made their first contribution in #172
- @lizzzcai made their first contribution in #152
- @Iceber made their first contribution in #185
- @acsoto made their first contribution in #209
- @ArthurKamalov made their first contribution in #195
- @hzxuzhonghu made their first contribution in #222
- @aditya-shantanu made their first contribution in #218
- @SHRUTI6991 made their first contribution in #220
- @alex-akv made their first contribution in #186
- @bilalshaikh42 made their first contribution in #241
Full Changelog: v0.1.0...v0.1.1
v0.1.0
🚀 Announcing Agent Sandbox v0.1.0!
We are thrilled to announce the first official release of Agent Sandbox, v0.1.0!
This release marks a major milestone, providing a powerful and flexible platform for managing isolated, stateful, singleton workloads in Kubernetes, ideal for use cases like AI agent runtimes. With v0.1.0, you can:
- Define and manage sandboxes declaratively using the new Sandbox, SandboxTemplate, and SandboxClaim APIs.
- Run a variety of workloads in isolated environments, as demonstrated by our examples.
- Improve performance with SandboxWarmPool, allowing for faster sandbox creation.
This release is the culmination of the hard work of our contributors, and we're excited to see what you build with it!
Installation
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.0/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.0/extensions.yaml
Contributors
A huge thank you to all the contributors who made this release possible!
@janetkuo, @barney-s, @justinsb, @ameukam, @sdowell, @vicentefb, @tomergee, @flpanbin, @peterzhongyi, @YaoZengzeng, and @volatilemolotov.
Full Changelog: https://github.com/kubernetes-sigs/agent-sandbox/commits/v0.1.0