Releases: kubernetes-sigs/agent-sandbox

v0.5.0

@github-actions released this 24 Jun 20:57
a1a58a0

🚀 Announcing Agent Sandbox v0.5.0!

We're excited to announce the release of Agent Sandbox v0.5.0! This release marks a significant milestone with the official graduation of our APIs to v1beta1, bringing enhanced stability, critical security hardening, and a wealth of new features and improvements across the platform, client SDKs, and examples. Dive in to experience a more robust and developer-friendly Agent Sandbox.

⚠️ Breaking Changes / Action Required

  • API Group Upgrade and Deprecation (v1alpha1 to v1beta1):
    • The core and extension APIs (agents.x-k8s.io and extensions.agents.x-k8s.io) have been officially graduated from v1alpha1 to v1beta1.
    • v1alpha1 APIs are now deprecated. While multi-version CRD support is introduced with a conversion webhook, users are strongly encouraged to migrate their v1alpha1 resources to v1beta1.
    • Action Required: Update your manifests and API interactions to use apiVersion: agents.x-k8s.io/v1beta1 and apiVersion: extensions.agents.x-k8s.io/v1beta1. Refer to the API Migration Guide for detailed steps.
  • Sandbox spec.replicas Removed, spec.operatingMode Introduced:
    • The spec.replicas field has been removed from the Sandbox API and replaced with spec.operatingMode (with values Running and Suspended).
    • This is a breaking change for any automation or tools that relied on spec.replicas for scaling (e.g., kubectl scale, HorizontalPodAutoscalers, PodDisruptionBudgets).
    • Action Required: Update your Sandbox manifests and any scaling logic to use spec.operatingMode for managing Sandbox lifecycle.
  • SandboxClaim spec.templateRef Replaced by spec.warmpoolRef:
    • The SandboxClaim API no longer uses spec.templateRef or the warmpool policy field. Instead, claims must explicitly point to a SandboxWarmPool using spec.warmpoolRef.
    • To achieve a cold start without pre-warming, cluster administrators should create a SandboxWarmPool with replicas: 0 for users to reference.
    • Action Required: Update SandboxClaim manifests to reference spec.warmpoolRef pointing to an existing SandboxWarmPool resource.
  • NetworkPolicy Namespace Restriction for sandbox-router:
    • The default NetworkPolicy generated by the SandboxTemplate controller now strictly scopes ingress rules to the agent-sandbox-system namespace for the sandbox-router.
    • Action Required: If your deployments are running the sandbox-router in a namespace other than agent-sandbox-system, you must migrate and deploy the sandbox-router inside agent-sandbox-system prior to or in tandem with upgrading the controller to avoid service interruption.
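The first two action items above can be sketched as a small manifest rewrite. This is a minimal illustration in Python, not the project's migration tooling: the helper name migrate_manifest is hypothetical, and the mapping of replicas: 0 to Suspended (and any other value to Running) is an assumption inferred from the notes. The API Migration Guide remains the authoritative procedure.

```python
import copy

# API versions named in the release notes (old -> new).
API_VERSION_MAP = {
    "agents.x-k8s.io/v1alpha1": "agents.x-k8s.io/v1beta1",
    "extensions.agents.x-k8s.io/v1alpha1": "extensions.agents.x-k8s.io/v1beta1",
}


def migrate_manifest(manifest: dict) -> dict:
    """Return an updated copy of a parsed manifest (hypothetical helper)."""
    migrated = copy.deepcopy(manifest)

    # Bump the apiVersion only when it is one of the deprecated versions.
    old_version = migrated.get("apiVersion")
    if old_version in API_VERSION_MAP:
        migrated["apiVersion"] = API_VERSION_MAP[old_version]

    spec = migrated.get("spec")
    if migrated.get("kind") == "Sandbox" and isinstance(spec, dict) and "replicas" in spec:
        # Assumption: zero replicas meant "suspended", anything else "running".
        replicas = spec.pop("replicas")
        spec["operatingMode"] = "Suspended" if replicas == 0 else "Running"
    return migrated
```

The function works on already-parsed manifests (for example, dictionaries loaded with a YAML library) and leaves the input untouched, so the old and new documents can be diffed before anything is applied to a cluster. SandboxClaim manifests are not rewritten here because the SandboxWarmPool a claim should reference cannot be derived from the old fields.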

Key Highlights

Core API & Platform Stability

  • API Graduation & Multi-Version Support: Official graduation of core and extension APIs to v1beta1, including multi-version CRD support with conversion webhooks for v1alpha1 compatibility during migration (#817, #993).
  • Sandbox Lifecycle Management: Replaced spec.replicas with spec.operatingMode for more explicit control over Sandbox suspension and resume behavior (#801).
  • SandboxClaim Enhancements: SandboxClaim now uses spec.warmpoolRef for clearer warm pool association and gained printer columns for improved kubectl get visibility (#899, #984).
  • Optimized Warm Pool Operations: Enabled parallel creation and deletion of sandboxes within SandboxWarmPool controller, significantly speeding up scale operations (#798).
  • Improved Warm Pool Selection Strategy: Implemented a warm pool selection strategy that prioritizes ready sandboxes, spreads workloads across nodes, and runs in memory, reducing API server overhead (#878, #939).
  • Resource Adoption & Persistence: Fixed orphan adoption for Sandbox child resources and introduced explicit authorization for unowned resources to prevent hijacking (#944, #784).
  • Performance Improvements: Switched SandboxClaim status updates to patching (.Patch()) to reduce conflicts at scale, improving overall system performance (#508).
  • Helm Chart Enhancements: Added support for podSecurityContext, containerSecurityContext, podAnnotations, and podLabels in the controller Helm chart for better Kubernetes policy compliance and custom metadata injection (#753, #750).
  • Storage Configuration via SandboxClaim: Introduced support for volume claim templates within SandboxClaims, enabling customized persistent volumes with policy-driven merging (#960).
  • Warmpool Label Propagation: Enhanced warmpool label propagation from sandbox to pod, ensuring consistent identification across resources (#927).
  • Preserve Zero Replica Counts: Fixed an issue where zero replica counts in warmpool status were not preserved during server-side apply operations (#807).
  • Assigned Sandbox Name Storage: Switched to storing assigned Sandbox names in annotations instead of labels to bypass Kubernetes length constraints (#771).
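The length constraint behind the last item can be shown with a short check. Kubernetes limits a label value to 63 characters, while object names may be longer, so a long Sandbox name does not always fit in a label. The sketch below is illustrative only: the annotation key is a made-up placeholder, not the key the controller actually uses.

```python
import re

MAX_LABEL_VALUE_LENGTH = 63  # Kubernetes limit for a label value

# A label value is empty, or starts and ends with an alphanumeric character
# and contains only alphanumerics, '-', '_' and '.' in between.
_LABEL_VALUE_RE = re.compile(r"([A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?)?")

# Hypothetical key, used here only for illustration.
ASSIGNED_NAME_KEY = "example.com/assigned-sandbox-name"


def is_valid_label_value(value: str) -> bool:
    """Report whether a string is acceptable as a Kubernetes label value."""
    return (
        len(value) <= MAX_LABEL_VALUE_LENGTH
        and _LABEL_VALUE_RE.fullmatch(value) is not None
    )


def store_assigned_name(metadata: dict, sandbox_name: str) -> dict:
    """Record the assigned name as an annotation, which has no 63-character cap."""
    metadata.setdefault("annotations", {})[ASSIGNED_NAME_KEY] = sandbox_name
    return metadata
```

A 70-character name fails is_valid_label_value but is stored without trouble by store_assigned_name. The trade-off is that annotations cannot be used in label selectors, so lookups by assigned name have to read the annotation directly.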

Security & Hardening

  • SSRF Protections: Disabled automatic HTTP redirects in both Go and Python SDKs to prevent Server-Side Request Forgery (SSRF) vulnerabilities from untrusted sandbox workloads (#874, #816).
  • Router Security: Addressed an unauthenticated internal proxy vulnerability in the sandbox router with strict input validation and optional bearer token authentication (#755).
  • Network Policy Enhancements: Default NetworkPolicy now blocks IPv6 link-local traffic and strictly scopes ingress to the agent-sandbox-system namespace for the sandbox-router for enhanced isolation (#827, #881).
  • Build-time Injection Prevention: Sanitized git-derived version strings to prevent build-time command injection vulnerabilities (#946).
  • Regular Expression Denial of Service (ReDoS) Fix: Replaced a vulnerable regex matching function with an iterative dynamic programming approach to resolve a ReDoS vulnerability (#935).
  • Pod Metadata Protection: Protected system-reserved Pod labels and annotations from tenant override to prevent traffic hijacking or tracking label forging (#894).
  • Warm Pool Poisoning Prevention: The isAdoptable function now explicitly rejects unowned sandboxes to prevent warm pool poisoning (#875).
  • OpenTelemetry Trace Sanitization: Sanitized sandbox.command attribute in OpenTelemetry traces to prevent sensitive data exposure (#895).
  • CLI Tool Hardening: Fixed concurrency race conditions and stale PID cleanup issues in the resourcectl CLI utility, preventing data loss and arbitrary process termination (#934, #902).
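To illustrate the ReDoS item above, here is a minimal sketch of iterative dynamic programming applied to pattern matching. The notes do not say which pattern syntax the fixed function supports, so the glob-style wildcards used here ('*' for any run of characters, '?' for exactly one) and the function name are assumptions. The point is the technique: the running time is bounded by len(pattern) × len(text), with no backtracking.

```python
def wildcard_match(pattern: str, text: str) -> bool:
    """Match text against a glob-style pattern without backtracking."""
    # prev[j] is True when the pattern prefix consumed so far matches text[:j].
    prev = [True] + [False] * len(text)
    for p in pattern:
        curr = [False] * (len(text) + 1)
        if p == "*":
            # '*' may match the empty string, so the empty-text state carries over.
            curr[0] = prev[0]
            for j in range(1, len(text) + 1):
                # Either '*' matches nothing (prev[j]) or absorbs text[j-1] (curr[j-1]).
                curr[j] = prev[j] or curr[j - 1]
        else:
            for j in range(1, len(text) + 1):
                # A literal or '?' must consume exactly one character.
                curr[j] = prev[j - 1] and (p == "?" or p == text[j - 1])
        prev = curr
    return prev[len(text)]
```

A pattern such as "a*a*a*a*b" against a long run of "a" characters, which can blow up a backtracking matcher, finishes here in time proportional to the product of the two lengths.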

Client SDK & Developer Experience

  • Dynamic Timeout Propagation: SDKs now support dynamic timeout propagation to the sandbox router, ensuring long-running operations are not prematurely terminated (#857).
  • Python Async Client Cleanup: Added cleanup=True support to AsyncSandboxClient for automatic resource cleanup on program termination (#859).
  • Python additionalPodMetadata Exposure: Exposed additionalPodMetadata in the Python client for direct control over Sandbox Pod labels and annotations (#979).
  • Go Client PodIP Routing: Enabled PodIP routing in the Go client to resolve connection issues when Kubernetes DNS is unavailable for sandbox services (#910).
  • Sandbox Client Improvements: Hardened filesystem path sanitization, improved label selectors, and enabled template-verified reattachment in the Python SDK (#695).
  • PSS SDK Enhancements: Enabled restoration from dedicated snapshots and filtering by creation timestamp for the Python Snapshot SDK (#799, #732).
  • CI/CD & Tooling: Optimized CI staging builds, increased promotion timeouts, and updated pyyaml dependency for CRD sorting during release publish (#1021). Improved AI code review configuration and guidelines for Copilot and CodeRabbit (#938, #936, #947, #866).

Examples & Documentation

  • RL & Evals Example: Introduced agent-sandbox-rl, a complete Python package for multi-cluster warm-pool orchestration of RL and Evals workloads (#1000).
  • Anthropic Agents Example: Added an example for running Anthropic Managed Agents self-hosted sandboxes on GKE Agent Sandbox (#950).
  • Sandboxed Tools Enhancements: Improved sandboxed-tools examples to persist sessions and filesystem state across multiple tool calls and refactored tools into their own package (#888, #887, #877, #886).
  • MCP Server Example: Provided an example for running an MCP server inside a sandbox with persistent storage (#937).
  • AKS Kata Container Example: Added an AKS example demonstrating Kata Containers with sandbox warm pools (#839).
  • Ray Integration: Documented an example on how to run a RayJob with Agent Sandbox via direct PodIP (#868, #742).
  • Comprehensive Troubleshooting Guide: Added a detailed troubleshooting guide for debugging SDK, custom image, and cluster-level issues (#660).
  • API & NetworkPolicy Documentation: Updated documentation to reflect v1beta1 API changes, clarified NodeLocal DNS walkthrough, and expanded NetworkPolicy guidance (#867, #823, #815).
  • Issue Templates: Added structured GitHub issue templates for bug reports, feature requests, and epics, and improved their ordering (#880, #891).

Installation

Core & Extensions

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0/extensions.yaml

To upgrade from v0.4.6 to v0.5.0, please follow the detailed steps in [API Migration Guide...

v0.5.0rc1

Pre-release

@github-actions released this 08 Jun 22:47
6af1bbd

🚀 Announcing Agent Sandbox v0.5.0rc1!

We're excited to announce the release candidate of Agent Sandbox v0.5.0! This pre-release introduces major API advancements with the v1beta1 upgrade, enhanced warm pool management, critical security hardening, and expanded developer tooling.

⚠️ Pre-Release Notice

This is a Release Candidate (RC) intended for early testing, validation, and feedback by maintainers and early adopters. It is not recommended for production environments.

Warning

Upgrading existing v1alpha1 API objects to v1beta1 is not yet supported (coming soon); users must install this version in a clean environment (no pre-existing v1alpha1 CRDs or CRs).

⚠️ Breaking Changes / Action Required

  • API Group Upgrade (v1beta1) (#867): The core and extension APIs have been upgraded from v1alpha1 to v1beta1. All example manifests and documentation now reflect v1beta1.
  • SandboxClaim Specification Overhaul (#899): The spec.templateRef field in SandboxClaim has been replaced with spec.warmpoolRef to better reflect warm pool architectural integration.
  • System-Reserved Metadata Protection (#894): System-reserved Pod labels and annotations are now protected from tenant overrides to prevent privilege escalation and sandbox hijacking.

Key Highlights

  • API Evolution & Stability

    • API Graduation to v1beta1: The core Agent Sandbox API has been graduated from v1alpha1 to v1beta1, marking a significant step towards maturity and stability. This involves dropping legacy alpha schemas and updating controllers for parity.
    • Sandbox Lifecycle Management: Replaced spec.replicas with a new spec.operatingMode field (supporting Running and Suspended) to provide more explicit and granular control over Sandbox suspension and resumption. This is a breaking change.
    • SandboxClaim API Refinement: The SandboxClaim API now uses spec.warmpoolRef instead of spec.templateRef, simplifying how claims interact with warm pools. This is a breaking change that requires action.
    • Granular Sandbox Suspend Condition: Introduced an explicit Suspended condition in the Sandbox status for more accurate tracking of sandbox states, supporting future features like process freezing.
    • Sandbox Template Ref Hash Propagation: The sandbox-template-ref-hash label is now consistently propagated to SandboxTemplate resources and adopted/cold-path Sandboxes, enabling easier client-side resolution of template-to-sandbox relationships.
    • Warm Pool Eviction: Implemented warm pool eviction using Cluster Autoscaler annotations, allowing idle, un-adopted Sandboxes to be marked as safe to evict.
    • Sandbox Name Annotation: The assigned Sandbox name is now stored in an annotation instead of a label to bypass Kubernetes' 63-character length constraint.
  • Security Enhancements

    • Sandbox Router Hardening: Addressed vulnerabilities related to unauthenticated internal proxying by enforcing strict sandbox_id validation, implementing optional Bearer token authentication, and tightening NetworkPolicy scoping to agent-sandbox-system namespace.
    • Pod Metadata Protection: Prevented tenants from overriding system-reserved Pod labels and annotations (agents.x-k8s.io/, extensions.agents.x-k8s.io/), mitigating potential traffic hijacking and spoofing.
    • Resource Hijacking Prevention: Introduced explicit label authorization (agents.x-k8s.io/adoptable: "true") before Sandboxes can adopt unowned Pods, Services, and PVCs. Previously owned objects can still be adopted without this label.
    • Python SDK Security: Disabled automatic HTTP redirects in SandboxConnector to prevent Server-Side Request Forgery (SSRF) attacks and sanitized OpenTelemetry trace attributes to prevent sensitive data exposure.
    • CI/Build Security: Fixed a Python module shadowing vulnerability in CI presubmits that could lead to Remote Code Execution (RCE) and added validation for KATA_VERSION to prevent path traversal.
    • IPv6 NetworkPolicy Hardening: The default NetworkPolicy now explicitly blocks IPv6 link-local traffic (fe80::/10), preventing untrusted code from accessing local services or cloud metadata endpoints.
    • Resourcectl PID Cleanup: Fixed a logic issue in resourcectl cleanup that could lead to arbitrary process termination due to stale heartbeat PIDs.
    • Analytics Tool Hardening: Patched a security vulnerability in the examples/analytics-tool allowing bypass of command execution allow-lists.
  • Performance & Scalability

    • Parallel Warm Pool Operations: Enabled parallel creation and deletion of sandboxes in the Warm Pool controller, significantly reducing reconciliation times (up to 4.26x faster).
    • Warm Pool Selection Optimization: Optimized the NodeSpread sandbox selection strategy to run purely in-memory, drastically reducing API server overhead and improving P99 concurrent claim latency by up to 4x.
    • Claim Status Update Optimization: Switched to patching for SandboxClaim status updates to reduce conflicts and improve scalability.
    • Memory Leak Reduction: Implemented measures to catch memory leaks and reduce per-scrape allocations across controllers and clients.
  • Python & Go SDK Improvements

    • Python SDK Client Enhancements: Added support for label selectors, hardened file upload path validation, enabled template-verified reattachment, and introduced shutdown_after_seconds for ephemeral sandboxes.
    • Python SDK Snapshot Restoration: Enabled restoration from dedicated snapshots, allowing sandboxes to be reverted to specific previous states.
    • Go SDK PodIP Routing: Implemented PodIP routing to fix connection issues with local sandbox-router gateways when cluster DNS is not available.
  • Enhanced Developer Experience & Tooling

    • Standardized GitHub Issue Templates: Added structured YAML templates for bug reports, feature requests, and maintainer epics, along with a config.yml for clearer contact links.
    • AI Code Review Integration: Configured CodeRabbit for automated PR summaries and walkthroughs, and optimized Copilot instructions to align with project toolchain, linting, and review scope policies.
    • Helm Chart Flexibility: Added podAnnotations, podLabels, podSecurityContext, and containerSecurityContext options to the controller Helm chart for greater customization and compliance with cluster security policies.
    • Build System Updates: Bumped Go versions across the repository and updated GitHub Actions dependencies. The PyPI publish process was also updated to allow release candidate versions.
  • Examples & Documentation

    • Sandboxed Tools Enhancements: Refactored tools into their own package, added functionality for persisting sessions across invocations, and enabled sandboxes to stay alive over multiple tool calls for faster execution.
    • New Example Workloads: Introduced a self-contained example for running an MCP server inside a sandbox with storage persistence, an AKS example using Kata Containers with sandbox warm pools, and a RayJob integration example.
    • Comprehensive Documentation Updates: All examples and documentation have been upgraded to reflect the v1beta1 API. New guides include detailed explanations of NetworkPolicy management, NodeLocal DNS with NetworkPolicy, and utilizing Dataplane-v2 for setup.
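For the IPv6 NetworkPolicy hardening item above, the following sketch shows which addresses fall inside the blocked fe80::/10 range. It uses only Python's standard ipaddress module. The function name is hypothetical, and the check is meant as an aid when auditing egress destinations, not as a reproduction of the generated policy.

```python
import ipaddress

# The link-local range named in the release notes.
LINK_LOCAL_V6 = ipaddress.ip_network("fe80::/10")


def is_ipv6_link_local(address: str) -> bool:
    """Report whether an address string lies inside fe80::/10."""
    ip = ipaddress.ip_address(address)
    # IPv4 addresses can never be inside an IPv6 network.
    return ip.version == 6 and ip in LINK_LOCAL_V6
```

Because the prefix is 10 bits long, the range runs from fe80:: through febf:ffff:..., so an address such as febf::1 is covered while fec0::1 is not.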

Installation

Core & Extensions

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0rc1/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0rc1/extensions.yaml

Python SDK

pip install k8s-agent-sandbox==0.5.0rc1

Contributors

We extend our sincere thanks to all contributors to this release:
@aditya-shantanu, @AlexBulankou, @armistcxy, @arpitjain099, @chw120, @dependabot[bot], @hrsh1209, @ianchakeres, @janetkuo, @justinsb, @lauragalbraith, @moficodes, @mvanhorn, @patcrombie, @rainwoodman, @rmalani-nv, @ryanzhang-oss, @shaikenov, @shelwinnn, @SHRUTI6991, @shrutiyam-glitch, @tom1299, @tomergee, @vicentefb

v0.4.6

@github-actions released this 14 May 22:49
d0c124d

🚀 Announcing Agent Sandbox v0.4.6!

We're excited to announce the release of Agent Sandbox v0.4.6! This release introduces major scalability enhancements through opt-in Service management, robust developer guidance with AI agent skills, expanded API and Network Policy documentation, and new stateful AI agent examples.

⚠️ Breaking Changes / Action Required

  • Service Creation Opt-In (#775, #800): The Sandbox controller no longer creates a headless Service by default for new Sandboxes. This architectural change significantly improves cluster scalability by eliminating kube-proxy and Kubernetes DNS overhead when scaling to thousands of pods. Existing Sandboxes with an auto-provisioned Service are preserved automatically.
    • Action Required: For new Sandboxes that require an auto-provisioned headless Service, explicitly set spec.service: true. To explicitly remove an existing Service, set spec.service: false.
    • New service field: The Sandbox and SandboxTemplate specs now support a boolean service field that controls headless Service creation (default false). If the field is omitted, existing Services are left in place to avoid disruption.
    • Python SDK & Router Integration: The Python SDK and sandbox-router have been updated to support direct Pod IP routing via the X-Sandbox-Pod-IP header, bypassing Service routing overhead. The SDK gracefully recovers from API server timeouts and disables Pod IP routing if permissions are lacking (falling back to Service routing).
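The three states of the service field described above (true, false, omitted) can be summarized as a small decision function. This is a model of the documented behavior written for illustration, not the controller's code. The function name and the returned action strings are hypothetical.

```python
from typing import Optional


def desired_service_action(service: Optional[bool], service_exists: bool) -> str:
    """Return the action implied by spec.service for a Sandbox.

    service is True, False, or None when the field is omitted.
    """
    if service is True:
        # Explicit opt-in: make sure a headless Service is present.
        return "keep" if service_exists else "create"
    if service is False:
        # Explicit opt-out: remove a Service left over from an older version.
        return "delete" if service_exists else "none"
    # Field omitted: new Sandboxes get no Service, existing ones are preserved.
    return "keep" if service_exists else "none"
```

The omitted case is what keeps upgrades non-disruptive: a Sandbox created before v0.4.6 keeps its Service until spec.service: false is set explicitly.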

Key Highlights

  • Core Stability and Lifecycle Management
    Fixed an issue where the sandbox name hash (selector label) was unavailable when a sandbox was scaled down to zero replicas during suspension (#754). status.labelselector is no longer unset when replicas is 0. If the hash cannot be resolved, suspension fails gracefully with a clear error reason. Added integration tests for suspend/resume on new client instances.

  • AI Agent Skills & Developer Guidelines
    Introduced specialized AI agent skills in .agents/skills/ (k8s-api-conventions and dev-rules) to guide AI coding assistants contributing to the repository (#766). Added AGENTS.md at the repo root covering project layout, build/test/lint flows, codegen rules, and GitHub Copilot/CLA guidelines (#707). Updated .github/copilot-instructions.md with Kubernetes API conventions and CLA reminders (#768).

  • Enhanced Documentation and Examples
    Added comprehensive core API documentation in docs/api.md (#247) and detailed Network Policy management documentation explaining the capabilities and limitations of networkPolicyManagement in SandboxTemplate (#743). Added a new example demonstrating how to deploy the Hermes Agent (hermes-agent.nousresearch.com) inside the Kubernetes Agent Sandbox with persistent storage (volumeClaimTemplates) and custom skill injection via ConfigMaps (#774). Updated the OpenClaw sandbox example to demonstrate usage with the gVisor runtime class on GKE for enhanced sandbox isolation (#475). Added a release automation guide and updated the PR template for release notes (#748, #790).

  • CI/CD and Release Automation
    Enabled an automated weekly release schedule (Thursdays at 9:00 AM UTC) using GitHub Actions workflows (#783). Migrated Gemini release note generation from static API keys to secure Vertex AI with short-lived Google Cloud IAM credentials (#783). Updated GitHub Actions dependencies (#788).

Installation

Core & Extensions

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.6/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.6/extensions.yaml

Python SDK

pip install k8s-agent-sandbox==0.4.6

Contributors

We extend our sincere thanks to all contributors to this release:
@aleks-stefanovic, @dependabot[bot], @drogovozDP, @fedebongio, @flpanbin, @janetkuo, @shrutiyam-glitch, @vicentefb, @volatilemolotov

👋 New Contributors

@fedebongio made their first contribution in #774

Full Changelog: v0.4.5...v0.4.6

v0.4.5

@github-actions released this 06 May 21:13
41da075

🚀 Announcing Agent Sandbox v0.4.5!

We're excited to announce the release of Agent Sandbox v0.4.5! This release brings significant improvements across release automation, Python SDK capabilities, core stability, and extensive documentation, making Agent Sandbox more robust and user-friendly.

⚠️ Breaking Changes

  • Python SDK Update: Upgraded GKE PodSnapshot API from v1alpha1 to v1, which requires adding the agents.x-k8s.io/sandbox-name-hash label to your PodSnapshotPolicy grouping rules. Removed support for restoring sandboxes by creating new claims from previous snapshot templates.

Key Highlights

  • CI/CD and Release Automation
    A major overhaul of the release workflow introduces fully automated tagging (including graceful handling of release candidates and transitions to stable versions), enhanced release note generation with AI-powered summaries and accurate contributor listing, and robust image promotion to the k8s.io registry with PR polling. Workflow permissions have been refined, and all GitHub Actions dependencies updated for better reliability.
  • Python SDK Improvements
    The Python SDK now supports the stable v1 version of the PodSnapshot API, ensuring better compatibility and introducing sandboxNameHash for snapshot grouping. A new Prometheus metric, sandbox_client_discovery_latency_ms, has been added to monitor client connection latency across different connection strategies. The SDK client now correctly accepts warmpool parameters for sandbox claim creation, resolving cross-namespace adoption issues, and the sandbox router efficiently streams large request bodies instead of buffering them.
  • Enhanced Documentation and Examples
    Documentation has been significantly expanded with new guides for volumeClaimTemplates and a quickstart for the Golang client. New examples showcase dynamic scaling of SandboxWarmPool with Horizontal Pod Autoscaler (HPA) and integration with Kueue for admission control and quota management (updated to v1beta2 API). Documentation pages have been reordered and cleaned up for improved navigation, and a new PR template has been added to streamline contributions.
  • Core Stability and Benchmarking
    Memory leaks in the extensions/controllers package (including SandboxClaimReconciler and SimpleSandboxQueue) have been identified and fixed through the integration of go.uber.org/goleak for goroutine leak detection, enhancing long-term stability. Benchmarking capabilities are improved with CSV output for easier analysis and better Boskos resource tracking, which keeps gcr.io image pushes consistent.
  • Policy and Security Examples
    New Kyverno policy examples have been added and hardened to prevent RBAC privilege escalation for Sandbox workloads, improving the security posture of your deployments.

Installation

Core & Extensions

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.5/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.5/extensions.yaml

Python SDK

pip install k8s-agent-sandbox==0.4.5

Contributors

We extend our sincere thanks to all contributors to this release:
@ArthurKamalov, @CodesbyUnnati, @alimx07, @dependabot, @chw120, @drogovozDP, @janetkuo, @justinsb, @moficodes, @pandaji, @realshuting, @shrutiyam-glitch, @sohanpatil, @vicentefb, @volatilemolotov

👋 New Contributors

@CodesbyUnnati made their first contribution in #684
@pandaji made their first contribution in #618
@sohanpatil made their first contribution in #715
@realshuting made their first contribution in #682

Full Changelog: v0.4.3...v0.4.5

v0.4.3

@janetkuo released this 28 Apr 21:32
af42928

🚀 Announcing Agent Sandbox v0.4.3!

We are excited to announce the release of Agent Sandbox v0.4.3!

This release expands documentation significantly, with new guides covering filesystems, volumes, lifecycle, snapshots, metrics, custom environments, and a Python SDK quickstart — plus a reorganized Use Cases section and Go sandbox client docs. It introduces new lifecycle APIs, including a Finished condition on Sandbox and SandboxClaim and a ttlSecondsAfterFinished field for automatic cleanup of finished claims, alongside volumeClaimTemplates support for persistent storage. Warm pool correctness improves with fixes for duplicate Sandbox adoption during informer cache lag, retain-policy deletion, and cross-namespace adoption protection. Observability gains a --version flag, an agent_sandbox_build_info Prometheus metric, and more accurate startup-latency tracking. Finally, the Python SDK adds a new SandboxInClusterConnectionConfig so in-cluster clients can bypass the router — using stable cluster DNS by default, with an opt-in low-latency pod-IP mode.

Key Highlights

  • Documentation Expansion: Comprehensive new docs covering filesystems, volumes, lifecycle/TTL, snapshots, metrics, custom environments, and use cases. Added a Python SDK quickstart, Go sandbox client docs, and a homepage Use Cases grid. The docs/examples/ section has been reorganized under docs/use-cases/.
  • Lifecycle & Cleanup: New Finished condition on Sandbox (mirrored to SandboxClaim) reporting PodSucceeded / PodFailed. New ttlSecondsAfterFinished field on SandboxClaim.spec.lifecycle for automatic cleanup of finished claims, honoring the existing shutdownPolicy (Retain, Delete, DeleteForeground).
  • Storage: Added volumeClaimTemplates support to SandboxTemplate, propagated through SandboxClaim and SandboxWarmPool to the underlying Sandbox. PVC-backed volumes use StatefulSet-style merge semantics with the pod template.
  • Warm Pool Correctness: Fixed duplicate Sandbox adoption during informer cache lag by recording the adopted Sandbox name on the claim via the agents.x-k8s.io/sandbox-name label. Fixed warm-pool sandbox deletion when shutdownPolicy: Retain is set. Added cross-namespace adoption protection.
  • Python SDK Enhancements: New SandboxInClusterConnectionConfig for in-cluster clients to bypass the router — defaults to stable cluster DNS, with an opt-in use_pod_ip=True mode for low-latency direct pod connections (with cache invalidation on errors). Sandbox.status.podIPs is now exposed end-to-end.
  • Observability: Added a --version flag and agent_sandbox_build_info Prometheus metric (with git version, SHA, build date, Go version, platform). Improved ControllerStartupLatency accuracy by keying observed-time entries on both name and UID, so recreated claims with the same name no longer reuse stale timestamps.
  • Stability: SandboxClaim now requeues quietly (instead of erroring) when its template is missing and recovers automatically when the template is created. Sandbox controller now safely handles AlreadyExists on pod creation, refusing to adopt pods owned by other controllers.
  • Testing: Go unit tests now run with -race enabled by default; new make test-e2e-race target. Added a kOps-on-GCP benchmark scenario and a resourcectl CLI for Boskos resource management.
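The Observability item above says that observed-time entries are keyed on both name and UID. A short sketch shows why that matters. The class and method names are hypothetical and the real controller is written in Go; this Python version only illustrates the keying idea.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple


class ObservedTimes:
    """Track when each claim was first observed, keyed by (name, UID)."""

    def __init__(self) -> None:
        self._first_seen: Dict[Tuple[str, str], datetime] = {}

    def observe(self, name: str, uid: str, now: datetime) -> datetime:
        # Keep the earliest observation for this exact object instance.
        return self._first_seen.setdefault((name, uid), now)

    def startup_latency(self, name: str, uid: str, ready_at: datetime) -> timedelta:
        # Latency is measured from the first observation of this instance.
        return ready_at - self._first_seen[(name, uid)]
```

If a claim is deleted and recreated under the same name, it receives a new UID and therefore a fresh entry. Keying on the name alone would reuse the old timestamp and inflate the reported startup latency by the lifetime of the previous object.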

Installation

Core & Extensions

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.3/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.3/extensions.yaml

Python SDK

pip install k8s-agent-sandbox==0.4.3

Contributors

A huge thank you to all the contributors who made this release possible!

@angristan, @noeljackson, @vgunapati, @igooch, @chw120, @yashasvimisra2798, @bittermandel, @Oneimu, @alimx07, @aleks-stefanovic, @justinsb, @janetkuo, @dongjiang1989, @rayowang, @vicentefb, @volatilemolotov

👋 New Contributors

@vgunapati made their first contribution in #489
@alimx07 made their first contribution in #645
@bittermandel made their first contribution in #583
@angristan made their first contribution in #240

Full Changelog: v0.4.2...v0.4.3

v0.4.2

@janetkuo released this 22 Apr 18:05
a0b466a

🚀 Announcing Agent Sandbox v0.4.2!

We are excited to announce the release of Agent Sandbox v0.4.2!

This release introduces significant enhancements to the Python SDK with asynchronous operations, improves controller stability through optimized warm pool management, and strengthens observability with precision latency tracking and debug endpoints. Additionally, the Python SDK expands lifecycle management capabilities with native support for suspend and resume using GKE Pod snapshots, alongside a wealth of new documentation and examples to accelerate AI agent development.

Key Highlights

  • Python SDK Advancements: Introduced a new asynchronous Python client to enable non-blocking sandbox operations. The SDK now also supports specifying spec.lifecycle in SandboxClaim at creation time and provides new methods for status retrieval and environment variable injection.
  • Enhanced Lifecycle & Snapshots: Implemented suspend and resume functionality in Python SDK leveraging GKE Pod snapshots, allowing for stateful sandbox persistence. Users can also configure auto-cleanup based on boolean flags for better resource management.
  • Core Stability & Warm Pool Optimization: The SandboxWarmPool now automatically recreates sandboxes upon template updates. Retrieval performance is significantly improved with a new queue-based mechanism for warm sandbox adoption.
  • Precision Observability & Debugging: Improved the precision of controller startup latency metrics using event predicates. A new fgprof debug endpoint has been added for advanced Off-CPU time analysis to assist in performance tuning.
  • Expanded Documentation & Examples: Launched a new quickstart example and a dedicated examples section on the website. Architecture diagrams have been updated to highlight extension points, and new guides for OTel Collector deployment are now available.

Installation

Core & Extensions

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.2/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.4.2/extensions.yaml

Python SDK

pip install k8s-agent-sandbox==0.4.2

Contributors

A huge thank you to all the contributors who made this release possible!

@AndyJCai, @igooch, @rayowang, @xiaoj655, @vicentefb, @shrutiyam-glitch, @SHRUTI6991, @aleks-stefanovic, @dongjiang1989, @aditya-shantanu, @brandonroyal, @volatilemolotov, @Iceber, @framsouza, @Pepper-rice, @chw120, @justinsb, @janetkuo, @kincoy, @dependabot, @Oneimu, @wllbo

Full Changelog: v0.3.10...v0.4.2

v0.3.10

@justinsb justinsb released this 08 Apr 17:22
4eceb03

⚠️ Breaking Changes

  • SandboxWarmPool now creates Sandbox CRs instead of bare Pods. Existing warm pool pods from previous versions will be orphaned and should be manually deleted (kubectl delete pods --all-namespaces -l "agents.x-k8s.io/pool,!agents.x-k8s.io/sandbox-name-hash"). As a result, a Sandbox created from a SandboxWarmPool now has a random suffix in its name instead of matching the name of its SandboxClaim.
  • SandboxClaim status.Name is renamed to status.name to follow Kubernetes naming conventions.
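
The cleanup described in the first bullet can be staged so that the matching pods are reviewed before anything is deleted. The following is a minimal sketch, not an official migration script: the label selector is copied from the note above, while the DRY_RUN switch and the run helper are hypothetical names introduced here for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
set -eu

# Selector quoted from the release note: pods that carry the warm pool label
# but lack the sandbox-name-hash label.
SELECTOR='agents.x-k8s.io/pool,!agents.x-k8s.io/sandbox-name-hash'

# DRY_RUN is a hypothetical switch for this sketch. 1 (default) prints the
# commands; 0 executes them against the current kubectl context.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: list the orphaned warm pool pods so they can be reviewed.
run kubectl get pods --all-namespaces -l "$SELECTOR"

# Step 2: delete them (same command as in the release note).
run kubectl delete pods --all-namespaces -l "$SELECTOR"
```

Run it once as-is to review the commands, then rerun with DRY_RUN=0 once the listed pods look correct.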

v0.2.1

@janetkuo janetkuo released this 14 Mar 00:17
17c33dd

🚀 Announcing Agent Sandbox v0.2.1!

We are excited to announce the release of Agent Sandbox v0.2.1!

This release introduces a major shift to a "Secure by Default" networking architecture, enforcing strict isolation for AI agents while providing a highly scalable shared policy model. Alongside these security and architectural advancements, this version strengthens observability with new telemetry metrics, enhances controller stability through a migration to the Deployment model, and expands the Python SDK capabilities with Pod Snapshots and native Kubernetes client support.

⚠️ Breaking Changes

  • Controller Migration (StatefulSet to Deployment): The core controller has been migrated from a StatefulSet to a Deployment, and leader election is now enabled by default. Action Required: Before deploying the new version, delete the existing StatefulSet to avoid conflicts: kubectl delete statefulset agent-sandbox-controller -n agent-sandbox-system (#191).
  • Metrics Service Port Update: The metrics Service port has been changed from 80 to 8080 to align with standard practices and avoid traffic conflicts. Action Required: Update any custom ServiceMonitor resources or Prometheus scraping configurations to target port 8080 (#366).
  • Secure-by-Default Network Isolation: SandboxTemplates that do not explicitly define a network policy now default to a strict isolation posture. This blocks access to internal cluster IPs, VPC subnets, and the node metadata server. Action Required: If your agents require access to internal services, you must explicitly define these rules in your SandboxTemplate or opt out by setting the SandboxTemplate's spec.networkPolicyManagement field to Unmanaged (#287).
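
The controller migration above implies an ordering: remove the old StatefulSet first, then apply the new manifest. The following is a minimal sketch of that sequence under stated assumptions: the StatefulSet name, namespace, and manifest URL come from these notes, while the DRY_RUN switch and the run helper are hypothetical names used only in this example. The final step lists Deployments in the namespace rather than naming one, because the notes do not state the name of the new Deployment.

```shell
#!/bin/sh
set -eu

VERSION="v0.2.1"
BASE_URL="https://github.com/kubernetes-sigs/agent-sandbox/releases/download/${VERSION}"

# DRY_RUN is a hypothetical switch for this sketch. 1 (default) prints the
# commands; 0 executes them against the current kubectl context.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: delete the old StatefulSet. --ignore-not-found makes this a no-op
# on clusters that never ran the StatefulSet-based controller.
run kubectl delete statefulset agent-sandbox-controller \
  -n agent-sandbox-system --ignore-not-found

# Step 2: install the core components for the new version.
run kubectl apply -f "${BASE_URL}/manifest.yaml"

# Step 3: confirm that a controller Deployment now exists in the namespace.
run kubectl get deployments -n agent-sandbox-system
```

Rerun with DRY_RUN=0 to execute the commands after reviewing the printed sequence.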

Key Highlights

  • Secure by Default Networking & Scalability: Implemented a strict security baseline for all sandboxes. If no policy is specified, the controller automatically blocks access to internal cluster IPs, VPC subnets, and the node metadata server. To ensure scalability, a single shared NetworkPolicy is now managed per SandboxTemplate rather than per individual sandbox, enabling instant fleet-wide updates with minimal API overhead.
  • Multi-Language SDK Advancements:
    • Typed Go Client: Introduced a native Kubernetes Go client generated via client-gen, allowing Go developers to interact with Agent Sandbox resources using standard, type-safe Kubernetes patterns.
    • Python SDK Advancements: Added support for GKE Pod Snapshots, enabling users to capture the state of running sandboxes. The SDK now features native Kubernetes client generation and new file management methods (list and exists).
  • Improved Observability & Metrics: Introduced new metrics to track sandbox lifecycles, including agent_sandbox_claim_startup_latency_ms and agent_sandbox_claim_creation_total. Metrics and healthz container ports are now explicitly defined for better networking transparency.
  • Controller Stability & Scaling: The core controller has been migrated from a StatefulSet to a Deployment for better lifecycle management. It now supports controller concurrency, configurable router timeouts, and enhanced leader election settings.
  • Robust Testing Infrastructure: The test suite now uses a watch-based mechanism instead of polling for more accurate results and captures detailed logs (including kubelet and containerd) into artifacts for easier debugging. A new load test using clusterloader2 has been added to simulate high-density sandbox environments.

Installation

Core & Extensions

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.2.1/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.2.1/extensions.yaml

Python SDK

pip install k8s-agent-sandbox==0.2.1

Contributors

A huge thank you to all the contributors who made this release possible!

@antonipp, @mastersingh24, @SHRUTI6991, @igooch, @shrutiyam-glitch, @jkallogjeri, @justinsb, @runzhliu, @janetkuo, @vicentefb, @acsoto, @Oneimu, @sabre1041, @e-minguez, @Aliexe-code, @tp953704, @aditya-shantanu, @dongjiang1989, @tomergee, @shreyas-badiger, @esposem, @yongruilin

Full Changelog: v0.1.1...v0.2.1

v0.1.1

@janetkuo janetkuo released this 04 Feb 21:32
c179034

🚀 Announcing Agent Sandbox v0.1.1!

We are excited to announce the release of Agent Sandbox v0.1.1!

This release brings significant improvements to documentation, observability, extensibility, and stability, along with several new examples to help you get started.

Key Highlights

  • New Documentation Site: We have launched a dedicated documentation site at https://agent-sandbox.sigs.k8s.io/ to make it easier to find guides and references.
  • OpenTelemetry Support: Added optional OpenTelemetry tracing to both the Python client and the Controllers, improving observability for your agentic workloads.
  • Enhanced Capabilities:
    • Shutdown Policy: Support for configurable Sandbox/SandboxClaim shutdown policies and shutdown times.
    • Extensions: Better management for extension deployments, including automount and NetworkPolicy support.
  • Critical Fixes & Stability:
    • gVisor Support in Python SDK: Major Python client refactor enabling full gVisor (runsc) compatibility.
    • WarmPool Reliability: Fixed pod adoption logic, metadata propagation, and prioritization of "Ready" pods.
    • Lifecycle Management: Resolved repeated expiry cleanup loops.
  • New Examples: Explore new examples including Gemini Computer Use, ADK Agent, and a Moltbot example.

Installation

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.1/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.1/extensions.yaml

Contributors

A huge thank you to all the contributors who made this release possible!

@janetkuo, @volatilemolotov, @igooch, @antonipp, @mlgarchery, @shrutiyam-glitch, @lizzzcai, @barney-s, @sdowell, @vicentefb, @Iceber, @acsoto, @ArthurKamalov, @tomergee, @peterzhongyi, @hzxuzhonghu, @aditya-shantanu, @SHRUTI6991, @alex-akv, @bilalshaikh42, @justinsb

Full Changelog: v0.1.0...v0.1.1

v0.1.0

@janetkuo janetkuo released this 07 Nov 23:16
ba768e6

🚀 Announcing Agent Sandbox v0.1.0!

We are thrilled to announce the first official release of Agent Sandbox, v0.1.0!

This release marks a major milestone, providing a powerful and flexible platform for managing isolated, stateful, singleton workloads in Kubernetes, ideal for use cases like AI agent runtimes. With v0.1.0, you can:

  • Define and manage sandboxes declaratively using the new Sandbox, SandboxTemplate, and SandboxClaim APIs.
  • Run a variety of workloads in isolated environments, as demonstrated by our examples.
  • Improve performance with SandboxWarmPool, allowing for faster sandbox creation.

This release is the culmination of the hard work of our contributors, and we're excited to see what you build with it!

Installation

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.0/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.0/extensions.yaml

Contributors

A huge thank you to all the contributors who made this release possible!

@janetkuo, @barney-s, @justinsb, @ameukam, @sdowell, @vicentefb, @tomergee, @flpanbin, @peterzhongyi, @YaoZengzeng, and @volatilemolotov.

Full Changelog: https://github.com/kubernetes-sigs/agent-sandbox/commits/v0.1.0