fix: protect system-reserved Pod labels and annotations from tenant override by tomergee · Pull Request #894 · kubernetes-sigs/agent-sandbox

tomergee · 2026-05-29T22:42:23Z

Summary

A Sandbox's spec.podTemplate metadata was propagated verbatim to the backing Pod, including system-reserved keys. A tenant could set agents.x-k8s.io/sandbox-name-hash in their template to match another Sandbox's headless Service selector and hijack its traffic, or forge system-prefixed labels / controller-managed annotations.

This PR makes the core controller treat any agents.x-k8s.io/ or extensions.agents.x-k8s.io/ label/annotation key (plus the trace-context annotation) as system-reserved and filter it out of user-supplied PodTemplate metadata on both the create (reconcilePod) and adoption (updatePodMetadata) paths:

- The sandbox-name-hash label is assigned after merging user labels, so it can never be overridden.
- Stale system-reserved keys that an older controller recorded in the propagated-labels / propagated-annotations lists are scrubbed on adoption/update.
- System labels on Sandbox.metadata.labels are not copied to Pods; extension controllers own their own Sandbox CR lifecycle.
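
The filter-then-assign ordering described above can be illustrated with a minimal sketch. It is not the controller's actual code: the helper names `isSystemKey` and `buildPodLabels` are hypothetical, only the two prefixes and the sandbox-name-hash key come from the description, and the trace-context annotation is left out because its key is not given here.

```go
package main

import (
	"fmt"
	"strings"
)

// The prefixes and the name-hash key are taken from the PR description.
// Everything else in this file is a hypothetical illustration.
const (
	corePrefix       = "agents.x-k8s.io/"
	extensionsPrefix = "extensions.agents.x-k8s.io/"
	nameHashLabel    = "agents.x-k8s.io/sandbox-name-hash"
)

// isSystemKey reports whether a label or annotation key is system-reserved.
func isSystemKey(key string) bool {
	return strings.HasPrefix(key, corePrefix) || strings.HasPrefix(key, extensionsPrefix)
}

// buildPodLabels copies user template labels, drops reserved keys, and sets
// the name-hash label last so the controller's value always wins.
func buildPodLabels(templateLabels map[string]string, nameHash string) map[string]string {
	out := make(map[string]string, len(templateLabels)+1)
	for k, v := range templateLabels {
		if isSystemKey(k) {
			continue // never let a user-supplied template set a reserved key
		}
		out[k] = v
	}
	out[nameHashLabel] = nameHash
	return out
}

func main() {
	labels := buildPodLabels(map[string]string{
		"app":                               "demo",
		nameHashLabel:                       "victim-hash",
		"extensions.agents.x-k8s.io/forged": "x",
	}, "own-hash")
	fmt.Println(labels)
}
```

Assigning the hash after the merge means the result stays correct even if the filter were ever bypassed: the last write to the key is the controller's.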

Adds security regression tests and documents the threat in docs/security/threat_model.md.

Relationship to #784: #784 (merged) requires agents.x-k8s.io/adoptable: "true" before adopting unowned Pod/Service/PVC resources. This PR closes the complementary gap where tenants could still inject system-reserved keys through spec.podTemplate. The core controller intentionally does not add extension ownerRef or warm-pool tracking logic.

Scope note: FNV-1a name hash is unchanged; SHA-256 strengthening remains a follow-up PR.
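
For context on the scope note, here is a rough sketch of the two hashing options using only the Go standard library. The 32-bit width, the hex encoding, and the function names `fnvNameHash` and `sha256NameHash` are assumptions for illustration; the controller's real hash format is not shown on this page.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash/fnv"
)

// fnvNameHash returns the 32-bit FNV-1a hash of name as 8 hex digits.
// Width and encoding are assumptions, not taken from the controller.
func fnvNameHash(name string) string {
	h := fnv.New32a()
	h.Write([]byte(name)) // hash.Hash.Write never returns an error
	return fmt.Sprintf("%08x", h.Sum32())
}

// sha256NameHash is a hypothetical stronger variant: the SHA-256 digest of
// name, hex-encoded and truncated to at most n characters.
func sha256NameHash(name string, n int) string {
	sum := sha256.Sum256([]byte(name))
	s := hex.EncodeToString(sum[:])
	if n < 0 || n > len(s) {
		n = len(s)
	}
	return s[:n]
}

func main() {
	fmt.Println(fnvNameHash("my-sandbox"))
	fmt.Println(sha256NameHash("my-sandbox", 16))
}
```

A 32-bit non-cryptographic hash makes collisions between Sandbox names practical to find, which is presumably why strengthening it is tracked as a follow-up.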

Test plan

go build ./...
go vet ./controllers/...
go test ./controllers/...
CI presubmits

linux-foundation-easycla · 2026-05-29T22:42:30Z

The committers listed above are authorized under a signed CLA.

✅ login: tomergee / name: Tomer Glottmann (a926842)

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds defenses against tenant-controlled label/annotation injection on sandbox-managed Pods, preventing cross-tenant traffic hijacking and spoofing of controller tracking metadata.

Changes:

Documented the reserved-label/annotation injection threat and mitigations.
Filtered system-reserved labels/annotations from user PodTemplate metadata and ensured the service-selector label is controller-owned.
Expanded controller tests to avoid hard-coded hashes and to cover reserved-key dropping behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File	Description
docs/security/threat_model.md	New threat model doc describing reserved-key injection risks and mitigations.
controllers/sandbox_controller.go	Implements system-reserved label/annotation filtering, adoption cleanup, and conditional propagation of trusted tracking labels.
controllers/sandbox_controller_test.go	Updates expected hashes and adds a test for reserved label/annotation dropping.

netlify · 2026-05-29T22:45:14Z

✅ Deploy Preview for agent-sandbox canceled.

Name	Link
🔨 Latest commit	`06b0bd4`
🔍 Latest deploy log	https://app.netlify.com/projects/agent-sandbox/deploys/6a1a70a16eaf9f0008111afe

tomergee · 2026-05-29T23:11:19Z

Thanks for the review! Addressed in the latest commit:

- Stale system labels/annotations not scrubbed (cleanup loops): during adoption/update we now scrub system-reserved keys recorded in the propagated-labels / propagated-annotations lists by an older controller, keeping only the controller-owned name-hash label, the allowed tracking labels on extension-managed Sandboxes, and controller-managed annotations (propagated-labels, propagated-annotations, trace-context). I scoped this to the propagated lists rather than a blanket scrub of all Pod labels, to preserve the existing contract (covered by reconcilePod deletes label and annotation removed from sandbox) that updatePodMetadata only manages keys it propagated.
- Weak ownerRef trust signal: the gate now requires a controller owner reference (controller=true) under the extensions API group, and I documented the OwnerReferencesPermissionEnforcement assumption near the helper. Verifying the owner object's existence/UID would harden it further and can be a follow-up.
- Duplicated tracking-label list: centralized into a single package-level warmPoolTrackingLabels used by both the create and adoption paths (and isAllowedSystemLabel).
- threat_model.md overclaim: reworded to describe the actual scrub behavior precisely.
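
The scoping decision in the first item (scrub only keys named in the propagated lists, not every Pod label) might look roughly like the sketch below. It assumes the propagated list has already been decoded into a slice; the storage format, the helper names, and the omission of the tracking-label allowance are all simplifications for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// nameHashLabel comes from the PR description; the helpers are hypothetical.
const nameHashLabel = "agents.x-k8s.io/sandbox-name-hash"

// isSystemKey reports whether key carries one of the reserved prefixes.
func isSystemKey(key string) bool {
	return strings.HasPrefix(key, "agents.x-k8s.io/") ||
		strings.HasPrefix(key, "extensions.agents.x-k8s.io/")
}

// scrubPropagated deletes from podLabels every system-reserved key named in
// the propagated list, except the controller-owned name hash. Labels that
// are not in the list are left untouched. It returns the list without the
// scrubbed keys.
func scrubPropagated(podLabels map[string]string, propagated []string) []string {
	kept := make([]string, 0, len(propagated))
	for _, k := range propagated {
		if isSystemKey(k) && k != nameHashLabel {
			delete(podLabels, k)
			continue
		}
		kept = append(kept, k)
	}
	return kept
}

func main() {
	pod := map[string]string{
		"app":                      "demo",
		"agents.x-k8s.io/forged":   "x",
		"agents.x-k8s.io/unlisted": "y",
		nameHashLabel:              "h",
	}
	kept := scrubPropagated(pod, []string{"app", "agents.x-k8s.io/forged"})
	fmt.Println(kept, pod)
}
```

The label `agents.x-k8s.io/unlisted` survives in this sketch because it was never recorded as propagated, which mirrors the stated contract that updatePodMetadata only manages keys it propagated.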

Also added an adoption regression test for the stale-key scrub, and fixed the gofmt issues behind the failing lint-go / autogen checks.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

tomergee · 2026-05-29T23:26:10Z

Thanks for the follow-up review. Addressed both comments in ae8db28:

Tracking labels now kept in sync (not just added): updatePodMetadata iterates warmPoolTrackingLabels unconditionally. For controller-managed Sandboxes it sets/updates each label from the Sandbox CR and deletes it when the Sandbox no longer carries it; for non-extension-managed Sandboxes it removes any of these keys from the Pod. This prevents stale or spoofed tracking labels from persisting across ownership/label transitions.
Added gating tests: new TestReconcilePod cases verify (1) an extension-owned Sandbox propagates the tracking labels to its Pod, (2) a tenant-owned Sandbox with the same labels does not, and (3) a stale tracking label is scrubbed on adoption when the Sandbox is not extension-managed.

gofmt, go vet, and the full ./controllers/... suite all pass locally.

k8s-ci-robot · 2026-05-29T23:36:44Z

@tomergee: The label(s) priority/critical=urgent cannot be applied, because the repository doesn't have them.

Details

In response to this:

/priority critical=urgent

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

tomergee · 2026-05-29T23:37:09Z

/priority critical-urgent

…ations

A Sandbox's spec.podTemplate metadata was propagated verbatim to the backing Pod, including system-reserved keys. A tenant could set agents.x-k8s.io/sandbox-name-hash in their template to match another Sandbox's headless Service selector and hijack its traffic, or forge warm-pool tracking labels and controller-managed annotations.

The controller now treats any agents.x-k8s.io/ or extensions.agents.x-k8s.io/ label/annotation key (plus the trace-context annotation) as system-reserved and filters it out of user-supplied PodTemplate metadata on both the create (reconcilePod) and adoption (updatePodMetadata) paths:

- The sandbox-name-hash label is assigned after merging user labels so it can never be overridden.
- Unauthorized system labels already present on an adopted Pod are removed.
- Warm-pool tracking labels are only propagated for Sandboxes owned by a trusted extension controller (SandboxWarmPool/SandboxClaim), preventing spoofing via a directly-created Sandbox.

Adds a security regression test, refactors TestReconcile to compute the name hash dynamically instead of a hardcoded value, and documents the threat in docs/security/threat_model.md. The FNV-1a name hash is unchanged.

- Centralize the warm-pool tracking-label key list shared by the create and adoption paths to avoid drift.
- Require a controller owner reference (controller=true) under the extensions API group before propagating warm-pool tracking labels, and document the OwnerReferencesPermissionEnforcement assumption near the helper.
- During adoption/update, scrub system-reserved keys that an older (vulnerable) controller may have recorded in the propagated-labels/propagated-annotations lists, keeping the controller-owned name-hash label, the allowed tracking labels on extension-managed Sandboxes, and controller-managed annotations.
- Make docs/security/threat_model.md precise about the scrub behavior.
- Add an adoption regression test for the stale-key scrub.

Address review feedback on warm-pool tracking label propagation:

- updatePodMetadata now keeps the warm-pool tracking labels in sync instead of only ever adding them. For controller-managed Sandboxes the labels are set/updated from the Sandbox CR; otherwise (or when the Sandbox drops a tracking label) they are removed from the Pod. This prevents stale or spoofed tracking labels from persisting across ownership/label transitions.
- Add tests covering the gating logic: an extension-managed Sandbox propagates the tracking labels to its Pod, a tenant-owned Sandbox with the same labels does not, and a stale tracking label is scrubbed on adoption when the Sandbox is not extension-managed.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

- Downgrade scrub logging to V(1) to avoid flooding cluster logs during large rollouts/adoptions.
- Replace the mutable warmPoolTrackingLabels slice with an immutable warmPoolTrackingLabelSet map for membership checks and iteration.
- Parse ownerRef APIVersion with schema.ParseGroupVersion and compare the extensions group exactly; centralize allowed owner kinds in a map.

Per review: the core Sandbox controller should not encode warm-pool/claim ownerRef logic or propagate tracking labels from Sandbox.metadata.labels. Keep the security fix only:

- filter system-reserved keys from spec.podTemplate on create and update
- assign sandbox-name-hash after merging user labels
- scrub stale propagated system keys on adoption (except name-hash and controller-managed annotations)

Remove extension-specific helpers, sync logic, and gating tests. Add a test that system labels on Sandbox.metadata are not copied to Pods. Update the threat model accordingly.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

aditya-shantanu

/lgtm

barney-s

Thank you for simplifying the code and removing the coupling between the core and extension controllers.
+1 to removing reserved keys from existing pod labels and annotations.

barney-s · 2026-06-02T06:42:25Z

 	var managedLabelKeys []string
 	for k, v := range sandbox.Spec.PodTemplate.ObjectMeta.Labels {
+		// Never let a user-supplied template set system-reserved labels.
+		if isSystemLabel(k) {


Optimization: Consider pre-allocating the managedLabelKeys slice capacity to avoid overhead from dynamic re-allocations as elements are appended inside the loop.
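
The reviewer's suggested snippet is not preserved on this page. As a rough sketch of the idea only, pre-allocation could look like the following; the body of `isSystemLabel`, the `managedKeys` wrapper, and the final sort are assumptions added for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isSystemLabel appears in the diff above; this body is an assumed stand-in.
func isSystemLabel(key string) bool {
	return strings.HasPrefix(key, "agents.x-k8s.io/") ||
		strings.HasPrefix(key, "extensions.agents.x-k8s.io/")
}

// managedKeys collects the non-reserved label keys of a template.
func managedKeys(templateLabels map[string]string) []string {
	// The map size is an upper bound on the result, so one allocation is
	// enough; filtered keys simply leave part of the capacity unused.
	keys := make([]string, 0, len(templateLabels))
	for k := range templateLabels {
		if isSystemLabel(k) {
			continue
		}
		keys = append(keys, k)
	}
	sort.Strings(keys) // added here only to make the output deterministic
	return keys
}

func main() {
	fmt.Println(managedKeys(map[string]string{
		"b":                 "1",
		"a":                 "2",
		"agents.x-k8s.io/x": "3",
	}))
}
```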

barney-s · 2026-06-02T07:18:21Z

/lgtm
/approve

minor comment left. don't want to hold the PR for that.

k8s-ci-robot · 2026-06-02T07:18:29Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: barney-s, tomergee

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [barney-s]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…verride (kubernetes-sigs#894)

* fix: prevent tenants from overriding system-reserved Pod labels/annotations
* fix: address review feedback on system-label protection
* Sync warm-pool tracking labels and add gating tests
* Address latest Copilot review on label protection
* Simplify label protection: drop extension coupling
* Address review: centralize prefix check, scrub spoofed trace-context

Copilot AI review requested due to automatic review settings May 29, 2026 22:42

github-project-automation Bot added this to Agent Sandbox May 29, 2026

github-project-automation Bot moved this to Backlog in Agent Sandbox May 29, 2026

k8s-ci-robot requested review from aditya-shantanu and janetkuo May 29, 2026 22:42

k8s-ci-robot added the cncf-cla: no (Indicates the PR's author has not signed the CNCF CLA.) and size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) labels May 29, 2026

Copilot AI reviewed May 29, 2026

Comment thread controllers/sandbox_controller.go Outdated

Comment thread controllers/sandbox_controller.go

Comment thread controllers/sandbox_controller.go Outdated

Comment thread controllers/sandbox_controller.go Outdated

Comment thread docs/security/threat_model.md Outdated

tomergee force-pushed the pr-793-label-protection branch from 478b226 to a926842 Compare May 29, 2026 22:51

k8s-ci-robot added the cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.) label and removed the cncf-cla: no (Indicates the PR's author has not signed the CNCF CLA.) label May 29, 2026

janetkuo added the action-required: resolve-copilot-comments label May 29, 2026

Copilot AI review requested due to automatic review settings May 29, 2026 23:10

Copilot AI reviewed May 29, 2026

Comment thread controllers/sandbox_controller.go Outdated

Comment thread controllers/sandbox_controller.go Outdated

k8s-ci-robot added the size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.) label and removed the size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) label May 29, 2026

k8s-ci-robot added the priority/critical-urgent (Highest priority. Must be actively worked on as someone's top priority right now.) label May 29, 2026

janetkuo added ready-for-review and removed action-required: resolve-copilot-comments labels May 30, 2026

tomergee added 3 commits May 29, 2026 17:06

Copilot AI review requested due to automatic review settings May 30, 2026 00:09

tomergee force-pushed the pr-793-label-protection branch from ae8db28 to 74273a8 Compare May 30, 2026 00:09

Copilot AI reviewed May 30, 2026

Comment thread controllers/sandbox_controller.go

Comment thread controllers/sandbox_controller.go Outdated

Comment thread controllers/sandbox_controller.go Outdated

tomergee added 2 commits May 29, 2026 17:47

Copilot AI review requested due to automatic review settings May 30, 2026 04:49

k8s-ci-robot added the size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) label and removed the size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.) label May 30, 2026

Copilot AI reviewed May 30, 2026

Comment thread controllers/sandbox_controller.go

Comment thread controllers/sandbox_controller.go

Comment thread controllers/sandbox_controller_test.go

Address review: centralize prefix check, scrub spoofed trace-context

06b0bd4

aditya-shantanu reviewed Jun 1, 2026

k8s-ci-robot assigned aditya-shantanu Jun 1, 2026

k8s-ci-robot added the lgtm ("Looks good to me", indicates that a PR is ready to be merged.) label Jun 1, 2026

igooch reviewed Jun 1, 2026

Comment thread docs/security/threat_model.md

barney-s approved these changes Jun 2, 2026

k8s-ci-robot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files.) label Jun 2, 2026

k8s-ci-robot assigned barney-s Jun 2, 2026

k8s-ci-robot merged commit fe75295 into kubernetes-sigs:main Jun 2, 2026
11 checks passed

github-project-automation Bot moved this from Backlog to Done in Agent Sandbox Jun 2, 2026

k8s-ci-robot merged 6 commits into kubernetes-sigs:main from tomergee:pr-793-label-protection

7 participants
