fix: protect system-reserved Pod labels and annotations from tenant override#894
Conversation
|
|
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds defenses against tenant-controlled label/annotation injection on sandbox-managed Pods, preventing cross-tenant traffic hijacking and spoofing of controller tracking metadata.
Changes:
- Documented the reserved-label/annotation injection threat and mitigations.
- Filtered system-reserved labels/annotations from user
PodTemplatemetadata and ensured the service-selector label is controller-owned. - Expanded controller tests to avoid hard-coded hashes and to cover reserved-key dropping behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| docs/security/threat_model.md | New threat model doc describing reserved-key injection risks and mitigations. |
| controllers/sandbox_controller.go | Implements system-reserved label/annotation filtering, adoption cleanup, and conditional propagation of trusted tracking labels. |
| controllers/sandbox_controller_test.go | Updates expected hashes and adds a test for reserved label/annotation dropping. |
✅ Deploy Preview for agent-sandbox canceled.
|
478b226 to
a926842
Compare
|
Thanks for the review! Addressed in the latest commit:
Also added an adoption regression test for the stale-key scrub, and fixed the |
|
Thanks for the follow-up review. Addressed both comments in
|
|
@tomergee: The label(s) DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/priority critical-urgent |
…ations A Sandbox's spec.podTemplate metadata was propagated verbatim to the backing Pod, including system-reserved keys. A tenant could set agents.x-k8s.io/sandbox-name-hash in their template to match another Sandbox's headless Service selector and hijack its traffic, or forge warm-pool tracking labels and controller-managed annotations. The controller now treats any agents.x-k8s.io/ or extensions.agents.x-k8s.io/ label/annotation key (plus the trace-context annotation) as system-reserved and filters it out of user-supplied PodTemplate metadata on both the create (reconcilePod) and adoption (updatePodMetadata) paths: - The sandbox-name-hash label is assigned after merging user labels so it can never be overridden. - Unauthorized system labels already present on an adopted Pod are removed. - Warm-pool tracking labels are only propagated for Sandboxes owned by a trusted extension controller (SandboxWarmPool/SandboxClaim), preventing spoofing via a directly-created Sandbox. Adds a security regression test, refactors TestReconcile to compute the name hash dynamically instead of a hardcoded value, and documents the threat in docs/security/threat_model.md. The FNV-1a name hash is unchanged.
- Centralize the warm-pool tracking-label key list shared by the create and adoption paths to avoid drift. - Require a controller owner reference (controller=true) under the extensions API group before propagating warm-pool tracking labels, and document the OwnerReferencesPermissionEnforcement assumption near the helper. - During adoption/update, scrub system-reserved keys that an older (vulnerable) controller may have recorded in the propagated-labels/propagated-annotations lists, keeping the controller-owned name-hash label, the allowed tracking labels on extension-managed Sandboxes, and controller-managed annotations. - Make docs/security/threat_model.md precise about the scrub behavior. - Add an adoption regression test for the stale-key scrub.
Address review feedback on warm-pool tracking label propagation: - updatePodMetadata now keeps the warm-pool tracking labels in sync instead of only ever adding them. For controller-managed Sandboxes the labels are set/updated from the Sandbox CR; otherwise (or when the Sandbox drops a tracking label) they are removed from the Pod. This prevents stale or spoofed tracking labels from persisting across ownership/label transitions. - Add tests covering the gating logic: an extension-managed Sandbox propagates the tracking labels to its Pod, a tenant-owned Sandbox with the same labels does not, and a stale tracking label is scrubbed on adoption when the Sandbox is not extension-managed.
ae8db28 to
74273a8
Compare
- Downgrade scrub logging to V(1) to avoid flooding cluster logs during large rollouts/adoptions. - Replace the mutable warmPoolTrackingLabels slice with an immutable warmPoolTrackingLabelSet map for membership checks and iteration. - Parse ownerRef APIVersion with schema.ParseGroupVersion and compare the extensions group exactly; centralize allowed owner kinds in a map.
Per review: the core Sandbox controller should not encode warm-pool/claim ownerRef logic or propagate tracking labels from Sandbox.metadata.labels. Keep the security fix only: - filter system-reserved keys from spec.podTemplate on create and update - assign sandbox-name-hash after merging user labels - scrub stale propagated system keys on adoption (except name-hash and controller-managed annotations) Remove extension-specific helpers, sync logic, and gating tests. Add a test that system labels on Sandbox.metadata are not copied to Pods. Update the threat model accordingly.
barney-s
left a comment
There was a problem hiding this comment.
Thank you for simplifying the code and removing coupling b/w core and extension controllers.
+1 to removing reserved keys from existing pod labels and annotations.
| var managedLabelKeys []string | ||
| for k, v := range sandbox.Spec.PodTemplate.ObjectMeta.Labels { | ||
| // Never let a user-supplied template set system-reserved labels. | ||
| if isSystemLabel(k) { |
There was a problem hiding this comment.
Optimization: Consider pre-allocating the managedLabelKeys slice capacity to avoid overhead from dynamic re-allocations as elements are appended inside the loop:
|
/lgtm minor comment left. don't want to hold the PR for that. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: barney-s, tomergee The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…verride (kubernetes-sigs#894) * fix: prevent tenants from overriding system-reserved Pod labels/annotations A Sandbox's spec.podTemplate metadata was propagated verbatim to the backing Pod, including system-reserved keys. A tenant could set agents.x-k8s.io/sandbox-name-hash in their template to match another Sandbox's headless Service selector and hijack its traffic, or forge warm-pool tracking labels and controller-managed annotations. The controller now treats any agents.x-k8s.io/ or extensions.agents.x-k8s.io/ label/annotation key (plus the trace-context annotation) as system-reserved and filters it out of user-supplied PodTemplate metadata on both the create (reconcilePod) and adoption (updatePodMetadata) paths: - The sandbox-name-hash label is assigned after merging user labels so it can never be overridden. - Unauthorized system labels already present on an adopted Pod are removed. - Warm-pool tracking labels are only propagated for Sandboxes owned by a trusted extension controller (SandboxWarmPool/SandboxClaim), preventing spoofing via a directly-created Sandbox. Adds a security regression test, refactors TestReconcile to compute the name hash dynamically instead of a hardcoded value, and documents the threat in docs/security/threat_model.md. The FNV-1a name hash is unchanged. * fix: address review feedback on system-label protection - Centralize the warm-pool tracking-label key list shared by the create and adoption paths to avoid drift. - Require a controller owner reference (controller=true) under the extensions API group before propagating warm-pool tracking labels, and document the OwnerReferencesPermissionEnforcement assumption near the helper. - During adoption/update, scrub system-reserved keys that an older (vulnerable) controller may have recorded in the propagated-labels/propagated-annotations lists, keeping the controller-owned name-hash label, the allowed tracking labels on extension-managed Sandboxes, and controller-managed annotations. - Make docs/security/threat_model.md precise about the scrub behavior. - Add an adoption regression test for the stale-key scrub. * Sync warm-pool tracking labels and add gating tests Address review feedback on warm-pool tracking label propagation: - updatePodMetadata now keeps the warm-pool tracking labels in sync instead of only ever adding them. For controller-managed Sandboxes the labels are set/updated from the Sandbox CR; otherwise (or when the Sandbox drops a tracking label) they are removed from the Pod. This prevents stale or spoofed tracking labels from persisting across ownership/label transitions. - Add tests covering the gating logic: an extension-managed Sandbox propagates the tracking labels to its Pod, a tenant-owned Sandbox with the same labels does not, and a stale tracking label is scrubbed on adoption when the Sandbox is not extension-managed. * Address latest Copilot review on label protection - Downgrade scrub logging to V(1) to avoid flooding cluster logs during large rollouts/adoptions. - Replace the mutable warmPoolTrackingLabels slice with an immutable warmPoolTrackingLabelSet map for membership checks and iteration. - Parse ownerRef APIVersion with schema.ParseGroupVersion and compare the extensions group exactly; centralize allowed owner kinds in a map. * Simplify label protection: drop extension coupling Per review: the core Sandbox controller should not encode warm-pool/claim ownerRef logic or propagate tracking labels from Sandbox.metadata.labels. Keep the security fix only: - filter system-reserved keys from spec.podTemplate on create and update - assign sandbox-name-hash after merging user labels - scrub stale propagated system keys on adoption (except name-hash and controller-managed annotations) Remove extension-specific helpers, sync logic, and gating tests. Add a test that system labels on Sandbox.metadata are not copied to Pods. Update the threat model accordingly. * Address review: centralize prefix check, scrub spoofed trace-context
…verride (kubernetes-sigs#894) * fix: prevent tenants from overriding system-reserved Pod labels/annotations A Sandbox's spec.podTemplate metadata was propagated verbatim to the backing Pod, including system-reserved keys. A tenant could set agents.x-k8s.io/sandbox-name-hash in their template to match another Sandbox's headless Service selector and hijack its traffic, or forge warm-pool tracking labels and controller-managed annotations. The controller now treats any agents.x-k8s.io/ or extensions.agents.x-k8s.io/ label/annotation key (plus the trace-context annotation) as system-reserved and filters it out of user-supplied PodTemplate metadata on both the create (reconcilePod) and adoption (updatePodMetadata) paths: - The sandbox-name-hash label is assigned after merging user labels so it can never be overridden. - Unauthorized system labels already present on an adopted Pod are removed. - Warm-pool tracking labels are only propagated for Sandboxes owned by a trusted extension controller (SandboxWarmPool/SandboxClaim), preventing spoofing via a directly-created Sandbox. Adds a security regression test, refactors TestReconcile to compute the name hash dynamically instead of a hardcoded value, and documents the threat in docs/security/threat_model.md. The FNV-1a name hash is unchanged. * fix: address review feedback on system-label protection - Centralize the warm-pool tracking-label key list shared by the create and adoption paths to avoid drift. - Require a controller owner reference (controller=true) under the extensions API group before propagating warm-pool tracking labels, and document the OwnerReferencesPermissionEnforcement assumption near the helper. - During adoption/update, scrub system-reserved keys that an older (vulnerable) controller may have recorded in the propagated-labels/propagated-annotations lists, keeping the controller-owned name-hash label, the allowed tracking labels on extension-managed Sandboxes, and controller-managed annotations. - Make docs/security/threat_model.md precise about the scrub behavior. - Add an adoption regression test for the stale-key scrub. * Sync warm-pool tracking labels and add gating tests Address review feedback on warm-pool tracking label propagation: - updatePodMetadata now keeps the warm-pool tracking labels in sync instead of only ever adding them. For controller-managed Sandboxes the labels are set/updated from the Sandbox CR; otherwise (or when the Sandbox drops a tracking label) they are removed from the Pod. This prevents stale or spoofed tracking labels from persisting across ownership/label transitions. - Add tests covering the gating logic: an extension-managed Sandbox propagates the tracking labels to its Pod, a tenant-owned Sandbox with the same labels does not, and a stale tracking label is scrubbed on adoption when the Sandbox is not extension-managed. * Address latest Copilot review on label protection - Downgrade scrub logging to V(1) to avoid flooding cluster logs during large rollouts/adoptions. - Replace the mutable warmPoolTrackingLabels slice with an immutable warmPoolTrackingLabelSet map for membership checks and iteration. - Parse ownerRef APIVersion with schema.ParseGroupVersion and compare the extensions group exactly; centralize allowed owner kinds in a map. * Simplify label protection: drop extension coupling Per review: the core Sandbox controller should not encode warm-pool/claim ownerRef logic or propagate tracking labels from Sandbox.metadata.labels. Keep the security fix only: - filter system-reserved keys from spec.podTemplate on create and update - assign sandbox-name-hash after merging user labels - scrub stale propagated system keys on adoption (except name-hash and controller-managed annotations) Remove extension-specific helpers, sync logic, and gating tests. Add a test that system labels on Sandbox.metadata are not copied to Pods. Update the threat model accordingly. * Address review: centralize prefix check, scrub spoofed trace-context
…verride (kubernetes-sigs#894) * fix: prevent tenants from overriding system-reserved Pod labels/annotations A Sandbox's spec.podTemplate metadata was propagated verbatim to the backing Pod, including system-reserved keys. A tenant could set agents.x-k8s.io/sandbox-name-hash in their template to match another Sandbox's headless Service selector and hijack its traffic, or forge warm-pool tracking labels and controller-managed annotations. The controller now treats any agents.x-k8s.io/ or extensions.agents.x-k8s.io/ label/annotation key (plus the trace-context annotation) as system-reserved and filters it out of user-supplied PodTemplate metadata on both the create (reconcilePod) and adoption (updatePodMetadata) paths: - The sandbox-name-hash label is assigned after merging user labels so it can never be overridden. - Unauthorized system labels already present on an adopted Pod are removed. - Warm-pool tracking labels are only propagated for Sandboxes owned by a trusted extension controller (SandboxWarmPool/SandboxClaim), preventing spoofing via a directly-created Sandbox. Adds a security regression test, refactors TestReconcile to compute the name hash dynamically instead of a hardcoded value, and documents the threat in docs/security/threat_model.md. The FNV-1a name hash is unchanged. * fix: address review feedback on system-label protection - Centralize the warm-pool tracking-label key list shared by the create and adoption paths to avoid drift. - Require a controller owner reference (controller=true) under the extensions API group before propagating warm-pool tracking labels, and document the OwnerReferencesPermissionEnforcement assumption near the helper. - During adoption/update, scrub system-reserved keys that an older (vulnerable) controller may have recorded in the propagated-labels/propagated-annotations lists, keeping the controller-owned name-hash label, the allowed tracking labels on extension-managed Sandboxes, and controller-managed annotations. - Make docs/security/threat_model.md precise about the scrub behavior. - Add an adoption regression test for the stale-key scrub. * Sync warm-pool tracking labels and add gating tests Address review feedback on warm-pool tracking label propagation: - updatePodMetadata now keeps the warm-pool tracking labels in sync instead of only ever adding them. For controller-managed Sandboxes the labels are set/updated from the Sandbox CR; otherwise (or when the Sandbox drops a tracking label) they are removed from the Pod. This prevents stale or spoofed tracking labels from persisting across ownership/label transitions. - Add tests covering the gating logic: an extension-managed Sandbox propagates the tracking labels to its Pod, a tenant-owned Sandbox with the same labels does not, and a stale tracking label is scrubbed on adoption when the Sandbox is not extension-managed. * Address latest Copilot review on label protection - Downgrade scrub logging to V(1) to avoid flooding cluster logs during large rollouts/adoptions. - Replace the mutable warmPoolTrackingLabels slice with an immutable warmPoolTrackingLabelSet map for membership checks and iteration. - Parse ownerRef APIVersion with schema.ParseGroupVersion and compare the extensions group exactly; centralize allowed owner kinds in a map. * Simplify label protection: drop extension coupling Per review: the core Sandbox controller should not encode warm-pool/claim ownerRef logic or propagate tracking labels from Sandbox.metadata.labels. Keep the security fix only: - filter system-reserved keys from spec.podTemplate on create and update - assign sandbox-name-hash after merging user labels - scrub stale propagated system keys on adoption (except name-hash and controller-managed annotations) Remove extension-specific helpers, sync logic, and gating tests. Add a test that system labels on Sandbox.metadata are not copied to Pods. Update the threat model accordingly. * Address review: centralize prefix check, scrub spoofed trace-context
Summary
A Sandbox's
spec.podTemplatemetadata was propagated verbatim to the backing Pod, including system-reserved keys. A tenant could setagents.x-k8s.io/sandbox-name-hashin their template to match another Sandbox's headless Service selector and hijack its traffic, or forge system-prefixed labels / controller-managed annotations.This PR makes the core controller treat any
agents.x-k8s.io/orextensions.agents.x-k8s.io/label/annotation key (plus the trace-context annotation) as system-reserved and filter it out of user-supplied PodTemplate metadata on both the create (reconcilePod) and adoption (updatePodMetadata) paths:sandbox-name-hashlabel is assigned after merging user labels, so it can never be overridden.propagated-labels/propagated-annotationslists are scrubbed on adoption/update.Sandbox.metadata.labelsare not copied to Pods — extension controllers own their own Sandbox CR lifecycle.Adds security regression tests and documents the threat in
docs/security/threat_model.md.Relationship to #784: #784 (merged) requires
agents.x-k8s.io/adoptable: "true"before adopting unowned Pod/Service/PVC resources. This PR closes the complementary gap where tenants could still inject system-reserved keys throughspec.podTemplate. The core controller intentionally does not add extension ownerRef or warm-pool tracking logic.Scope note: FNV-1a name hash is unchanged; SHA-256 strengthening remains a follow-up PR.
Test plan
go build ./...go vet ./controllers/...go test ./controllers/...