Use a Deployment instead of a StatefulSet for the controller #191

Merged
k8s-ci-robot merged 1 commit into kubernetes-sigs:main from antonipp:ai/sts-to-deployment
Feb 24, 2026

Conversation

@antonipp

@antonipp antonipp commented Nov 27, 2025

Contributor

Description

While trying out the agent-sandbox project, I noticed that the main controller is deployed as a StatefulSet. I couldn't find a justification for this in the commit history, so it looks like an oversight. The controller is stateless and doesn't require a stable network identity, so I don't see a reason to use a heavier abstraction here. Moreover, the StatefulSet's ordered rollout semantics are unnecessary for a stateless controller and could complicate scaling and updates if leader election is enabled for HA.

This PR switches the controller to a plain Deployment, which I think will make operations easier for everyone.

Migration Guide

The controller changed from a StatefulSet to a Deployment, and leader election is now enabled by default. Before you deploy the new charts, you need to clean up the existing StatefulSet (this will cause a brief disruption to Sandbox state reconciliation):

kubectl delete statefulset agent-sandbox-controller -n agent-sandbox-system

Then deploy the new charts. Verify that the Deployment has been properly created:

kubectl get deployment agent-sandbox-controller -n agent-sandbox-system

If needed, the rollback steps are:

kubectl delete deployment agent-sandbox-controller -n agent-sandbox-system
kubectl apply -f <previous-version-manifest.yaml>
@k8s-ci-robot added the "cncf-cla: yes" label (Indicates the PR's author has signed the CNCF CLA.) Nov 27, 2025
@netlify

netlify Bot commented Nov 27, 2025


Deploy Preview for agent-sandbox canceled.

Name Link
🔨 Latest commit 91e6a31
🔍 Latest deploy log https://app.netlify.com/projects/agent-sandbox/deploys/69957f4a11fea40008faea52
@k8s-ci-robot added the "needs-ok-to-test" label (Indicates a PR that requires an org member to verify it is safe to test.) Nov 27, 2025
@k8s-ci-robot

Contributor

Hi @antonipp. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the "size/S" label (Denotes a PR that changes 10-29 lines, ignoring generated files.) Nov 27, 2025
@janetkuo

janetkuo commented Dec 2, 2025

Member

/ok-to-test

@k8s-ci-robot added the "ok-to-test" label (Indicates a non-member PR verified by an org member that is safe to test.) and removed the "needs-ok-to-test" label (Indicates a PR that requires an org member to verify it is safe to test.) Dec 2, 2025
Comment thread k8s/controller.yaml
@barney-s

barney-s commented Dec 2, 2025

Collaborator

This is an acceptable change. We can switch back to a StatefulSet later if needed. Would you check whether the docs need to be updated?

@barney-s barney-s left a comment

Collaborator

Thanks for the changes

Comment thread k8s/controller.yaml
Comment thread k8s/controller.yaml
Comment thread examples/python-runtime-sandbox/run-test-kind.sh Outdated
@antonipp force-pushed the ai/sts-to-deployment branch from aa548c2 to 1571613 on December 3, 2025 12:27
@antonipp

antonipp commented Dec 3, 2025

Contributor Author

/retest

@janetkuo

Member

/label release-note-action-required

@k8s-ci-robot

Contributor

@janetkuo: The label(s) /label release-note-action-required cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor, ci-short, ci-extended, ci-full. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

Details

In response to this:

/label release-note-action-required

@janetkuo added the "release-note-action-required" label (Denotes a PR that introduces potentially breaking changes that require user action.) Dec 23, 2025
Comment thread k8s/controller.yaml
Comment thread examples/python-runtime-sandbox/run-test-kind.sh
Comment thread examples/python-runtime-sandbox/run-test-kind.sh
Comment thread examples/python-runtime-sandbox/run-test-kind.sh
Comment thread k8s/controller.yaml
@janetkuo

janetkuo commented Feb 3, 2026

Member

/retest

@janetkuo

janetkuo commented Feb 3, 2026

Member

@antonipp it seems that the e2e tests are broken. Would you fix them?

@antonipp force-pushed the ai/sts-to-deployment branch from 8fbc2d5 to 72fef4b on February 3, 2026 10:35
@antonipp

antonipp commented Feb 3, 2026

Contributor Author

Right, looks like there was an issue reconciling an existing Service:

    timed out waiting for object: ValidateObject *v1alpha1.Sandbox (sandbox-shutdown-test-1770086557148676937/my-sandbox): unexpected sandbox status (-want,+got):
      v1alpha1.SandboxStatus{
    -   ServiceFQDN: "my-sandbox.sandbox-shutdown-test-1770086557148676937.svc.cluster.local",
    +   ServiceFQDN: "",
    -   Service:     "my-sandbox",
    +   Service:     "",

I fixed the logic by ensuring that the Service is always reconciled. I think that should do the trick.
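
For readers following along, here is a minimal Go sketch of what "always reconciled" could look like. It is not the code from this PR: the types, `fakeCluster`, and `reconcileService` are hypothetical, simplified stand-ins, and an in-memory map replaces the API server. The point it illustrates is that the status fields are filled in on every pass, whether the Service was just created or already existed, which would avoid the empty `Service` and `ServiceFQDN` values seen in the failure above.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the real API types.
type Service struct {
	Name      string
	Namespace string
}

type SandboxStatus struct {
	Service     string
	ServiceFQDN string
}

type Sandbox struct {
	Name      string
	Namespace string
	Status    SandboxStatus
}

// fakeCluster is an in-memory stand-in for the API server,
// keyed by "namespace/name".
type fakeCluster struct {
	services map[string]*Service
}

func key(namespace, name string) string { return namespace + "/" + name }

// reconcileService looks up the Service for the sandbox and creates it only
// if it is missing. The status is then set unconditionally, outside the
// create branch, so a pre-existing Service is still reflected in the status.
func reconcileService(c *fakeCluster, sb *Sandbox) {
	k := key(sb.Namespace, sb.Name)
	svc, ok := c.services[k]
	if !ok {
		svc = &Service{Name: sb.Name, Namespace: sb.Namespace}
		c.services[k] = svc
	}
	sb.Status.Service = svc.Name
	sb.Status.ServiceFQDN = fmt.Sprintf("%s.%s.svc.cluster.local", svc.Name, svc.Namespace)
}

func main() {
	// The Service already exists before this reconcile pass runs.
	c := &fakeCluster{services: map[string]*Service{
		key("demo", "my-sandbox"): &Service{Name: "my-sandbox", Namespace: "demo"},
	}}
	sb := &Sandbox{Name: "my-sandbox", Namespace: "demo"}

	reconcileService(c, sb)
	fmt.Println(sb.Status.Service)     // my-sandbox
	fmt.Println(sb.Status.ServiceFQDN) // my-sandbox.demo.svc.cluster.local
}
```

Calling `reconcileService` a second time leaves both the stored Service and the status unchanged, which is the idempotent behaviour a reconcile loop generally wants.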

@antonipp force-pushed the ai/sts-to-deployment branch from 72fef4b to e7af63f on February 3, 2026 13:57
@k8s-ci-robot added the "size/M" label (Denotes a PR that changes 30-99 lines, ignoring generated files.) and removed the "size/S" label (Denotes a PR that changes 10-29 lines, ignoring generated files.) Feb 3, 2026
@antonipp

antonipp commented Feb 3, 2026

Contributor Author

It looks like there was also an RBAC issue after I enabled leader election. I fixed that too, and the tests now seem to pass.

Comment thread controllers/sandbox_controller.go Outdated
Comment thread controllers/sandbox_controller.go Outdated
Signed-off-by: Anton Ippolitov <anton.ippolitov@datadoghq.com>
@antonipp force-pushed the ai/sts-to-deployment branch from adf9789 to 91e6a31 on February 18, 2026 08:58
@aditya-shantanu

Collaborator

/lgtm

@k8s-ci-robot

Contributor

@aditya-shantanu: changing LGTM is restricted to collaborators

Details

In response to this:

/lgtm

@janetkuo janetkuo left a comment

Member

/lgtm

@k8s-ci-robot added the "lgtm" label ("Looks good to me", indicates that a PR is ready to be merged.) Feb 24, 2026
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: antonipp, janetkuo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the "approved" label (Indicates a PR has been approved by an approver from all required OWNERS files.) Feb 24, 2026
@k8s-ci-robot k8s-ci-robot merged commit 314cabc into kubernetes-sigs:main Feb 24, 2026
10 checks passed
Labels

approved - Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
lgtm - "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test - Indicates a non-member PR verified by an org member that is safe to test.
release-note-action-required - Denotes a PR that introduces potentially breaking changes that require user action.
size/M - Denotes a PR that changes 30-99 lines, ignoring generated files.

6 participants