
feat: implement agent_sandbox_creation_latency_ms metric #425

Merged
k8s-ci-robot merged 7 commits into kubernetes-sigs:main from chw120:feature-agent-sandbox-creation-latency-2507448849294936923 on Mar 19, 2026

Conversation

@chw120

@chw120 chw120 commented Mar 17, 2026

Contributor

This PR introduces the agent_sandbox_creation_latency_ms metric, which tracks the time from a Sandbox's creation until its associated Pod reaches the Ready state. This gives better visibility into sandbox provisioning performance across different launch types and templates.

Key Changes

  • New Prometheus Histogram: Defined SandboxCreationLatency as a histogram vector in internal/metrics/metrics.go.

    • Buckets: Configured with ranges from 50ms up to 4 minutes (50, 100, 250, 500, 1000, 2500, 5000, 10000, 30000, 60000, 120000, 240000).
    • Labels: Includes launch_type ("warm", "cold", "unknown") and sandbox_template to allow granular analysis.

    Note: I did not add the "status" label defined in the original requirement, since it would always be "success".

  • Controller Integration: Updated the SandboxClaimReconciler in extensions/controllers/sandboxclaim_controller.go to calculate and record this latency during the reconciliation process. It specifically measures the duration between sandbox.CreationTimestamp and the Pod's LastTransitionTime for the Ready condition.

  • Helper Method: Added RecordSandboxCreationLatency to the internal metrics package for clean invocation from controllers.

Testing

  • Unit Tests: Added TestSandboxLatencyRecording in internal/metrics/metrics_test.go to ensure that observations are correctly recorded and grouped by their respective launch type labels.
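
To make the "grouped by launch type" behavior concrete, here is a minimal standard-library sketch of a labelled histogram with the bucket bounds listed in the description. It is not the Prometheus client and not the code from this PR: `latencyHistogram`, `observe`, and `count` are hypothetical names, and the template name used in `main` is made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Bucket upper bounds in milliseconds, copied from the PR description.
var latencyBucketsMs = []float64{50, 100, 250, 500, 1000, 2500, 5000, 10000, 30000, 60000, 120000, 240000}

// labelKey mirrors the two labels named in the PR description.
type labelKey struct {
	LaunchType      string
	SandboxTemplate string
}

// series holds the state for one label combination. counts[i] is the number of
// observations <= latencyBucketsMs[i]; the final slot is the +Inf bucket.
type series struct {
	counts []uint64
	sum    float64
}

// latencyHistogram is a hypothetical in-memory stand-in for a histogram vector.
type latencyHistogram struct {
	series map[labelKey]*series
}

func newLatencyHistogram() *latencyHistogram {
	return &latencyHistogram{series: make(map[labelKey]*series)}
}

// observe records one latency sample (in ms) under the given label values.
func (h *latencyHistogram) observe(launchType, template string, ms float64) {
	key := labelKey{launchType, template}
	s, ok := h.series[key]
	if !ok {
		s = &series{counts: make([]uint64, len(latencyBucketsMs)+1)}
		h.series[key] = s
	}
	// First bucket whose upper bound is >= ms; buckets are cumulative from there.
	first := sort.SearchFloat64s(latencyBucketsMs, ms)
	for i := first; i < len(s.counts); i++ {
		s.counts[i]++
	}
	s.sum += ms
}

// count returns the total number of observations for one label combination.
func (h *latencyHistogram) count(launchType, template string) uint64 {
	s, ok := h.series[labelKey{launchType, template}]
	if !ok {
		return 0
	}
	return s.counts[len(s.counts)-1]
}

func main() {
	h := newLatencyHistogram()
	h.observe("warm", "example-template", 40) // hypothetical template name
	h.observe("warm", "example-template", 120)
	h.observe("cold", "example-template", 8000)
	fmt.Println(h.count("warm", "example-template"), h.count("cold", "example-template")) // prints "2 1"
}
```

Each distinct (launch_type, sandbox_template) pair gets its own series, which is why a test can assert on the warm and cold counts independently.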

Working on #245

@netlify

netlify Bot commented Mar 17, 2026


Deploy Preview for agent-sandbox canceled.

Name Link
🔨 Latest commit a1e578b
🔍 Latest deploy log https://app.netlify.com/projects/agent-sandbox/deploys/69bc2a6292483500082f7749
@linux-foundation-easycla

linux-foundation-easycla Bot commented Mar 17, 2026


CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot

Contributor

Hi @chw120. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 17, 2026
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
@chw120 chw120 force-pushed the feature-agent-sandbox-creation-latency-2507448849294936923 branch from bfaba5a to 5e6e1aa Compare March 17, 2026 02:27
@chw120

chw120 commented Mar 17, 2026

Contributor Author

/easycla


@yongruilin

Contributor

@chw120 I don't think Jules signed the CLA. You might need to configure Jules to use "User Only" mode to get rid of the Co-authored-by trailer.
/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 17, 2026
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 17, 2026
Comment thread extensions/controllers/sandboxclaim_controller.go Outdated
Comment thread extensions/controllers/sandboxclaim_controller.go Outdated
Comment thread extensions/controllers/sandboxclaim_controller.go Outdated
Comment thread internal/metrics/metrics_test.go Outdated
Comment thread internal/metrics/metrics.go Outdated
Comment thread internal/metrics/metrics.go
@aditya-shantanu

Collaborator

/assign igooch

Comment thread internal/metrics/metrics.go
Comment thread internal/metrics/metrics.go
Comment thread extensions/controllers/sandboxclaim_controller.go Outdated

@igooch igooch left a comment

Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 19, 2026
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chw120, igooch

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 19, 2026
@k8s-ci-robot k8s-ci-robot merged commit a2d311d into kubernetes-sigs:main Mar 19, 2026
10 checks passed
nadolskit pushed a commit to nadolskit/agent-sandbox that referenced this pull request Mar 20, 2026
…sigs#425)

* feat: implement agent_sandbox_creation_latency_ms metric

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>

* fix(lint): remove redundant Time selector in sandboxclaim controller

* address review comments

* address review comments

* added namespace and used early returns

* fix: format

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

5 participants