fix(workload): resolve Kubernetes v0.15+ condition sorting issue #1149

Merged: jamOne- merged 8 commits into AI-Hypercomputer:main from jamOne-:list-fix on Mar 20, 2026

jamOne- (Collaborator) commented Mar 19, 2026

Description

Previously, XPK's logic for fetching workload status took the last item in the `.status.conditions` array (i.e., `conditions[-1]`). Starting with Kueue v0.15 (and because of Kubernetes Server-Side Apply map sorting behavior), conditions are sorted alphabetically by their `type` field rather than chronologically. As a result, a critical status like `Finished` can appear before `QuotaReserved` in the JSON array, so the last item is no longer the most recent condition, and `xpk workload list` and the wait logic reported stale or incorrect status types.
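
The ordering problem can be reproduced in a few lines of Python. This is an illustrative sketch, not code from XPK: the condition values and timestamps are made up, and the alphabetical order is simulated with `sorted` rather than fetched from a cluster.

```python
# Hypothetical conditions, shaped like a Kubernetes status.conditions array
# and listed in the order in which they were set.
conditions_chronological = [
    {"type": "QuotaReserved", "status": "True", "lastTransitionTime": "2026-03-19T10:00:00Z"},
    {"type": "Admitted", "status": "True", "lastTransitionTime": "2026-03-19T10:00:05Z"},
    {"type": "Finished", "status": "True", "lastTransitionTime": "2026-03-19T10:30:00Z"},
]

# Simulate the type-sorted ordering described in this PR.
conditions_sorted_by_type = sorted(conditions_chronological, key=lambda c: c["type"])

print(conditions_chronological[-1]["type"])   # Finished
print(conditions_sorted_by_type[-1]["type"])  # QuotaReserved
```

With the type-sorted array, `[-1]` yields `QuotaReserved` even though `Finished` has the later `lastTransitionTime`.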

Issue

b/494200808

Testing

  • unit tests

jamOne- force-pushed the list-fix branch 2 times, most recently from 840f1be to 53e163c, March 19, 2026 15:05
- Replace brittle `[-1]` array indices when querying Kubernetes `status.conditions`.
- Update `_parse_workload_item` to dynamically determine the latest condition by comparing `lastTransitionTime` (unconditionally).
- Refactor `kubectl wait` to use native order-agnostic flag `--for=condition=Finished`.
- Change `kubectl get jobset` to parse full JSON via Python instead of fragile JSONPath.
- Update tests to match new command structures.
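
As a rough illustration of the `lastTransitionTime` comparison mentioned in this commit message, the following sketch picks the most recent condition regardless of array order. It is not the code from this PR: the helper names `parse_k8s_time` and `latest_condition` and the sample data are hypothetical, and it assumes timestamps in the UTC `...Z` form that Kubernetes normally emits.

```python
from datetime import datetime
from typing import Optional


def parse_k8s_time(value: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp such as '2026-03-19T10:30:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_condition(conditions: list[dict]) -> Optional[dict]:
    """Return the condition with the newest lastTransitionTime, or None."""
    # Ignore conditions that carry no timestamp at all.
    timed = [c for c in conditions if c.get("lastTransitionTime")]
    if not timed:
        return None
    return max(timed, key=lambda c: parse_k8s_time(c["lastTransitionTime"]))


# Hypothetical conditions, listed alphabetically by type.
conditions = [
    {"type": "Admitted", "lastTransitionTime": "2026-03-19T10:00:05Z"},
    {"type": "Finished", "lastTransitionTime": "2026-03-19T10:30:00Z"},
    {"type": "QuotaReserved", "lastTransitionTime": "2026-03-19T10:00:00Z"},
]

latest = latest_condition(conditions)
print(latest["type"])  # Finished
```

Comparing parsed timestamps instead of positions keeps the result stable whether the server returns conditions chronologically or sorted by type.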
jamOne- added 2 commits March 19, 2026 16:44
- Created  to encapsulate timestamp-based sorting.
- Extracted JobSet JSON parsing into  utilizing the new helper function.
- Cleaned up  and .
- Added new test coverage for  success and failure cases.
- Improved  mock assertions.
- Introduced `_KubernetesStatus` and `_KubernetesCondition` minimal dataclasses.
- Added `_parse_kubernetes_status` to encapsulate raw JSON to dataclass mapping, ensuring fields strictly remain `None` when empty/missing.
- Refactored `_get_latest_condition`, `_parse_workload_item`, and `_get_jobset_status` to strictly consume these dataclasses instead of iterating through dicts directly.
jamOne- marked this pull request as ready for review March 20, 2026 09:22

SikaGrr (Collaborator) left a comment

Alphabetically sorted, ouch!

jamOne- added this pull request to the merge queue Mar 20, 2026
Merged via the queue into AI-Hypercomputer:main with commit c4add69 Mar 20, 2026
14 checks passed
jamOne- deleted the list-fix branch March 20, 2026 11:09
2 participants