fix(config): auto-detect BurstGPT CSV in DatasetResolver and pin fixed_schedule regression#984
Conversation
…d_schedule regression Signed-off-by: lkomali <lkomali@nvidia.com>
Try out this PRQuick install: pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@75b717a1425ecb7a0591b91c2daf850e6c56f559Recommended with virtual environment (using uv): uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@75b717a1425ecb7a0591b91c2daf850e6c56f559Last updated for commit: |
WalkthroughRefactors DatasetResolver JSONL probing into ChangesDataset Type Detection and CSV Handling
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/integration/test_burst_gpt_trace.py (1)
110-112:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAssert outputs in the auto-detect path as well.
The explicit-type test checks
has_all_outputs, but the auto-detect regression test currently does not. Adding the same assertion improves end-to-end parity.💡 Proposed fix
assert result.exit_code == 0 assert result.request_count == len(rows) + assert result.has_all_outputs🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/test_burst_gpt_trace.py` around lines 110 - 112, The auto-detect regression test is missing the parity check for outputs: after the existing assertions (assert result.exit_code == 0 and assert result.request_count == len(rows)) add the same end-to-end output assertion used in the explicit-type test by asserting result.has_all_outputs to ensure all expected outputs were produced; locate this in the auto-detect test in tests/integration/test_burst_gpt_trace.py and insert the assert for result.has_all_outputs immediately after the other assertions.
🧹 Nitpick comments (2)
tests/integration/test_burst_gpt_trace.py (1)
46-51: ⚡ Quick winAdd explicit return type hints to async test methods.
Both async test functions should declare
-> Noneto match the repo-wide typing rule.💡 Proposed fix
async def test_fixed_schedule_with_explicit_dataset_type( self, cli: AIPerfCLI, aiperf_mock_server: AIPerfMockServer, tmp_path: Path, - ): + ) -> None: @@ async def test_fixed_schedule_auto_detected( self, cli: AIPerfCLI, aiperf_mock_server: AIPerfMockServer, tmp_path: Path, - ): + ) -> None:As per coding guidelines: Type hints on ALL functions (parameters and return types).
Also applies to: 81-86
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/test_burst_gpt_trace.py` around lines 46 - 51, Add explicit return type hints "-> None" to the async test function definitions: update the signature of test_fixed_schedule_with_explicit_dataset_type and the other async test function in this file to declare their return type as None (e.g., "async def test_fixed_schedule_with_explicit_dataset_type(... ) -> None:") to satisfy the repository typing rule for all functions.tests/integration/utils.py (1)
136-145: ⚡ Quick winValidate required BurstGPT columns in the fixture helper.
The helper docstring promises required columns, but that contract is not enforced. Add explicit validation so malformed fixtures fail early with clear errors.
💡 Proposed fix
if not rows: raise ValueError("rows must not be empty") - fieldnames = list(rows[0].keys()) + required_columns = {"Timestamp", "Request tokens", "Response tokens"} + first_keys = set(rows[0].keys()) + missing = required_columns - first_keys + if missing: + raise ValueError( + f"rows[0] missing required BurstGPT columns: {sorted(missing)}" + ) + + fieldnames = list(rows[0].keys()) + fieldnames_set = set(fieldnames) csv_path = tmp_path / filename with open(csv_path, "w", newline="", encoding="utf-8") as f: writer = csv.DictWriter(f, fieldnames=fieldnames) writer.writeheader() for row in rows: + extra = set(row.keys()) - fieldnames_set + if extra: + raise ValueError( + f"row has unexpected columns not present in header: {sorted(extra)}" + ) writer.writerow(row)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/utils.py` around lines 136 - 145, The CSV fixture helper currently accepts rows without enforcing the required BurstGPT columns; add an explicit validation after computing fieldnames to ensure required_columns (define a list like required_columns = ["<COLUMN1>", "<COLUMN2>", ...] replaced with the actual BurstGPT-required column names) are all present in fieldnames, and if any are missing raise a ValueError that lists the missing columns; apply this check in the same function around the rows/fieldnames variables (before opening csv_path and before creating the DictWriter) so malformed fixtures fail fast with a clear error message.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/aiperf/config/dataset/resolver.py`:
- Around line 225-245: The helper _read_first_jsonl_record currently returns any
JSON value which can be a non-dict and later break callers that expect to call
.get(); update _read_first_jsonl_record to parse the first non-blank line with
load_json_str but only return the parsed value when it is an instance of dict,
otherwise return None; keep the existing behavior of propagating OSError and
still returning None for non-JSON first lines by catching ValueError as before.
---
Outside diff comments:
In `@tests/integration/test_burst_gpt_trace.py`:
- Around line 110-112: The auto-detect regression test is missing the parity
check for outputs: after the existing assertions (assert result.exit_code == 0
and assert result.request_count == len(rows)) add the same end-to-end output
assertion used in the explicit-type test by asserting result.has_all_outputs to
ensure all expected outputs were produced; locate this in the auto-detect test
in tests/integration/test_burst_gpt_trace.py and insert the assert for
result.has_all_outputs immediately after the other assertions.
---
Nitpick comments:
In `@tests/integration/test_burst_gpt_trace.py`:
- Around line 46-51: Add explicit return type hints "-> None" to the async test
function definitions: update the signature of
test_fixed_schedule_with_explicit_dataset_type and the other async test function
in this file to declare their return type as None (e.g., "async def
test_fixed_schedule_with_explicit_dataset_type(... ) -> None:") to satisfy the
repository typing rule for all functions.
In `@tests/integration/utils.py`:
- Around line 136-145: The CSV fixture helper currently accepts rows without
enforcing the required BurstGPT columns; add an explicit validation after
computing fieldnames to ensure required_columns (define a list like
required_columns = ["<COLUMN1>", "<COLUMN2>", ...] replaced with the actual
BurstGPT-required column names) are all present in fieldnames, and if any are
missing raise a ValueError that lists the missing columns; apply this check in
the same function around the rows/fieldnames variables (before opening csv_path
and before creating the DictWriter) so malformed fixtures fail fast with a clear
error message.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6b868e94-47bf-4ecb-929f-458b702e3827
📒 Files selected for processing (4)
src/aiperf/config/dataset/resolver.pytests/integration/test_burst_gpt_trace.pytests/integration/utils.pytests/unit/config/test_resolver_edge_cases.py
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
ajcasagrande
left a comment
There was a problem hiding this comment.
The fix looks functionally sound: the new resolver path lets BurstGPT CSVs reach structural loader detection, and both targeted integration tests plus a manual aiperf profile run against the in-repo mock server pass. I found one low-severity cleanup item: the expected non-JSON CSV probe still goes through load_json_str, so successful auto-detection emits a scary JSON parse warning before continuing. I’d fix that first; no correctness blocker found.
…JSON-parse warning Signed-off-by: lkomali <lkomali@nvidia.com>
…SV probe Signed-off-by: lkomali <lkomali@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/aiperf/config/dataset/resolver.py`:
- Around line 246-248: The file-reading in
DatasetResolver._read_first_jsonl_record currently calls open(file_path) which
uses the system locale and can cause inconsistent UnicodeDecodeError; change the
open call to explicitly pass encoding="utf-8" (e.g., open(file_path,
encoding="utf-8")) so the probe decoding is pinned to UTF‑8 when reading the
first JSONL record and subsequent lines.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 65684e59-46fb-415b-b764-ca0c3090b0c0
📒 Files selected for processing (3)
src/aiperf/config/dataset/resolver.pytests/integration/test_burst_gpt_trace.pytests/unit/config/test_resolver_edge_cases.py
ajcasagrande
left a comment
There was a problem hiding this comment.
Reviewed for correctness against origin/main; no blocking findings. Focused resolver unit coverage passed, and the new BurstGPT integration tests passed when selected with the integration marker.
…minism Signed-off-by: lkomali <lkomali@nvidia.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/aiperf/config/dataset/resolver.py (1)
246-247: 🏗️ Heavy liftAvoid extending the synchronous file-I/O path here.
open(...)keeps the JSONL probe on blocking I/O in the exact path this PR touches. Please route this read through an async helper instead of adding another synchronous file access point inDatasetResolver.As per coding guidelines,
**/*.py: Use async/await for ALL I/O operations - notime.sleep, no blocking calls.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/aiperf/config/dataset/resolver.py` around lines 246 - 247, The synchronous open(...) call in DatasetResolver (the method that probes/parses the JSONL at file_path) must be replaced with an async file-read helper so we avoid blocking I/O; modify the method to await an async reader (e.g., use an existing project helper like async_read_text or use aiofiles.open within an async context) to read the file contents instead of using open(...), keep the rest of the parsing logic identical, and ensure the method signature remains async so callers can await it.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/aiperf/config/dataset/resolver.py`:
- Around line 246-247: The synchronous open(...) call in DatasetResolver (the
method that probes/parses the JSONL at file_path) must be replaced with an async
file-read helper so we avoid blocking I/O; modify the method to await an async
reader (e.g., use an existing project helper like async_read_text or use
aiofiles.open within an async context) to read the file contents instead of
using open(...), keep the rest of the parsing logic identical, and ensure the
method signature remains async so callers can await it.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2af67c68-4ff7-4371-892e-334e999cb020
📒 Files selected for processing (1)
src/aiperf/config/dataset/resolver.py
dynamo-ops
left a comment
There was a problem hiding this comment.
Previous review comments have been addressed. Approving.
Summary
DatasetResolver._detect_typebailed out of structural detection when the first line of an input file wasn't valid JSON. BurstGPT is the only CSV-format loader in the tree, so dropping a bare BurstGPT CSV (no--custom-dataset-type) tripped theValueErroron the CSV header and returned(None, None)— even thoughBurstGPTTraceDatasetLoader.can_loadrecognizes the format from its header columns on its own. Extracted a small_read_first_jsonl_recordhelper that returnsNone(instead of bailing) on a non-JSON first line so loaders' filename-basedcan_loadchecks still get a chance.OSErrorstill bails the whole detection — that case genuinely cannot proceed.burst_gpt_trace + --fixed-schedule: one integration test with explicit--custom-dataset-type, one with auto-detect (the second pins the_detect_typefix specifically). Plus two unit tests intest_resolver_edge_cases.pycovering the_detect_typecontract and the full-resolverhas_timingpass for a bare CSV.Note: the related
_check_timing_datafix for BurstGPT already shipped to main in83bdcd92("fix the nightly..."). That fix is still missing onrelease/0.9.0, where the user-reported bug was reproduced .Summary by CodeRabbit
Bug Fixes
Tests