fix: model name for plot#998
Conversation
Try out this PRQuick install: pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@d76516b9ab80794a4510dc75a3585dfcd7a757e4Recommended with virtual environment (using uv): uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@d76516b9ab80794a4510dc75a3585dfcd7a757e4Last updated for commit: |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughAdd DataLoader._extract_model_name to resolve model names preferring YAML v2 structure, update _extract_metadata to use it, refactor Parquet labels_json generation to serialize only non-empty/non-NaN labels, and add four unit tests covering model extraction precedence and fallback. ChangesModel extraction with YAML v2 precedence and label JSON refactoring
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/aiperf/plot/core/data_loader.py`:
- Around line 1624-1636: The extract_model logic currently trusts
items[0].get("name") and endpoint.get("model_names") without type checks; update
the checks inside extract_model to validate types: for the YAML v2 path ensure
items is list, items[0] is dict, then set name = items[0].get("name") and only
return it if isinstance(name, str) and name (non-empty); for the legacy path
ensure names = endpoint.get("model_names") is a list and that names[0] is a
non-empty string (isinstance(names[0], str) and names[0]) before returning
names[0]; keep existing flow otherwise.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 758a2f14-48a6-4dba-b49e-dc00a697d010
📒 Files selected for processing (2)
src/aiperf/plot/core/data_loader.pytests/unit/plot/test_data_loader.py
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
1166a71 to
d76516b
Compare
The aggregate JSON written per sweep cell (`profile_export_aiperf_aggregate.json`) carries no `input_config` block, so the plot loader resolved `model = None` on those runs and the legend rendered "Unknown" — even after the YAML-v2/legacy fallback fix in #998, which only addresses the per-trial path. Write side (`_sweep_aggregate._export_one_variation_aggregate`): resolve the model from `plan.configs[variation.index].models.items[0]`, matching by label against the variation key. Stamp the result onto `aggregate_result.metadata["model"]` (alongside `variation_label`, `variation_values`, `sweep_mode`) so the aggregate file becomes self-describing. Fall back to `configs[0]` when the label is unmatched. Read side (`DataLoader._extract_metadata`): after the input_config lookup, if `model` is still None, pick it up from `aggregated["metadata"]["model"]`. Per-trial artifacts still win when both shapes are present. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
YAML v2 artifacts leave ``input_config.loadgen`` as ``null`` and attach per-phase load characteristics (``concurrency``, ``requests``) to the ``input_config.phases[]`` entries instead. The plot loader still only checked ``loadgen.concurrency`` / ``loadgen.request_count``, so v2 artifacts came back with ``RunMetadata.concurrency = None``. In ``MultiRunPNGExporter._runs_to_dataframe`` the ``or 1`` fallback collapsed every row to ``concurrency = 1``, fusing every cell into a single uncertainty point instead of one per swept value. Same class of shape-mismatch as the model name fix in #998: add a small helper that prefers the v2 ``phases[].{concurrency, requests}`` (read from the ``profiling`` phase when present, else the first phase) and falls back to the legacy ``loadgen.{concurrency, request_count}``. Defensive: ``loadgen`` may be the literal ``null`` (None) in v2 JSON, which would have crashed the prior ``"x" in config["loadgen"]`` form if a future change stops eliding null fields at serialization time. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
The aggregate JSON written per sweep cell (`profile_export_aiperf_aggregate.json`) carries no `input_config` block, so the plot loader resolved `model = None` on those runs and the legend rendered "Unknown" — even after the YAML-v2/legacy fallback fix in #998, which only addresses the per-trial path. Write side (`_sweep_aggregate._export_one_variation_aggregate`): resolve the model from `plan.configs[variation.index].models.items[0]`, matching by label against the variation key. Stamp the result onto `aggregate_result.metadata["model"]` (alongside `variation_label`, `variation_values`, `sweep_mode`) so the aggregate file becomes self-describing. Fall back to `configs[0]` when the label is unmatched. Read side (`DataLoader._extract_metadata`): after the input_config lookup, if `model` is still None, pick it up from `aggregated["metadata"]["model"]`. Per-trial artifacts still win when both shapes are present. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
YAML v2 artifacts leave ``input_config.loadgen`` as ``null`` and attach per-phase load characteristics (``concurrency``, ``requests``) to the ``input_config.phases[]`` entries instead. The plot loader still only checked ``loadgen.concurrency`` / ``loadgen.request_count``, so v2 artifacts came back with ``RunMetadata.concurrency = None``. In ``MultiRunPNGExporter._runs_to_dataframe`` the ``or 1`` fallback collapsed every row to ``concurrency = 1``, fusing every cell into a single uncertainty point instead of one per swept value. Same class of shape-mismatch as the model name fix in #998: add a small helper that prefers the v2 ``phases[].{concurrency, requests}`` (read from the ``profiling`` phase when present, else the first phase) and falls back to the legacy ``loadgen.{concurrency, request_count}``. Defensive: ``loadgen`` may be the literal ``null`` (None) in v2 JSON, which would have crashed the prior ``"x" in config["loadgen"]`` form if a future change stops eliding null fields at serialization time. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
Summary by CodeRabbit