fix: model name for plot by debermudez · Pull Request #998 · ai-dynamo/aiperf

debermudez · 2026-05-26T23:21:49Z

Summary by CodeRabbit

Improvements
- Improved data loading: labels serialized only when present and model name resolution now prefers the newer config format with reliable fallback to legacy formats.
Tests
- Added unit tests verifying model name extraction across new and legacy configuration scenarios, including precedence and empty-value handling.

github-actions · 2026-05-26T23:21:59Z

Try out this PR

Quick install:

pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@d76516b9ab80794a4510dc75a3585dfcd7a757e4

Recommended with virtual environment (using uv):

uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@d76516b9ab80794a4510dc75a3585dfcd7a757e4

Last updated for commit: d76516b • Browse code

coderabbitai · 2026-05-26T23:26:14Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2a35d31a-350c-4b01-8103-6c0a1df136ec

📥 Commits

Reviewing files that changed from the base of the PR and between 1166a71 and d76516b.

📒 Files selected for processing (2)

src/aiperf/plot/core/data_loader.py
tests/unit/plot/test_data_loader.py

🚧 Files skipped from review as they are similar to previous changes (2)

src/aiperf/plot/core/data_loader.py
tests/unit/plot/test_data_loader.py

Walkthrough

Add DataLoader._extract_model_name to resolve model names preferring YAML v2 structure, update _extract_metadata to use it, refactor Parquet labels_json generation to serialize only non-empty/non-NaN labels, and add four unit tests covering model extraction precedence and fallback.

Changes

Model extraction with YAML v2 precedence and label JSON refactoring

Layer / File(s)	Summary
Label JSON construction refactor `src/aiperf/plot/core/data_loader.py`	Parquet loader's `labels_json` generation rewritten to filter non-NaN, non-empty label columns and serialize with sorted keys when labels exist, otherwise return `"{}"`.
Model name extraction helper and metadata integration `src/aiperf/plot/core/data_loader.py`	New `_extract_model_name(config)` prefers YAML v2 `models.items[0].name` over legacy `endpoint.model_names[0]`. `_extract_metadata` now derives `model` by calling this helper.
Model extraction behavior and precedence tests `tests/unit/plot/test_data_loader.py`	Four tests added to assert model is taken from YAML v2 when present, YAML v2 takes precedence over legacy, legacy is used when YAML v2 is absent, and an empty `models.items` does not override a valid legacy `endpoint.model_names`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hop through configs, tidy and neat,
YAML v2 whispers the model to meet.
Legacy waits if the new tree is bare,
Labels sorted, empty ones spared.
Tests verify each careful affair.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'fix: model name for plot' directly relates to the main change: updating model name extraction logic in the DataLoader to handle both YAML v2 and legacy formats, with corresponding test additions.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/aiperf/plot/core/data_loader.py`:
- Around line 1624-1636: The extract_model logic currently trusts
items[0].get("name") and endpoint.get("model_names") without type checks; update
the checks inside extract_model to validate types: for the YAML v2 path ensure
items is list, items[0] is dict, then set name = items[0].get("name") and only
return it if isinstance(name, str) and name (non-empty); for the legacy path
ensure names = endpoint.get("model_names") is a list and that names[0] is a
non-empty string (isinstance(names[0], str) and names[0]) before returning
names[0]; keep existing flow otherwise.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 758a2f14-48a6-4dba-b49e-dc00a697d010

📥 Commits

Reviewing files that changed from the base of the PR and between 03167d0 and 1166a71.

📒 Files selected for processing (2)

src/aiperf/plot/core/data_loader.py
tests/unit/plot/test_data_loader.py

codecov · 2026-05-26T23:31:09Z

Codecov Report

❌ Patch coverage is 92.85714% with 1 line in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/aiperf/plot/core/data_loader.py	92.85%	0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

…1001)

The aggregate JSON written per sweep cell (`profile_export_aiperf_aggregate.json`) carries no `input_config` block, so the plot loader resolved `model = None` on those runs and the legend rendered "Unknown" — even after the YAML-v2/legacy fallback fix in #998, which only addresses the per-trial path. Write side (`_sweep_aggregate._export_one_variation_aggregate`): resolve the model from `plan.configs[variation.index].models.items[0]`, matching by label against the variation key. Stamp the result onto `aggregate_result.metadata["model"]` (alongside `variation_label`, `variation_values`, `sweep_mode`) so the aggregate file becomes self-describing. Fall back to `configs[0]` when the label is unmatched. Read side (`DataLoader._extract_metadata`): after the input_config lookup, if `model` is still None, pick it up from `aggregated["metadata"]["model"]`. Per-trial artifacts still win when both shapes are present. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>

YAML v2 artifacts leave ``input_config.loadgen`` as ``null`` and attach per-phase load characteristics (``concurrency``, ``requests``) to the ``input_config.phases[]`` entries instead. The plot loader still only checked ``loadgen.concurrency`` / ``loadgen.request_count``, so v2 artifacts came back with ``RunMetadata.concurrency = None``. In ``MultiRunPNGExporter._runs_to_dataframe`` the ``or 1`` fallback collapsed every row to ``concurrency = 1``, fusing every cell into a single uncertainty point instead of one per swept value. Same class of shape-mismatch as the model name fix in #998: add a small helper that prefers the v2 ``phases[].{concurrency, requests}`` (read from the ``profiling`` phase when present, else the first phase) and falls back to the legacy ``loadgen.{concurrency, request_count}``. Defensive: ``loadgen`` may be the literal ``null`` (None) in v2 JSON, which would have crashed the prior ``"x" in config["loadgen"]`` form if a future change stops eliding null fields at serialization time. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>

The aggregate JSON written per sweep cell (`profile_export_aiperf_aggregate.json`) carries no `input_config` block, so the plot loader resolved `model = None` on those runs and the legend rendered "Unknown" — even after the YAML-v2/legacy fallback fix in #998, which only addresses the per-trial path. Write side (`_sweep_aggregate._export_one_variation_aggregate`): resolve the model from `plan.configs[variation.index].models.items[0]`, matching by label against the variation key. Stamp the result onto `aggregate_result.metadata["model"]` (alongside `variation_label`, `variation_values`, `sweep_mode`) so the aggregate file becomes self-describing. Fall back to `configs[0]` when the label is unmatched. Read side (`DataLoader._extract_metadata`): after the input_config lookup, if `model` is still None, pick it up from `aggregated["metadata"]["model"]`. Per-trial artifacts still win when both shapes are present. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>

YAML v2 artifacts leave ``input_config.loadgen`` as ``null`` and attach per-phase load characteristics (``concurrency``, ``requests``) to the ``input_config.phases[]`` entries instead. The plot loader still only checked ``loadgen.concurrency`` / ``loadgen.request_count``, so v2 artifacts came back with ``RunMetadata.concurrency = None``. In ``MultiRunPNGExporter._runs_to_dataframe`` the ``or 1`` fallback collapsed every row to ``concurrency = 1``, fusing every cell into a single uncertainty point instead of one per swept value. Same class of shape-mismatch as the model name fix in #998: add a small helper that prefers the v2 ``phases[].{concurrency, requests}`` (read from the ``profiling`` phase when present, else the first phase) and falls back to the legacy ``loadgen.{concurrency, request_count}``. Defensive: ``loadgen`` may be the literal ``null`` (None) in v2 JSON, which would have crashed the prior ``"x" in config["loadgen"]`` form if a future change stops eliding null fields at serialization time. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>

github-actions Bot added the fix label May 26, 2026

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

Comment thread src/aiperf/plot/core/data_loader.py

dynamo-ops approved these changes May 26, 2026

View reviewed changes

FrankD412 approved these changes May 27, 2026

View reviewed changes

debermudez added 2 commits May 26, 2026 17:40

Fix model name for plot

c272bb9

Add inline comment

d76516b

debermudez force-pushed the fix-plot-data-loader branch from 1166a71 to d76516b Compare May 27, 2026 00:41

debermudez enabled auto-merge (squash) May 27, 2026 00:46

debermudez merged commit f070f5a into main May 27, 2026
25 checks passed

debermudez deleted the fix-plot-data-loader branch May 27, 2026 00:46

debermudez mentioned this pull request May 27, 2026

fix(plot): cherry-pick #998 (model name for plot) to release/0.9.0 #1001

Merged

2 tasks

saturley-hall pushed a commit that referenced this pull request May 27, 2026

fix(plot): cherry-pick #998 (model name for plot) to release/0.9.0 (#…

0c73ad0

…1001)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: model name for plot#998

fix: model name for plot#998
debermudez merged 2 commits into
mainfrom
fix-plot-data-loader

debermudez commented May 26, 2026 •

edited by coderabbitai Bot

Loading

github-actions Bot commented May 26, 2026 •

edited

Loading

coderabbitai Bot commented May 26, 2026 •

edited

Loading

coderabbitai Bot left a comment

Uh oh!

codecov Bot commented May 26, 2026

Uh oh!

Labels

3 participants

Uh oh!

Conversation

debermudez commented May 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

github-actions Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Try out this PR

coderabbitai Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented May 26, 2026

Codecov Report

Uh oh!

Labels

3 participants

debermudez commented May 26, 2026 •

edited by coderabbitai Bot

Loading

github-actions Bot commented May 26, 2026 •

edited

Loading

coderabbitai Bot commented May 26, 2026 •

edited

Loading