fix(tokenizer): hint transformers upgrade on missing class (#960)#971
Conversation
Try out this PRQuick install: pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@91fec931a7266ab5df72a48ba4830dd82c735c81Recommended with virtual environment (using uv): uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@91fec931a7266ab5df72a48ba4830dd82c735c81Last updated for commit: |
a66995b to
aed74c9
Compare
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughDetects HuggingFace missing-tokenizer-class errors with regex, optionally suggests a transformers upgrade based on installed version, integrates the hint into ChangesTokenizer error hints
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/unit/common/test_tokenizer_error_hints.py (1)
18-18: ⚡ Quick winRename test methods to match the required test naming convention.
At Line 18, Line 24, Line 27, Line 45, Line 56, and Line 70, test names should follow
test_<function>_<scenario>_<expected>for consistency and discoverability in this repo.As per coding guidelines,
tests/**/*.py: Test naming convention:test_<function>_<scenario>_<expected>.Also applies to: 24-24, 27-27, 45-45, 56-56, 70-70
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/common/test_tokenizer_error_hints.py` at line 18, Rename the test methods in this file to follow the repository convention test_<function>_<scenario>_<expected>; e.g. change test_matches_hf_missing_class_error to a descriptive three-part name such as test_matches_hf_missing_class_error_when_model_missing_raises_hint (or similar) and apply the same pattern to the other test functions referenced in the review so each name clearly states the function under test, the scenario, and the expected outcome.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/unit/common/test_tokenizer_error_hints.py`:
- Line 18: Rename the test methods in this file to follow the repository
convention test_<function>_<scenario>_<expected>; e.g. change
test_matches_hf_missing_class_error to a descriptive three-part name such as
test_matches_hf_missing_class_error_when_model_missing_raises_hint (or similar)
and apply the same pattern to the other test functions referenced in the review
so each name clearly states the function under test, the scenario, and the
expected outcome.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 803fbe3b-e68e-45d6-9253-bfaae6264ad3
📒 Files selected for processing (2)
src/aiperf/common/tokenizer.pytests/unit/common/test_tokenizer_error_hints.py
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
aed74c9 to
f113b44
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/unit/common/test_tokenizer_error_hints.py (1)
18-83: ⚡ Quick winAlign test method names with the repo convention.
Several names in Line 18 through Line 83 don’t follow
test_<function>_<scenario>_<expected>, which makes intent less uniform across the suite. Please rename these tests to match the required pattern.As per coding guidelines:
tests/**/*.py: “Test naming convention:test_<function>_<scenario>_<expected>”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/common/test_tokenizer_error_hints.py` around lines 18 - 83, Rename the test functions to follow test_<function>_<scenario>_<expected> naming: in this file update _missing_tokenizer_class_hint tests — rename test_matches_hf_missing_class_error to test__missing_tokenizer_class_hint_hf_missing_class_error_matches_hint, test_returns_none_for_unrelated_error to test__missing_tokenizer_class_hint_unrelated_value_error_returns_none, and test_returns_none_for_repo_not_found_error to test__missing_tokenizer_class_hint_repo_not_found_oserror_returns_none; rename test_v5_suggestion_only_on_v4 to test__missing_tokenizer_class_hint_transformers_v4_shows_v5_suggestion_else_omits; and in class TestFromPretrainedWrapsHint rename test_wraps_missing_class_error_with_hint to test_Tokenizer_from_pretrained_missing_class_error_includes_hint and test_wraps_unrelated_error_without_hint to test_Tokenizer_from_pretrained_unrelated_error_omits_v5_suggestion so names reference the tested functions/methods ( _missing_tokenizer_class_hint and Tokenizer.from_pretrained / Tokenizer._load_from_hub ) and follow the repository convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/unit/common/test_tokenizer_error_hints.py`:
- Around line 18-83: Rename the test functions to follow
test_<function>_<scenario>_<expected> naming: in this file update
_missing_tokenizer_class_hint tests — rename test_matches_hf_missing_class_error
to test__missing_tokenizer_class_hint_hf_missing_class_error_matches_hint,
test_returns_none_for_unrelated_error to
test__missing_tokenizer_class_hint_unrelated_value_error_returns_none, and
test_returns_none_for_repo_not_found_error to
test__missing_tokenizer_class_hint_repo_not_found_oserror_returns_none; rename
test_v5_suggestion_only_on_v4 to
test__missing_tokenizer_class_hint_transformers_v4_shows_v5_suggestion_else_omits;
and in class TestFromPretrainedWrapsHint rename
test_wraps_missing_class_error_with_hint to
test_Tokenizer_from_pretrained_missing_class_error_includes_hint and
test_wraps_unrelated_error_without_hint to
test_Tokenizer_from_pretrained_unrelated_error_omits_v5_suggestion so names
reference the tested functions/methods ( _missing_tokenizer_class_hint and
Tokenizer.from_pretrained / Tokenizer._load_from_hub ) and follow the repository
convention.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: caba1294-b9e9-4d95-9f3d-da06032d1ba8
📒 Files selected for processing (4)
src/aiperf/common/tokenizer.pysrc/aiperf/common/tokenizer_display.pytests/unit/common/test_tokenizer_display.pytests/unit/common/test_tokenizer_error_hints.py
lkomali
left a comment
There was a problem hiding this comment.
LGTM!
agree with dynamo-ops comment.
762ee2b to
19e46cc
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/unit/common/test_tokenizer_display.py (1)
286-295: ⚡ Quick winAlign this parametrized block with repo test style conventions.
Please use
from pytest import paramand place# fmt: skipon the closing)line for this@pytest.mark.parametrizeblock.Proposed diff
import pytest +from pytest import param @@ `@pytest.mark.parametrize`( ("version", "expect_v5_upgrade"), [ - pytest.param("4.57.3", True, id="v4-suggests-upgrade"), - pytest.param("4.56.0", True, id="v4-floor-suggests-upgrade"), - pytest.param("5.0.1", False, id="v5-omits-suggestion"), - pytest.param("5.8.0", False, id="v5-omits-suggestion-later"), - pytest.param("<unknown>", False, id="unknown-omits-suggestion"), + param("4.57.3", True, id="v4-suggests-upgrade"), + param("4.56.0", True, id="v4-floor-suggests-upgrade"), + param("5.0.1", False, id="v5-omits-suggestion"), + param("5.8.0", False, id="v5-omits-suggestion-later"), + param("<unknown>", False, id="unknown-omits-suggestion"), ], - ) + ) # fmt: skipAs per coding guidelines,
tests/**/*.pyshould usefrom pytest import paramand include# fmt: skipon the)line for@pytest.mark.parametrize.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/common/test_tokenizer_display.py` around lines 286 - 295, Replace usages of pytest.param in the `@pytest.mark.parametrize` block for ("version", "expect_v5_upgrade") with param from a top-level import (add "from pytest import param" to the test module) and change each pytest.param(...) to param(...); also add the comment "# fmt: skip" on the closing ")" line of that `@pytest.mark.parametrize` block to match test style conventions. Locate the decorator that lists versions "4.57.3", "4.56.0", "5.0.1", "5.8.0", "<unknown>" and update it accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/unit/common/test_tokenizer_display.py`:
- Around line 286-295: Replace usages of pytest.param in the
`@pytest.mark.parametrize` block for ("version", "expect_v5_upgrade") with param
from a top-level import (add "from pytest import param" to the test module) and
change each pytest.param(...) to param(...); also add the comment "# fmt: skip"
on the closing ")" line of that `@pytest.mark.parametrize` block to match test
style conventions. Locate the decorator that lists versions "4.57.3", "4.56.0",
"5.0.1", "5.8.0", "<unknown>" and update it accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0e16cb11-0002-4ede-8685-dc7f8cf346b9
📒 Files selected for processing (4)
src/aiperf/common/tokenizer.pysrc/aiperf/common/tokenizer_display.pytests/unit/common/test_tokenizer_display.pytests/unit/common/test_tokenizer_error_hints.py
Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
Mirror the version-aware fallback handling from _missing_tokenizer_class_hint in the CLI panel: read transformers.__version__ with the same (ImportError, AttributeError) fallback to "<unknown>", and only surface the "pip install -U 'transformers>=5'" fix when the installed major version is 4. Surface the detected version in the cause text so users on v5/unknown see why no upgrade hint applies. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
Move the version-aware hint construction out of _detect_error to keep its cyclomatic complexity under the C901 threshold. Behavior is unchanged. Signed-off-by: Elias Bermudez <dbermudez@nvidia.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
Replace `import transformers; transformers.__version__` with
`importlib.metadata.version("transformers")` in
`_missing_tokenizer_class_insight`, falling back to "<unknown>" on
`PackageNotFoundError`. This avoids importing transformers from the
error-display path and reads the version straight from package metadata.
Tests now patch `_pkg_version` instead of `transformers.__version__`.
Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Elias Bermudez <dbermudez@nvidia.com>
4ff7605 to
91fec93
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/unit/common/test_tokenizer_display.py (1)
286-298: ⚡ Quick winAlign new parametrized test with repo pytest and typing conventions.
Use
from pytest import paramfor the table entries and add explicit parameter types on the new test method.Proposed patch
@@ -import pytest +import pytest +from pytest import param @@ - `@pytest.mark.parametrize`( + `@pytest.mark.parametrize`( ("version", "expect_v5_upgrade"), [ - pytest.param("4.57.3", True, id="v4-suggests-upgrade"), - pytest.param("4.56.0", True, id="v4-floor-suggests-upgrade"), - pytest.param("5.0.1", False, id="v5-omits-suggestion"), - pytest.param("5.8.0", False, id="v5-omits-suggestion-later"), - pytest.param("<unknown>", False, id="unknown-omits-suggestion"), + param("4.57.3", True, id="v4-suggests-upgrade"), + param("4.56.0", True, id="v4-floor-suggests-upgrade"), + param("5.0.1", False, id="v5-omits-suggestion"), + param("5.8.0", False, id="v5-omits-suggestion-later"), + param("<unknown>", False, id="unknown-omits-suggestion"), ], - ) + ) # fmt: skip def test_missing_tokenizer_class_upgrade_only_on_v4( - self, console_output, version, expect_v5_upgrade - ): + self, + console_output: tuple[object, object], + version: str, + expect_v5_upgrade: bool, + ) -> None:As per coding guidelines: "
tests/**/*.py: Usefrom pytest import paramand put# fmt: skipon the)line for parametrized tests" and "**/*.py: Include type hints on ALL functions (params and return)`."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/common/test_tokenizer_display.py` around lines 286 - 298, The parametrized test uses pytest.param and lacks type hints and the required formatter directive; update the decorator table to use param (add from pytest import param at top) and place "# fmt: skip" on the closing ")" of the `@pytest.mark.parametrize` block, and add explicit type annotations for all function parameters and the return type on test_missing_tokenizer_class_upgrade_only_on_v4 (e.g., annotate console_output, version, expect_v5_upgrade and -> None) so the test follows the repo pytest/typing conventions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/unit/common/test_tokenizer_display.py`:
- Around line 286-298: The parametrized test uses pytest.param and lacks type
hints and the required formatter directive; update the decorator table to use
param (add from pytest import param at top) and place "# fmt: skip" on the
closing ")" of the `@pytest.mark.parametrize` block, and add explicit type
annotations for all function parameters and the return type on
test_missing_tokenizer_class_upgrade_only_on_v4 (e.g., annotate console_output,
version, expect_v5_upgrade and -> None) so the test follows the repo
pytest/typing conventions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3c26e3ca-1b5c-46ba-84e2-ed20473243f4
📒 Files selected for processing (4)
src/aiperf/common/tokenizer.pysrc/aiperf/common/tokenizer_display.pytests/unit/common/test_tokenizer_display.pytests/unit/common/test_tokenizer_error_hints.py
Summary by CodeRabbit
Example error: