Add LoRA fork weight loading (pre-transformers-v5 base) #654

Open
arcticfly wants to merge 1 commit into main from fix/fork-on-pre-v5

@arcticfly

Collaborator

Summary

Adds the pieces needed for `LocalBackend._experimental_fork_checkpoint` to actually
load the forked LoRA weights into the trainer (rather than just copying the
checkpoint directory and letting `from_pretrained` initialize a fresh LoRA).

  • `UnslothState.load_lora_adapter(path)` — reads `adapter_model.safetensors` and applies it to the live peft model via `set_peft_model_state_dict`.
  • `UnslothService._forked_checkpoint_dir` — records the forked path so the first `_train_dedicated` / `_train_shared` call applies it.
  • `LocalBackend._experimental_fork_checkpoint` — invalidates the `_state` cache after `shutil.copytree` and records `_forked_checkpoint_dir` on the service.
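The three pieces above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ServiceSketch`, `fork_checkpoint`, the `loader` parameter, and the `state_cache` dict are hypothetical stand-ins for `UnslothService`, `LocalBackend._experimental_fork_checkpoint`, and the `_state` cache, whose real shapes are not shown here. Only `safetensors.torch.load_file`, `peft.set_peft_model_state_dict`, and `shutil.copytree` are real library calls.

```python
from __future__ import annotations

import shutil
from pathlib import Path
from typing import Any, Callable, Optional

ADAPTER_FILENAME = "adapter_model.safetensors"


def load_lora_adapter(peft_model: Any, checkpoint_dir: Path) -> Any:
    """Overwrite the live model's LoRA weights with those saved in checkpoint_dir."""
    adapter_path = Path(checkpoint_dir) / ADAPTER_FILENAME
    if not adapter_path.is_file():
        raise FileNotFoundError(f"{ADAPTER_FILENAME} not found in {checkpoint_dir}")
    # Imported lazily so this module can be imported without the ML stack.
    from peft import set_peft_model_state_dict
    from safetensors.torch import load_file

    state_dict = load_file(str(adapter_path))  # maps tensor name -> tensor
    return set_peft_model_state_dict(peft_model, state_dict)


class ServiceSketch:
    """Hypothetical stand-in for UnslothService (assumed shape, not the real class)."""

    def __init__(
        self,
        peft_model: Any,
        loader: Callable[[Any, Path], Any] = load_lora_adapter,
    ) -> None:
        self.peft_model = peft_model
        self._loader = loader
        self._forked_checkpoint_dir: Optional[Path] = None

    def train(self) -> None:
        # Apply the forked weights once, just before the first training call.
        if self._forked_checkpoint_dir is not None:
            self._loader(self.peft_model, self._forked_checkpoint_dir)
            self._forked_checkpoint_dir = None
        # ... the actual training step would run here ...


def fork_checkpoint(
    src: Path, dst: Path, service: ServiceSketch, state_cache: dict
) -> None:
    """Copy a checkpoint, drop cached trainer state, and queue the adapter load."""
    shutil.copytree(src, dst)
    state_cache.clear()  # stands in for invalidating the `_state` cache
    service._forked_checkpoint_dir = Path(dst)
```

Clearing the recorded path after the first load keeps later training calls from overwriting weights that have already been updated by training.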

Why the unusual base

This branch is based on commit `621e82b2` (the last commit before the transformers-v5 upgrade in #629), not current main. On H200 with `load_in_4bit=True`, transformers v5 + Unsloth 2026.3.3 crash with a `Half and BFloat16` dtype-mismatch error in Unsloth's fused LoRA kernels on the first forward pass, before any rollouts. The v4 base avoids that.

Not expected to merge as-is — posting as a reference for the fork-weight-loading mechanics. Maintainers would likely want to:

  1. Resolve the v5 dtype mismatch upstream (possibly via Unsloth), then
  2. Cherry-pick the three pieces above onto main.

Test plan

  • End-to-end 20-step training on a forked `kl-000-1` checkpoint: checkpoint reloaded correctly across every step, `val/reward` started at ~0.86 (source-checkpoint quality, not raw-base-model quality).
  • End-to-end training without forking: unchanged behavior.
  • Maintainer review of whether this approach is the right shape for a forward-port.

🤖 Generated with Claude Code

Adds three pieces needed for LocalBackend._experimental_fork_checkpoint
to actually load the forked LoRA weights into the trainer:

1. UnslothState.load_lora_adapter — loads adapter_model.safetensors
   into the live peft model via set_peft_model_state_dict, replacing
   the freshly-initialized LoRA layers from from_pretrained.

2. UnslothService._forked_checkpoint_dir — stores the forked path so
   the first _train_dedicated / _train_shared call can apply it.

3. backend._experimental_fork_checkpoint — invalidates the _state cache
   after copytree, then records _forked_checkpoint_dir on the service.

Built on 621e82b (pre-transformers-v5) because v5 introduces a bf16/fp16
mismatch in Unsloth's fused LoRA kernels that crashes every forward pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>