fix: pass missing gts argument to _dump_generations call #528
Open
Alexi5000 wants to merge 2 commits into
The `RayPPOTrainer._dump_generations()` method requires a `gts` (ground truths) positional argument, but both `AgentLightningTrainer._train_step` and `EnvAgentLightningTrainer._train_step` omit it, causing a TypeError at runtime when `rollout_data_dir` is configured. Pass `gts=None` since ground truth is not available in agent mode training. Also remove a leftover `print(batch.batch.keys())` debug statement from both call sites. Fixes microsoft#492
Contributor
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR aligns rollout generation dumping behavior across the two VERL trainer implementations by removing a stray debug print and passing an explicit gts argument into _dump_generations.
Changes:
- Removed `print(batch.batch.keys())` debug output during rollout dumping.
- Added `gts=None` to `_dump_generations(...)` calls in both trainer implementations.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| contrib/agentlightning/contrib/algorithm/env_verl/trainer.py | Removes debug print and adds gts=None when dumping generations. |
| agentlightning/verl/trainer.py | Mirrors the same rollout dumping change for consistency. |
Author
@microsoft-github-policy-service agree
verl >=0.6.0 added a required `gts` (ground truths) parameter to `RayPPOTrainer._dump_generations()`, but the CI-pinned verl 0.5.0 has no such parameter. Passing `gts=None` unconditionally raised a TypeError on 0.5.0 (unexpected keyword argument 'gts'). Feature-detect the `_dump_generations` signature with `inspect` and only pass `gts=None` when the parameter exists, so the fix works on both verl 0.5.0 and 0.6.0. Ground truth is unavailable in agent-mode training. Applies to both AgentLightningTrainer._train_step and EnvAgentLightningTrainer._train_step.
Summary
- Fix the `TypeError` in `_dump_generations()` that occurs when `rollout_data_dir` is configured, in both `AgentLightningTrainer._train_step` and `EnvAgentLightningTrainer._train_step`
- Remove the leftover `print(batch.batch.keys())` debug statement from both call sites

Problem
`RayPPOTrainer._dump_generations()` is called from both trainer subclasses, but its signature changed between verl releases:
- verl 0.5.0 (pinned in CI, `examples-compat.yml`): `_dump_generations(self, inputs, outputs, scores, reward_extra_infos_dict, dump_path)`, with no `gts` parameter
- verl >=0.6.0: `_dump_generations(self, inputs, outputs, gts, scores, reward_extra_infos_dict, dump_path)`, which adds a required `gts` (ground truths) positional parameter

The original call sites omitted `gts` entirely, which raises a `TypeError` on verl 0.6.0. Conversely, unconditionally passing `gts=None` breaks on verl 0.5.0 with a `TypeError` (unexpected keyword argument 'gts').

Ground truth is not available in agent-mode training, so `None` is the correct value when the parameter exists.

Fix
Feature-detect the `_dump_generations` signature with `inspect.signature(...)` and only pass `gts=None` when the `gts` parameter is present. This works on both verl 0.5.0 and 0.6.0.

Fixes #492
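
The feature detection described in the Fix section can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual diff: `dump_generations_compat`, `OldStyleTrainer`, and `NewStyleTrainer` are hypothetical names, and the two stub classes only mimic the two `_dump_generations` signatures quoted in the Problem section rather than importing verl.

```python
import inspect


def dump_generations_compat(trainer, *, inputs, outputs, scores,
                            reward_extra_infos_dict, dump_path):
    """Call trainer._dump_generations, adding gts=None only if it is accepted.

    Hypothetical helper for illustration; the PR applies the same idea
    inline at the two _train_step call sites.
    """
    kwargs = {
        "inputs": inputs,
        "outputs": outputs,
        "scores": scores,
        "reward_extra_infos_dict": reward_extra_infos_dict,
        "dump_path": dump_path,
    }
    # The signature of a bound method excludes `self`, so this lists only
    # the parameters a caller can actually supply.
    params = inspect.signature(trainer._dump_generations).parameters
    if "gts" in params:
        # Ground truth is not available in agent-mode training.
        kwargs["gts"] = None
    return trainer._dump_generations(**kwargs)


# Hypothetical stand-ins that mimic the two signatures from the PR description.
class OldStyleTrainer:
    """Signature shape without `gts` (as described for verl 0.5.0)."""

    def _dump_generations(self, inputs, outputs, scores,
                          reward_extra_infos_dict, dump_path):
        return ("old", dump_path)


class NewStyleTrainer:
    """Signature shape with a required `gts` (as described for verl >=0.6.0)."""

    def _dump_generations(self, inputs, outputs, gts, scores,
                          reward_extra_infos_dict, dump_path):
        return ("new", gts)
```

Passing every argument by keyword keeps the call independent of where `gts` sits in the positional order, which is why the same call works against both signature shapes.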
Files Changed
- `agentlightning/verl/trainer.py`: `AgentLightningTrainer._train_step`
- `contrib/agentlightning/contrib/algorithm/env_verl/trainer.py`: `EnvAgentLightningTrainer._train_step`

Test plan
- Train with `rollout_data_dir` configured on verl 0.5.0: no `TypeError`
- Train with `rollout_data_dir` configured on verl 0.6.0: no `TypeError`, `gts=None` passed
- Train without `rollout_data_dir`: no behavior change