Skip to content

fix: pass missing gts argument to _dump_generations call#528

Open
Alexi5000 wants to merge 2 commits into
microsoft:mainfrom
Alexi5000:pr/fix-dump-generations-missing-gts
Open

fix: pass missing gts argument to _dump_generations call#528
Alexi5000 wants to merge 2 commits into
microsoft:mainfrom
Alexi5000:pr/fix-dump-generations-missing-gts

Conversation

@Alexi5000

@Alexi5000 Alexi5000 commented May 21, 2026

Copy link
Copy Markdown

Summary

  • Fix a runtime TypeError in _dump_generations() that occurs when rollout_data_dir is configured, in both AgentLightningTrainer._train_step and EnvAgentLightningTrainer._train_step
  • Make the fix version-robust across verl 0.5.0 and 0.6.0 (see Problem below)
  • Remove a leftover print(batch.batch.keys()) debug statement from both call sites

Note: This PR supersedes the duplicate PRs #493 and #522, which addressed the same bug. This version additionally handles the verl 0.5.0 / 0.6.0 signature difference so it does not break the CI-pinned verl 0.5.0.

Problem

RayPPOTrainer._dump_generations() is called from both trainer subclasses, but its signature changed between verl releases:

  • verl 0.5.0 (CI-pinned in examples-compat.yml): _dump_generations(self, inputs, outputs, scores, reward_extra_infos_dict, dump_path)no gts parameter
  • verl >=0.6.0: _dump_generations(self, inputs, outputs, gts, scores, reward_extra_infos_dict, dump_path) — adds a required gts (ground truths) positional parameter

The original call sites omitted gts entirely, which raises a TypeError on verl 0.6.0:

TypeError: RayPPOTrainer._dump_generations() missing 1 required positional argument: 'gts'

Conversely, unconditionally passing gts=None breaks on verl 0.5.0:

TypeError: _dump_generations() got an unexpected keyword argument 'gts'

Ground truth is not available in agent-mode training, so None is the correct value when the parameter exists.

Fix

Feature-detect the _dump_generations signature with inspect.signature(...) and only pass gts=None when the gts parameter is present. This works on both verl 0.5.0 and 0.6.0:

dump_kwargs = dict(
    inputs=inputs,
    outputs=outputs,
    scores=scores,
    reward_extra_infos_dict=reward_extra_infos_dict,
    dump_path=rollout_data_dir,
)
if "gts" in inspect.signature(self._dump_generations).parameters:
    dump_kwargs["gts"] = None
self._dump_generations(**dump_kwargs)

Fixes #492

Files Changed

  • agentlightning/verl/trainer.pyAgentLightningTrainer._train_step
  • contrib/agentlightning/contrib/algorithm/env_verl/trainer.pyEnvAgentLightningTrainer._train_step

Test plan

  • Run training with rollout_data_dir configured on verl 0.5.0 — no TypeError
  • Run training with rollout_data_dir configured on verl 0.6.0 — no TypeError, gts=None passed
  • Run training without rollout_data_dir — no behavior change

The `RayPPOTrainer._dump_generations()` method requires a `gts` (ground
truths) positional argument, but both `AgentLightningTrainer._train_step`
and `EnvAgentLightningTrainer._train_step` omit it, causing a TypeError
at runtime when `rollout_data_dir` is configured.

Pass `gts=None` since ground truth is not available in agent mode
training. Also remove a leftover `print(batch.batch.keys())` debug
statement from both call sites.

Fixes microsoft#492
Copilot AI review requested due to automatic review settings May 21, 2026 22:02

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR aligns rollout generation dumping behavior across the two VERL trainer implementations by removing a stray debug print and passing an explicit gts argument into _dump_generations.

Changes:

  • Removed print(batch.batch.keys()) debug output during rollout dumping.
  • Added gts=None to _dump_generations(...) calls in both trainer implementations.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
contrib/agentlightning/contrib/algorithm/env_verl/trainer.py Removes debug print and adds gts=None when dumping generations.
agentlightning/verl/trainer.py Mirrors the same rollout dumping change for consistency.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread agentlightning/verl/trainer.py Outdated
Comment thread contrib/agentlightning/contrib/algorithm/env_verl/trainer.py Outdated
@Alexi5000

Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree

verl >=0.6.0 added a required `gts` (ground truths) parameter to
`RayPPOTrainer._dump_generations()`, but the CI-pinned verl 0.5.0 has no
such parameter. Passing `gts=None` unconditionally raised a TypeError on
0.5.0 (unexpected keyword argument 'gts').

Feature-detect the `_dump_generations` signature with `inspect` and only
pass `gts=None` when the parameter exists, so the fix works on both verl
0.5.0 and 0.6.0. Ground truth is unavailable in agent-mode training.

Applies to both AgentLightningTrainer._train_step and
EnvAgentLightningTrainer._train_step.
@Alexi5000

Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RayPPOTrainer._dump_generations() missing gts argument in _train_step call

2 participants