SDPA tests: select attn output by shape by JulianCloudNTH · Pull Request #20285 · pytorch/executorch

JulianCloudNTH · 2026-06-15T19:42:35Z

Summary:
Follow-up to D108625761 (#20283), which fixed `test_sdpa_config`'s ambiguous attention-output selection; this handles the remaining edge cases in the sibling SDPA test functions.

`sdpa_with_kv_cache` returns `[k_cache, v_cache, attn_output]`, where the attention output is `[1, S, Hq, D]` and each KV cache is `[1, Cmax, Hkv, D]`. Selecting the attention output by flat element count is ambiguous whenever `S*Hq == Cmax*Hkv`, because all three tensors then share a `numel`. That collision is the deterministic `llama1b_prefill` failure that D108625761 fixed in `test_sdpa_config`.

The three other SDPA test functions (`test_sdpa_replay`, `test_sdpa_dynamic_decode`, and `test_sdpa_incache_decode`) still matched by `numel`. No currently configured sequence triggers the collision, but they carry the same latent ambiguity. This change applies the same shape-based selection: match the attention output as the 4-D tensor `[1, S, Hq, D]` and (for replay and dynamic decode) classify the two caches as `[1, Cmax, Hkv, D]`, keeping the per-step k-vs-v identification by content unchanged. Test-only; no kernel, runtime, or export change.
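The ambiguity described above can be illustrated with a minimal Python sketch. It is not the test code from this PR: the helper name `select_attn_output` and the dimension values are hypothetical, chosen only so that `S*Hq == Cmax*Hkv`, and shapes are modeled as plain tuples rather than real tensors.

```python
from math import prod


def select_attn_output(shapes, S, Hq, D):
    """Return the index of the single output shaped [1, S, Hq, D].

    Hypothetical helper for illustration; raises if the shape match is
    not unique (for example when S == Cmax and Hq == Hkv).
    """
    want = (1, S, Hq, D)
    matches = [i for i, shape in enumerate(shapes) if tuple(shape) == want]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one tensor shaped {want}, found {len(matches)}"
        )
    return matches[0]


# Hypothetical dimensions chosen so that S*Hq == Cmax*Hkv (128*32 == 512*8).
S, Hq, Cmax, Hkv, D = 128, 32, 512, 8, 64

# Output order follows the description: [k_cache, v_cache, attn_output].
shapes = [(1, Cmax, Hkv, D), (1, Cmax, Hkv, D), (1, S, Hq, D)]

# All three element counts are equal, so matching by numel cannot
# distinguish the attention output from the caches.
numels = [prod(shape) for shape in shapes]

# Matching by full shape still identifies the attention output.
attn_index = select_attn_output(shapes, S, Hq, D)
```

Shape matching is itself ambiguous only in the narrower case where the attention output and the caches have identical shapes; the sketch raises in that case rather than guessing.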

Authored with Claude Code.

Differential Revision: D108650388

pytorch-bot · 2026-06-15T19:42:39Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20285

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 4 Pending

As of commit 2762587 with merge base 99ca02f:

NEW FAILURES - The following jobs have failed:

Cadence Build & Test / hifi-build / hifi4 (gh)
Input required and not supplied: aws-region
Cadence Build & Test / vision-build / vision (gh)
Input required and not supplied: aws-region
pull / test-sqnr-static-llm-qnn-linux (smollm2_135m) / linux-job (gh)
RuntimeError: Command docker exec -t 49c2b4c0a143ed88eb7d148c5cf96096503462415ebb51b89cb098627ad57926 /exec failed with exit code 92

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2026-06-15T19:42:45Z

@JulianCloudNTH has exported this pull request. If you are a Meta employee, you can view the originating Diff in D108650388.

github-actions · 2026-06-15T19:43:30Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

meta-cla Bot added the CLA Signed label Jun 15, 2026

meta-codesync Bot added the meta-exported label Jun 15, 2026

JulianCloudNTH had a problem deploying to cadence with GitHub Actions on June 15, 2026 19:42 (Failure)

JulianCloudNTH requested a review from shoumikhin June 15, 2026 19:43

JulianCloudNTH wants to merge 1 commit into pytorch:main from JulianCloudNTH:export-D108650388
