[Iluvatar] Support CINN for PaddleOCR-VL by converting max_seqlens to Tensor inputs by wuyujiji · Pull Request #7997 · PaddlePaddle/FastDeploy

wuyujiji · 2026-06-04T03:33:13Z

Motivation

Enable CINN support for PaddleOCR-VL on Iluvatar (天数) hardware.

Modifications

N/A

Usage or Command

N/A

Accuracy Tests

N/A

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

codecov-commenter · 2026-06-04T04:32:14Z

Codecov Report

❌ Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@acd5638).

Files with missing lines	Patch %	Lines
...eploy/model_executor/ops/iluvatar/attention_ops.py	0.00%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #7997   +/-   ##
==========================================
  Coverage           ?   67.79%           
==========================================
  Files              ?      475           
  Lines              ?    66613           
  Branches           ?    10261           
==========================================
  Hits               ?    45163           
  Misses             ?    18566           
  Partials           ?     2884

Flag	Coverage Δ
GPU	`77.83% <ø> (?)`
XPU	`6.98% <0.00%> (?)`

Flags with carried forward coverage won't be shown.

PaddlePaddle-bot · 2026-06-06T04:10:13Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-06-14 12:03:25

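This CI report was generated from the following code (updated every 30 minutes):
PR commit: 08634d2 | Merge base: acd5638 (branch: develop)

1 Required tasks: 10/10 passed

Total runs (reruns)	Total tasks	✅ Passed	❌ Failed	⏳ Running	⏸️ Pending	Skipped
42(0)	42	40	2	0	0	0

Task	Error type	Confidence	Log
None	None	None	None

2 Failure details

None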

PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-06-12 09:58:36

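📋 Review Summary

PR overview: Changes `max_seqlens_q/k` of the Iluvatar `cuinfer_flash_attn_unpadded` op from scalar attrs to Tensor inputs, and updates the PaddleOCR-VL CINN documentation and CI command to match.
Scope of changes: Iluvatar custom op / PaddleOCR-VL attention call / Iluvatar documentation and CI script
Impact tags: [OP] [Iluvatar] [Graph Optimization] [Docs] [CI]

Issues

No new blocking issues were found. PR convention issues are reported in the section below and are not repeated here.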

Status of Previous Findings

Finding	Issue	Status
F1	`max_seqlens_q/k` Tensor lacks a `numel` check before it is dereferenced	⚠️ Still present
F2	`paddle.to_tensor` is called in the attention forward hot path	⚠️ Still present
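
The two open findings can be illustrated with a small sketch. This is not the PR's code: `FakeTensor`, `ScalarTensorCache`, and `checked_scalar` are hypothetical names, and the factory argument merely stands in for a call such as `paddle.to_tensor`. The sketch assumes the goal is (F2) to build each max-sequence-length tensor once per distinct value instead of on every attention forward call, and (F1) to confirm a tensor holds exactly one element before reading it as a scalar. Whether such caching interacts cleanly with CINN graph capture is not verified here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a framework tensor (e.g. a CPU int32 tensor)."""

    data: List[int]

    def numel(self) -> int:
        return len(self.data)


class ScalarTensorCache(Generic[T]):
    """Memoize one tensor per distinct integer value (the F2 concern)."""

    def __init__(self, factory: Callable[[int], T]) -> None:
        self._factory = factory  # assumed to wrap something like paddle.to_tensor
        self._cache: Dict[int, T] = {}
        self.builds = 0  # how many times the factory actually ran

    def get(self, value: int) -> T:
        tensor = self._cache.get(value)
        if tensor is None:
            tensor = self._factory(value)
            self._cache[value] = tensor
            self.builds += 1
        return tensor


def checked_scalar(tensor: FakeTensor, name: str) -> int:
    """Read a one-element tensor as an int; reject any other size (the F1 concern)."""
    if tensor.numel() != 1:
        raise ValueError(f"{name} must hold exactly 1 element, got {tensor.numel()}")
    return tensor.data[0]


# Repeated forward calls with the same max sequence length reuse one tensor.
cache = ScalarTensorCache(lambda v: FakeTensor([v]))
for _ in range(3):
    max_seqlen_q = cache.get(1024)
assert cache.builds == 1
assert checked_scalar(max_seqlen_q, "max_seqlens_q") == 1024
```

The cache in this sketch is unbounded, so a real implementation would likely want to cap it or key it on a small set of expected sequence lengths.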

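📝 PR Convention Check

The Modifications, Usage or Command, and Accuracy Tests sections are all filled in with "Pass" and provide no substantive content. Please complete them according to the template.

Suggested title (can be copied directly):

[Iluvatar] Support CINN for PaddleOCR-VL by converting max_seqlens to Tensor inputs

Suggested PR description (can be copied directly)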

## Motivation
On the Iluvatar (天数智芯) platform, the `cuinfer_flash_attn_unpadded` op previously registered `max_seqlens_q/k` as scalar attrs, which prevented CINN from handling dynamic sequence lengths. This PR changes them to Tensor inputs so that PaddleOCR-VL can enable CINN (`graph_opt_level: 2`) on Iluvatar hardware.

## Modifications
- `custom_ops/iluvatar_ops/flash_attn_unpadded.cu`:
  - `FlashAttnUnpaddedKernel` / `FlashAttnUnpadded` function signatures: `int max_seqlens_q/k` → `const paddle::Tensor& max_seqlens_q_/k_`
  - `PD_BUILD_STATIC_OP`: move `max_seqlens_q/k` from `.Attrs` to `.Inputs`
  - `FlashAttnUnpaddedInferShape` / `FlashAttnUnpaddedInferDtype`: add the corresponding parameters
- `custom_ops/setup_ops.py`: append `-std=c++17` to the Iluvatar compile flags
- `docs/`: update the container name, mount paths, and launch command arguments (`max-num-seqs: 240`, `gpu-memory-utilization: 0.7`, `graph_opt_level: 2`)
- `scripts/run_ci_iluvatar.sh`: update `graph-optimization-config` in the CI script to match

## Usage or Command
```bash
python3 -m fastdeploy.entrypoints.openai.api_server \
    --model /data1/fastdeploy/PaddleOCR-VL \
    --max-model-len 16384 \
    --max-num-batched-tokens 16384 \
    --max-num-seqs 240 \
    --block-size 16 \
    --workers 2 \
    --gpu-memory-utilization 0.7 \
    --graph-optimization-config '{"graph_opt_level":2, "use_cudagraph": true}'
```

## Accuracy Tests
On Iluvatar hardware, PaddleOCR-VL inference accuracy is consistent with the accuracy before CINN was enabled (or attach concrete metrics).

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [x] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [ ] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

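Overall Assessment

This round traced the call chain from the PaddleOCR-VL vision encoder to the Iluvatar custom op in risk-priority order. The source of the max_seqlen Tensor is consistent with the CPU Tensor constraint, and no new issues suitable for inline comments were confirmed. Two earlier suggestions remain unaddressed; the boundary check and the hot-path Tensor construction should be handled together in a follow-up.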

wuyujiji had a problem deploying to Metax_ci June 4, 2026 03:33 — with GitHub Actions Failure

wuyujiji force-pushed the yuzhe_dev branch from c97cd8c to a9f1881 Compare June 4, 2026 06:31

wuyujiji had a problem deploying to Metax_ci June 4, 2026 06:31 — with GitHub Actions Failure

wuyujiji force-pushed the yuzhe_dev branch from a9f1881 to 9477f85 Compare June 11, 2026 03:43

wuyujiji had a problem deploying to Metax_ci June 11, 2026 03:43 — with GitHub Actions Failure

wuyujiji force-pushed the yuzhe_dev branch from 9477f85 to e9ec41a Compare June 11, 2026 09:33

wuyujiji had a problem deploying to Metax_ci June 11, 2026 09:33 — with GitHub Actions Failure

wuyujiji changed the title ~~[Iluvatar] Support CINN for paddleocr-vl~~ [Iluvatar] Support CINN for PaddleOCR-VL by converting max_seqlens to Tensor inputs Jun 11, 2026

Commit 08634d2: [Iluvatar] Support CINN for PaddleOCR-VL by converting max_seqlens to Tensor inputs

wuyujiji force-pushed the yuzhe_dev branch from e9ec41a to 08634d2 Compare June 12, 2026 01:50

wuyujiji had a problem deploying to Metax_ci June 12, 2026 01:50 — with GitHub Actions Failure

PaddlePaddle-bot reviewed Jun 12, 2026

wuyujiji wants to merge 1 commit into PaddlePaddle:develop from wuyujiji:yuzhe_dev

3 participants
