clean code #8044

Open
zhoutianzi666 wants to merge 4 commits into PaddlePaddle:develop from zhoutianzi666:clean_dsa
@zhoutianzi666

Collaborator

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

PaddlePaddle-bot

This comment was marked as outdated.

@codecov-commenter

codecov-commenter commented Jun 15, 2026

Codecov Report

❌ Patch coverage is 14.54545% with 47 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@02a0042). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| fastdeploy/model_executor/models/deepseek_v3.py | 20.00% | 24 Missing ⚠️ |
| ...executor/layers/attention/dsa_attention_backend.py | 8.00% | 23 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #8044   +/-   ##
==========================================
  Coverage           ?   67.39%           
==========================================
  Files              ?      475           
  Lines              ?    66708           
  Branches           ?    10288           
==========================================
  Hits               ?    44959           
  Misses             ?    18878           
  Partials           ?     2871           
| Flag | Coverage Δ |
|---|---|
| GPU | 77.35% <14.54%> (?) |
| XPU | 6.97% <0.00%> (?) |


@PaddlePaddle-bot PaddlePaddle-bot left a comment

🤖 Paddle-CI-Agent | pr_review | 2026-06-15 17:00:38

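📋 Review Summary

PR overview: Refactors the KV construction and static forward path of DSA attention, adds a SWA indexer helper that is not yet enabled, and deletes the original DSA backend unit test file.
Scope of changes: fastdeploy/model_executor/layers/attention/, fastdeploy/model_executor/models/, tests/layers/
Impact tags: [Models] [OP]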

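Issues

| Level | File | Summary |
|---|---|---|
| 🟡 Suggestion | fastdeploy/model_executor/layers/attention/dsa_attention_backend.py:344 | The new forward_static() takes over the core DSA forward path, but the corresponding forward/mixed branch tests were deleted wholesale with no replacement coverage |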

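Status of Previous Findings

| Finding | Issue | Status |
|---|---|---|
| F1 | This fallback branch is currently guarded by if False, but the index it uses to fetch the latent cache does not match the DSA cache layout. | ⚠️ Still present |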

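📝 PR Convention Check

The title is missing an official tag, the PR description still contains the template placeholders, and the Checklist has not been updated for this change. The text below can be used as a direct replacement.

Suggested title (can be copied as-is):

  • [OP] Refactor DSA attention KV construction
Suggested PR description (can be copied as-is)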
## Motivation
Refactor the KV input construction and the backend static call path of DSA attention, reducing the duplicated KV concatenation logic on the DeepSeekV3 DSA model side.

## Modifications
- Add `forward_static()` to `DSAAttentionBackend`; the backend now writes the DSA KV cache, concatenates `compressed_kv` and `k_pe`, and handles the flash_mla calls for prefill/decode.
- In `DeepseekV32DSAAttention.forward()`, remove the model-side `kv` concatenation and the `k` argument; pass only `compressed_kv`, `k_pe`, and `indexer_top_k`.
- Add a SWA indexer top-k Triton helper and a DSA fallback branch, both currently disabled.

## Usage or Command
N/A

## Accuracy Tests
N/A

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [x] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

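Overall Assessment

No new blocking logic errors were found in the main-path refactor itself. However, the new forward_static() covers the core DSA prefill/decode/mixed behavior, so minimal regression protection needs to be added back after the original tests were deleted. The if False branch from previous finding F1 is still present in the current diff and still uses forward_meta.caches[self.layer_id].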

return res

@staticmethod
def forward_static(

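🟡 Suggestion: The new forward_static() takes over the DSA cache write, prefill, decode, and mixed merge logic, but this PR also deletes tests/layers/test_dsa_attention_backend.py, which covered the forward_mixed branch, and adds no replacement tests.

The remaining tests/layers/test_dsa_attention_kv_cache.py only covers cache shape/create_host_kv_cache, so it cannot guard the output shape, the cache write, or the indexer_topk argument passing here. Consider keeping or rewriting a lightweight unit test that covers at least three branches (prefill-only, decode-only, and prefill+decode merge) and mocks flash_mla/dsk_attn_write_cache to verify the call arguments after compressed_kv + k_pe is built inside the backend.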
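As a rough illustration of the test shape suggested above, the sketch below exercises the three branches with mocked kernels. Everything in it is hypothetical: `forward_static_sketch`, its parameters, and the injected `write_cache`/`prefill_attn`/`decode_attn` callables are stand-ins invented for this example, not the FastDeploy API, and tokens are plain Python lists rather than tensors. A real test would patch the actual kernels used by the backend instead.

```python
from unittest.mock import MagicMock


def forward_static_sketch(compressed_kv, k_pe, num_prefill_tokens,
                          write_cache, prefill_attn, decode_attn):
    """Hypothetical stand-in for a backend-side static forward path.

    Not the FastDeploy implementation: the three kernels are injected
    callables so that a test can replace them with mocks.
    """
    # Build each KV row inside the "backend": compressed_kv followed by k_pe.
    kv = [c + p for c, p in zip(compressed_kv, k_pe)]
    write_cache(kv)

    outputs = []
    if num_prefill_tokens > 0:
        outputs.extend(prefill_attn(kv[:num_prefill_tokens]))
    if num_prefill_tokens < len(kv):
        outputs.extend(decode_attn(kv[num_prefill_tokens:]))
    return outputs


def run_case(num_tokens, num_prefill_tokens):
    """Run the sketch with mocked kernels and return (outputs, mocks)."""
    compressed_kv = [[float(i)] * 2 for i in range(num_tokens)]
    k_pe = [[-float(i)] for i in range(num_tokens)]
    mocks = {
        "write_cache": MagicMock(),
        "prefill_attn": MagicMock(
            side_effect=lambda rows: [("prefill", len(r)) for r in rows]),
        "decode_attn": MagicMock(
            side_effect=lambda rows: [("decode", len(r)) for r in rows]),
    }
    outputs = forward_static_sketch(
        compressed_kv, k_pe, num_prefill_tokens, **mocks)
    return outputs, mocks


# Prefill-only: the decode kernel must not be called.
out, m = run_case(num_tokens=3, num_prefill_tokens=3)
assert len(out) == 3 and not m["decode_attn"].called

# Decode-only: the prefill kernel must not be called.
out, m = run_case(num_tokens=3, num_prefill_tokens=0)
assert len(out) == 3 and not m["prefill_attn"].called

# Mixed: prefill outputs come first, then decode outputs.
out, m = run_case(num_tokens=5, num_prefill_tokens=2)
assert [tag for tag, _ in out] == ["prefill"] * 2 + ["decode"] * 3
m["write_cache"].assert_called_once()
# Each cached row is compressed_kv (width 2) followed by k_pe (width 1).
assert m["write_cache"].call_args.args[0][1] == [1.0, 1.0, -1.0]
```

Injecting the kernels keeps the example free of GPU dependencies; against the real backend the same assertions could be made with `unittest.mock.patch` on whichever symbols the module actually imports.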

@PaddlePaddle-bot

PaddlePaddle-bot commented Jun 15, 2026

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-06-16 03:51:20 UTC+08:00

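The CI report is generated from the following code (refreshed every 30 minutes):
PR commit: 5a4d660 | Merge base: 02a0042 (branch: develop)

1 Required jobs: 9/10 passed

| Total runs (reruns) | Total jobs | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Pending | Skipped |
|---|---|---|---|---|---|---|
| 42(0) | 42 | 37 | 5 | 0 | 0 | 0 |

| Job | Error type | Confidence | Log |
|---|---|---|---|
| Run FastDeploy Unit Tests and Coverage / run_tests_with_coverage | PR issue | | Job |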

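2 Failure details

🔴 Run FastDeploy Unit Tests and Coverage / run_tests_with_coverage: PR issue (confidence: high)

Error type: PR issue | Confidence: high
Analyzer: ci_analyze_unittest_fastdeploy
Failed case: the coverage gate failed; the unit tests themselves passed

| Case | Error summary |
|---|---|
| diff-cover python_coverage_all.xml | PR diff coverage is 14%, below the 80% threshold |

Key log: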

Failure. Coverage is below 80%.
fastdeploy/model_executor/layers/attention/dsa_attention_backend.py (8.0%): Missing lines 338,341,354-359,375-376,380-382,384,387,395-396,399,406-408,410,431
fastdeploy/model_executor/models/deepseek_v3.py (20.0%): Missing lines 92,94,96-98,100,102-105,107,109-110,112-113,115-117,128-129,131-132,134,610
GPU Patch Coverage Details: total_num_lines=55, total_num_violations=47, total_percent_covered=14, num_changed_lines=170
TEST_EXIT_CODE: 0
COVERAGE_EXIT_CODE: 9
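  • Root cause summary: new/modified code lacks coverage
    The PR adds DSAAttentionBackend.forward_static, the get_swa_indexer_top_k Triton kernel, and the DeepSeek V3 call path, and deletes tests/layers/test_dsa_attention_backend.py. As a result, only 8 of the 55 lines counted by diff coverage are covered. The unit test step passed, but the coverage check step returned exit code 9 under --fail-under=80.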
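The figures in the log can be cross-checked with a few lines of Python. The sketch below counts the entries in the two "Missing lines" lists and recomputes the patch coverage from total_num_lines=55. The helper name `count_lines` is made up for this example, and the integer floor is an assumption chosen to match the logged total_percent_covered=14; it is not a statement about how diff-cover rounds internally.

```python
def count_lines(spec):
    """Count lines in a 'Missing lines' spec such as '338,341,354-359'."""
    total = 0
    for part in spec.split(","):
        first, _, last = part.partition("-")
        # A bare number has no upper bound, so it counts as one line.
        total += int(last or first) - int(first) + 1
    return total


# Copied from the "Missing lines" entries in the key log.
missing = {
    "dsa_attention_backend.py":
        "338,341,354-359,375-376,380-382,384,387,395-396,399,406-408,410,431",
    "deepseek_v3.py":
        "92,94,96-98,100,102-105,107,109-110,112-113,115-117,"
        "128-129,131-132,134,610",
}

total_lines = 55  # total_num_lines from the log
violations = sum(count_lines(spec) for spec in missing.values())
covered = total_lines - violations
percent = covered * 100 // total_lines  # assumption: integer floor
gate_passes = percent >= 80  # threshold from --fail-under=80

print(violations, covered, percent, gate_passes)  # 47 8 14 False
```

The per-file counts (23 and 24) also agree with the "Missing" column in the Codecov report, and 8/55 is the 14.54545% patch coverage quoted there.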

Suggested fixes:

  1. Add or restore unit tests for the new forward_mixed/forward_static logic in fastdeploy/model_executor/layers/attention/dsa_attention_backend.py, covering the prefill, decode, head padding, and cache shape branches.
  2. Add tests for get_swa_indexer_top_k and the new call path in fastdeploy/model_executor/models/deepseek_v3.py; if the if False debug branch is not meant to land on the main line, consider removing it to avoid an unnecessary coverage burden.

Related changes: fastdeploy/model_executor/layers/attention/dsa_attention_backend.py:338, fastdeploy/model_executor/models/deepseek_v3.py:92, fastdeploy/model_executor/models/deepseek_v3.py:610, tests/layers/test_dsa_attention_backend.py (deleted)
