issue/1211 - Handle the partial_rotary_factor in nn::RoPE module.#1212

Open
pengcheng888 wants to merge 1 commit into main from issue/1211

@pengcheng888

Collaborator

After the change to the RoPE forward, the module passes its standalone test:

Screenshot from 2026-06-08 13-41-54
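
For readers unfamiliar with partial_rotary_factor, here is a minimal Python sketch of what a partial rotary forward typically does. It is not the nn::RoPE code from this PR. The function name `apply_partial_rope`, the default `theta`, and the half-split pairing of dimensions are all assumptions made for illustration; some models pair adjacent dimensions instead.

```python
import math


def apply_partial_rope(x, pos, partial_rotary_factor=1.0, theta=10000.0):
    """Rotate only the leading part of one head vector (hypothetical helper).

    x: list of floats of length head_dim (one token, one head).
    pos: integer token position.
    """
    head_dim = len(x)
    # Only the first rotary_dim entries are rotated; the rest pass through.
    rotary_dim = int(head_dim * partial_rotary_factor)
    rotary_dim -= rotary_dim % 2  # rotation acts on pairs, so keep it even
    half = rotary_dim // 2

    out = list(x)
    for i in range(half):
        # Frequencies are computed over rotary_dim, not head_dim.
        freq = theta ** (-2.0 * i / rotary_dim)
        angle = pos * freq
        c, s = math.cos(angle), math.sin(angle)
        # Assumed half-split pairing: entry i is paired with entry i + half.
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out


if __name__ == "__main__":
    head = [1.0] * 8
    rotated = apply_partial_rope(head, pos=3, partial_rotary_factor=0.5)
    print(rotated[:4])  # rotated entries
    print(rotated[4:])  # untouched entries, identical to the input
```

With `partial_rotary_factor=1.0` the sketch reduces to an ordinary full-dimension rotation, which is why a model such as tiny-llama should behave the same before and after a change of this kind.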

9g8b, two GPUs:
python examples/test_infer.py --device nvidia --prompt "介绍下你自己" --model=$MODEL --tp=2 --enable-paged-attn --attn paged-attn --max-new-tokens 64

Screenshot from 2026-06-08 13-44-16

tiny-llama, single GPU and two GPUs, static and paged attention:

python examples/test_infer.py --device nvidia --prompt "介绍下你自己" --model=$MODEL --tp=1 --enable-paged-attn --attn paged-attn --max-new-tokens 64

Screenshot from 2026-06-08 13-45-25

python examples/test_infer.py --device nvidia --prompt "介绍下你自己" --model=$MODEL --tp=1 --max-new-tokens 64

Screenshot from 2026-06-08 13-46-12

python examples/test_infer.py --device nvidia --prompt "介绍下你自己" --model=$MODEL --tp=2 --max-new-tokens 64

Screenshot from 2026-06-08 13-46-36

GLM-4, two GPUs:

python examples/test_infer.py --device nvidia --prompt "介绍下你自己" --model=$MODEL --tp=2 --enable-paged-attn --attn paged-attn --max-new-tokens 64

Screenshot from 2026-06-08 13-47-25

Development

Successfully merging this pull request may close these issues.

[DEV] Handle the partial_rotary_factor logic in the nn::RoPE module
