Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762 #1567
Ankur-singh wants to merge 1 commit into
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you.
PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow
As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack.
LGTM — straightforward container image bump for four GLM-5 B200 SGLang recipes; the changelog stale-baseline/pr-link issue flagged inline is a documentation nit and doesn't affect runtime behavior.
Extended reasoning...
Overview
This PR bumps the SGLang container image for four GLM-5 B200 recipes in .github/configs/nvidia-master.yaml from lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762, and adds matching entries to perf-changelog.yaml. Models, runners, and TP/EP search spaces are unchanged — only the image pin moves. Total change is 8 modified lines plus 24 added changelog lines across 2 files.
Security risks
None. These are config-only changes to benchmark recipe definitions. No application logic, auth surface, crypto, permissions, or data-handling paths are touched. The image is pinned to a specific dated nightly tag with a short hash suffix (nightly-dev-cu13-20260523-c112f762), which is consistent with other recently merged image-bump PRs in this repo (e.g. #1555, #1516, #1420, #1563).
Level of scrutiny
Low. This is a mechanical config tweak following an established pattern — the recent commit log shows multiple near-identical image-bump PRs landing routinely. The diff is small, self-contained, and reversible via a single revert (as #1563 demonstrated for the sibling glm5-fp4-b300 PR). It does not touch CODEOWNER-protected logic.
Other factors
The inline bug comment flags two documentation-only issues in the new changelog entries: the description says 'from v0.5.11-cu130' when the actual prior pin was v0.5.12-cu130, and the pr-link points to the abandoned precursor PR #1561 rather than this one (#1567). Both are stale-from-rebase artifacts and worth fixing for changelog traceability, but they have zero runtime impact and don't justify blocking the image bump. The Cursor bot summary already acknowledged the v0.5.11 reference, so the author has visibility into it.
No prior reviews from me on this PR; no outstanding human reviewer comments to address.
- config-keys:
    - glm5-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

- config-keys:
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

- config-keys:
    - glm5-fp8-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

- config-keys:
    - glm5-fp8-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
🟡 The four new perf-changelog entries (lines 3133-3155) carry stale info from the abandoned precursor PR #1561: (1) the description says 'from v0.5.11-cu130' but the diff at nvidia-master.yaml lines 2210/2229/2309/2330 shows the prior image was v0.5.12-cu130 (off by one patch version), and (2) the pr-link points to #1561 (the precursor) rather than this PR (#1567), which actually lands the change. Both are documentation-only nits but should be corrected before merge: update the baseline to v0.5.12-cu130 and the pr-link to #1567 so future readers can trace the actual delta and merge commit.
Extended reasoning...
What the bug is
This PR adds four new entries to perf-changelog.yaml (lines 3133-3155), one for each glm5 b200 sglang recipe whose image is being bumped. Each entry contains two pieces of stale information copied from the abandoned precursor PR #1561:
- Wrong baseline version in description. All four entries read "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". But the actual previous image, visible in the - lines of the diff at .github/configs/nvidia-master.yaml lines 2210, 2229, 2309, 2330, was lmsysorg/sglang:v0.5.12-cu130, not v0.5.11. The PR description itself acknowledges this: it says the bump is from lmsysorg/sglang:v0.5.12-cu130, and the Cursor-bot summary embedded in the PR body explicitly calls out that "changelog text references v0.5.11-cu130 as the prior baseline".
- Wrong pr-link. All four entries set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. But per the PR description, this PR "Mirrors #1561 (xinli-sw:glm-update) re-based on current main", so #1561 (Update glm-5 container to use SGLang latest) is the abandoned precursor and #1567 is the actual PR that will land. The convention elsewhere in perf-changelog.yaml is that pr-link points to the PR that actually introduces the change (e.g. the immediately preceding entries at lines 3107, 3115, 3125, 3131 link to #1555 ([Klaud Cold] qwen3.5-fp8-mi355x-atom-mtp: enable --use-chat-template), #1516 ([NV] update Minimax2.5 fp8 h100 vllm), etc., matching the merge commits in the recent git log).
Why these are both stale-from-rebase artifacts
Git log confirms commit 8e0f658 (PR #1447) already bumped these four recipes from v0.5.11 to v0.5.12 prior to this PR. So the changelog text "from v0.5.11" was accurate at the time #1561 was first authored, but became stale once #1561 was rebased onto current main (where v0.5.12 was already in place) and resubmitted as #1567. The pr-link similarly carries the original PR number, not the rebase-mirror PR number.
Step-by-step proof
1. Open the PR diff for .github/configs/nvidia-master.yaml. At line 2210 the removed line is "- image: lmsysorg/sglang:v0.5.12-cu130" (same at 2229, 2309, 2330). So the actual prior image is v0.5.12-cu130.
2. Open the PR diff for perf-changelog.yaml. Lines 3136, 3142, 3148, 3154 all say "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". Compare to step 1: v0.5.11 ≠ v0.5.12.
3. Lines 3137, 3143, 3149, 3155 all set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. The PR being reviewed is #1567, which per its own description "Mirrors #1561 ... re-based on current main".
4. Scanning the immediately preceding changelog entries (lines 3107, 3115, 3125, 3131), pr-links are 1555, 1516, etc., each matching the PR that actually introduced the change (verifiable via git log against commits d4948f9 and 298d8f9).
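The comparison in the steps above could be automated before merge. The following is a minimal Python sketch, not something that exists in the repo: the helper name check_entry and the dictionary layout are hypothetical, and the values are copied from the diff quoted in this review.

```python
import re

# Hypothetical helper, not part of the InferenceX repo: compares one changelog
# entry against the image line the diff actually removed and the PR number.
def check_entry(entry, removed_image, pr_number):
    problems = []
    # Baseline tag as stated in the description, e.g. "v0.5.11-cu130".
    match = re.search(r"from (\S+) to (\S+)", entry["description"])
    stated = match.group(1) if match else None
    # Tag actually removed by the diff, e.g. "v0.5.12-cu130".
    actual = removed_image.split(":", 1)[1]
    if stated != actual:
        problems.append(f"baseline: changelog says {stated}, diff removed {actual}")
    # pr-link should end in the number of the PR that lands the change.
    if not entry["pr-link"].rstrip("/").endswith(f"/pull/{pr_number}"):
        problems.append(f"pr-link: expected /pull/{pr_number}, got {entry['pr-link']}")
    return problems


entry = {
    "description": "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762",
    "pr-link": "https://github.com/SemiAnalysisAI/InferenceX/pull/1561",
}
problems = check_entry(entry, "lmsysorg/sglang:v0.5.12-cu130", 1567)
for problem in problems:
    print(problem)
```

Run against the entry as submitted, the sketch reports both issues flagged here; an entry with the corrected baseline and pr-link yields an empty list.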
Impact
Documentation-only. No runtime effect. The cost is purely traceability: anyone reading perf-changelog.yaml in the future to understand the v0.5.12→nightly delta will (a) see the wrong starting version, and (b) follow the pr-link into a closed, abandoned PR rather than the merged commit.
Fix
In the four new entries in perf-changelog.yaml, change:
- description: v0.5.11-cu130 → v0.5.12-cu130
- pr-link: /pull/1561 → /pull/1567
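For illustration, the two substitutions could be applied mechanically. This is a rough Python sketch under stated assumptions: fix_changelog is a hypothetical helper, the sample text reproduces one entry from this PR with an assumed indentation, and the rewrite is limited to entries whose description names the new nightly tag so that older entries that legitimately mention v0.5.11 or #1561 are left alone.

```python
# Hypothetical helper, not part of the repo: rewrites the stale baseline and
# the pr-link that follows it, only for entries describing this image bump.
def fix_changelog(text, old_tag, new_tag, target_tag, old_pr, new_pr):
    stale = f"from {old_tag} to {target_tag}"
    out = []
    pending_link = False  # True after a stale description until its pr-link is seen
    for line in text.splitlines():
        if stale in line:
            line = line.replace(stale, f"from {new_tag} to {target_tag}")
            pending_link = True
        elif pending_link and line.strip().startswith("pr-link:"):
            line = line.replace(f"/pull/{old_pr}", f"/pull/{new_pr}")
            pending_link = False
        out.append(line)
    result = "\n".join(out)
    # splitlines() drops the final newline; restore it if the input had one.
    return result + "\n" if text.endswith("\n") else result


sample = (
    "- config-keys:\n"
    "    - glm5-fp4-b200-sglang\n"
    "  description:\n"
    '    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"\n'
    "  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561\n"
)
fixed = fix_changelog(
    sample,
    old_tag="v0.5.11-cu130",
    new_tag="v0.5.12-cu130",
    target_tag="nightly-dev-cu13-20260523-c112f762",
    old_pr=1561,
    new_pr=1567,
)
print(fixed)
```

With only four entries, editing by hand is just as reasonable; the sketch mainly shows that the pr-link change should be scoped to the entries added by this PR.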
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26465009212
Bumps the SGLang container image for the four GLM-5 B200 recipes (glm5-fp8-b200-sglang, glm5-fp8-b200-sglang-mtp, glm5-fp4-b200-sglang, glm5-fp4-b200-sglang-mtp) from lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762. Models and TP/EP search spaces are unchanged.
Mirrors #1561 (xinli-sw:glm-update) re-based on current main.
Note
Low Risk
Config-only container pin updates for benchmark recipes; no application logic, auth, or data handling changes.
Overview
Updates the SGLang container image for four GLM-5 B200 benchmark recipes in nvidia-master.yaml: glm5-fp8-b200-sglang, glm5-fp8-b200-sglang-mtp, glm5-fp4-b200-sglang, and glm5-fp4-b200-sglang-mtp. Each switches from lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762. Models, runners, and TP/EP search spaces are unchanged.
Adds matching perf-changelog entries for those four config keys (noting the image bump; changelog text references v0.5.11-cu130 as the prior baseline).
Reviewed by Cursor Bugbot for commit d06b958. Bugbot is set up for automated code reviews on this repo.