Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762#1567

Open
Ankur-singh wants to merge 1 commit into main from glm-update

Conversation

Collaborator

@Ankur-singh Ankur-singh commented May 26, 2026

Bumps the SGLang container image for the four GLM-5 B200 recipes (glm5-fp8-b200-sglang, glm5-fp8-b200-sglang-mtp, glm5-fp4-b200-sglang, glm5-fp4-b200-sglang-mtp) from lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762. Models and TP/EP search spaces are unchanged.

Mirrors #1561 (xinli-sw:glm-update) re-based on current main.


Note

Low Risk
Config-only container pin updates for benchmark recipes; no application logic, auth, or data handling changes.

Overview
Updates the SGLang container image for four GLM-5 B200 benchmark recipes in nvidia-master.yaml: glm5-fp8-b200-sglang, glm5-fp8-b200-sglang-mtp, glm5-fp4-b200-sglang, and glm5-fp4-b200-sglang-mtp. Each switches from lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762. Models, runners, and TP/EP search spaces are unchanged.

Adds matching perf-changelog entries for those four config keys (noting the image bump; changelog text references v0.5.11-cu130 as the prior baseline).

Reviewed by Cursor Bugbot for commit d06b958. Bugbot is set up for automated code reviews on this repo.

@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single-node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you!

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes, and simply re-running the failed jobs will fix them. If failed jobs are re-run, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from their respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Contributor

@claude claude Bot left a comment

LGTM — straightforward container image bump for four GLM-5 B200 SGLang recipes; the changelog stale-baseline/pr-link issue flagged inline is a documentation nit and doesn't affect runtime behavior.

Extended reasoning...

Overview

This PR bumps the SGLang container image for four GLM-5 B200 recipes in .github/configs/nvidia-master.yaml from lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762, and adds matching entries to perf-changelog.yaml. Models, runners, and TP/EP search spaces are unchanged — only the image pin moves. Total change is 8 modified lines plus 24 added changelog lines across 2 files.
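
For illustration, the change to each recipe amounts to a one-line image swap. The fragment below is a sketch only: using the recipe name as a top-level key and omitting the sibling keys are assumptions about the layout of nvidia-master.yaml, which this page does not show. Only the two image values come from the PR.

```yaml
# Hypothetical layout; only the `image` values are taken from the PR.
glm5-fp8-b200-sglang:
  # was: image: lmsysorg/sglang:v0.5.12-cu130
  image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
  # model, runner, and TP/EP search-space keys are unchanged and omitted here
```

The same swap would apply to the other three recipes (glm5-fp8-b200-sglang-mtp, glm5-fp4-b200-sglang, glm5-fp4-b200-sglang-mtp).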

Security risks

None. These are config-only changes to benchmark recipe definitions. No application logic, auth surface, crypto, permissions, or data-handling paths are touched. The image is pinned to a specific dated nightly tag (nightly-dev-cu13-20260523-c112f762), which is consistent with other recently merged image-bump PRs in this repo (e.g. #1555, #1516, #1420, #1563).

Level of scrutiny

Low. This is a mechanical config tweak following an established pattern — the recent commit log shows multiple near-identical image-bump PRs landing routinely. The diff is small, self-contained, and reversible via a single revert (as #1563 demonstrated for the sibling glm5-fp4-b300 PR). It does not touch CODEOWNER-protected logic.

Other factors

The inline bug comment flags two documentation-only issues in the new changelog entries: the description says 'from v0.5.11-cu130' when the actual prior pin was v0.5.12-cu130, and the pr-link points to the abandoned precursor PR #1561 rather than this one (#1567). Both are stale-from-rebase artifacts and worth fixing for changelog traceability, but they have zero runtime impact and don't justify blocking the image bump. The Cursor bot summary already acknowledged the v0.5.11 reference, so the author has visibility into it.

No prior reviews from me on this PR; no outstanding human reviewer comments to address.

Comment thread perf-changelog.yaml
Comment on lines +3133 to +3155
- config-keys:
    - glm5-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

- config-keys:
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

- config-keys:
    - glm5-fp8-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

- config-keys:
    - glm5-fp8-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

🟡 The four new perf-changelog entries (lines 3133-3155) carry stale info from the abandoned precursor PR #1561: (1) the description says 'from v0.5.11-cu130' but the diff at nvidia-master.yaml lines 2210/2229/2309/2330 shows the prior image was v0.5.12-cu130 (off by one patch release), and (2) the pr-link points to #1561 (the precursor) rather than this PR (#1567), which actually lands the change. Both are documentation-only nits, but should be corrected before merge: update the baseline to v0.5.12-cu130 and the pr-link to #1567 so future readers can trace the actual delta and merge commit.

Extended reasoning...

What the bug is

This PR adds four new entries to perf-changelog.yaml (lines 3133-3155), one for each glm5 b200 sglang recipe whose image is being bumped. Each entry contains two pieces of stale information copied from the abandoned precursor PR #1561:

  1. Wrong baseline version in description. All four entries read "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". But the actual previous image — visible in the - lines of the diff at .github/configs/nvidia-master.yaml lines 2210, 2229, 2309, 2330 — was lmsysorg/sglang:v0.5.12-cu130, not v0.5.11. The PR description itself acknowledges this: it says the bump is from lmsysorg/sglang:v0.5.12-cu130, and the Cursor-bot summary embedded in the PR body explicitly calls out that "changelog text references v0.5.11-cu130 as the prior baseline".

  2. Wrong pr-link. All four entries set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. But per the PR description, this PR "Mirrors #1561 (xinli-sw:glm-update) re-based on current main", so #1561 ("Update glm-5 container to use SGLang latest") is the abandoned precursor and #1567 ("Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762") is the actual PR that will land. The convention elsewhere in perf-changelog.yaml is that pr-link points to the PR that actually introduces the change (e.g. the immediately preceding entries at lines 3107, 3115, 3125, 3131 link to #1555 ("[Klaud Cold] qwen3.5-fp8-mi355x-atom-mtp: enable --use-chat-template"), #1516 ("[NV] update Minimax2.5 fp8 h100 vllm"), etc., matching the merge commits in the recent git log).

Why these are both stale-from-rebase artifacts

Git log confirms commit 8e0f658 (PR #1447) already bumped these four recipes from v0.5.11 to v0.5.12 prior to this PR. So the changelog text "from v0.5.11" was accurate at the time #1561 was first authored, but became stale once #1561 was rebased onto current main (where v0.5.12 was already in place) and resubmitted as #1567. The pr-link similarly carries the original PR number, not the rebase-mirror PR number.

Step-by-step proof

  1. Open the PR diff for .github/configs/nvidia-master.yaml. At line 2210 the removed line is - image: lmsysorg/sglang:v0.5.12-cu130 (same at 2229, 2309, 2330). So the actual prior image is v0.5.12-cu130.
  2. Open the PR diff for perf-changelog.yaml. Lines 3136, 3142, 3148, 3154 all say "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". Compare to step 1: v0.5.11 ≠ v0.5.12.
  3. Lines 3137, 3143, 3149, 3155 all set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. The PR being reviewed is #1567 ("Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762"), which per its own description "Mirrors #1561 ... re-based on current main".
  4. Scanning the immediately preceding changelog entries (lines 3107, 3115, 3125, 3131), pr-links are 1555, 1516, etc., each matching the PR that actually introduced the change (verifiable via git log against commits d4948f9 and 298d8f9).

Impact

Documentation-only. No runtime effect. The cost is purely traceability: anyone reading perf-changelog.yaml in the future to understand the v0.5.12→nightly delta will (a) see the wrong starting version, and (b) follow the pr-link into a closed, abandoned PR rather than the merged commit.

Fix

In the four new entries in perf-changelog.yaml, change:

  • description: v0.5.11-cu130 → v0.5.12-cu130
  • pr-link: /pull/1561 → /pull/1567
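
Applied to the first of the four entries, the corrected changelog block would look roughly like this. It is a sketch: the indentation is assumed, since the diff view above does not preserve it, and the other three entries would differ only in their config key.

```yaml
# Corrected entry: baseline is v0.5.12-cu130 and pr-link targets this PR (#1567).
- config-keys:
    - glm5-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.12-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1567
```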
