Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762 by Ankur-singh · Pull Request #1567 · SemiAnalysisAI/InferenceX

Ankur-singh · 2026-05-26T17:42:16Z

Bumps the SGLang container image for the four GLM-5 B200 recipes (glm5-fp8-b200-sglang, glm5-fp8-b200-sglang-mtp, glm5-fp4-b200-sglang, glm5-fp4-b200-sglang-mtp) from lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762. Models and TP/EP search spaces are unchanged.

Mirrors #1561 (xinli-sw:glm-update) re-based on current main.

Note

Low Risk
Config-only container pin updates for benchmark recipes; no application logic, auth, or data handling changes.

Overview
Updates the SGLang container image for four GLM-5 B200 benchmark recipes in nvidia-master.yaml: glm5-fp8-b200-sglang, glm5-fp8-b200-sglang-mtp, glm5-fp4-b200-sglang, and glm5-fp4-b200-sglang-mtp. Each switches from lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762. Models, runners, and TP/EP search spaces are unchanged.

Adds matching perf-changelog entries for those four config keys (noting the image bump; changelog text references v0.5.11-cu130 as the prior baseline).

Reviewed by Cursor Bugbot for commit d06b958. Bugbot is set up for automated code reviews on this repo.

github-actions · 2026-05-26T17:42:24Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude

LGTM — straightforward container image bump for four GLM-5 B200 SGLang recipes; the changelog stale-baseline/pr-link issue flagged inline is a documentation nit and doesn't affect runtime behavior.

Extended reasoning...

Overview

This PR bumps the SGLang container image for four GLM-5 B200 recipes in .github/configs/nvidia-master.yaml from lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762, and adds matching entries to perf-changelog.yaml. Models, runners, and TP/EP search spaces are unchanged — only the image pin moves. Total change is 8 modified lines plus 24 added changelog lines across 2 files.

Security risks

None. These are config-only changes to benchmark recipe definitions. No application logic, auth surface, crypto, permissions, or data-handling paths are touched. The image is pinned to a specific dated nightly tag with a hash suffix (nightly-dev-cu13-20260523-c112f762), which is consistent with other recently merged image-bump PRs in this repo (e.g. #1555, #1516, #1420, #1563).

Level of scrutiny

Low. This is a mechanical config tweak following an established pattern — the recent commit log shows multiple near-identical image-bump PRs landing routinely. The diff is small, self-contained, and reversible via a single revert (as #1563 demonstrated for the sibling glm5-fp4-b300 PR). It does not touch CODEOWNER-protected logic.

Other factors

The inline bug comment flags two documentation-only issues in the new changelog entries: the description says 'from v0.5.11-cu130' when the actual prior pin was v0.5.12-cu130, and the pr-link points to the abandoned precursor PR #1561 rather than this one (#1567). Both are stale-from-rebase artifacts and worth fixing for changelog traceability, but they have zero runtime impact and don't justify blocking the image bump. The Cursor bot summary already acknowledged the v0.5.11 reference, so the author has visibility into it.

No prior reviews from me on this PR; no outstanding human reviewer comments to address.

claude · 2026-05-26T17:46:36Z

+- config-keys:
+    - glm5-fp4-b200-sglang
+  description:
+    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
+
+- config-keys:
+    - glm5-fp4-b200-sglang-mtp
+  description:
+    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
+
+- config-keys:
+    - glm5-fp8-b200-sglang
+  description:
+    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
+
+- config-keys:
+    - glm5-fp8-b200-sglang-mtp
+  description:
+    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561


🟡 The four new perf-changelog entries (lines 3133-3155) carry stale info from the abandoned precursor PR #1561: (1) the description says 'from v0.5.11-cu130' but the diff at nvidia-master.yaml lines 2210/2229/2309/2330 shows the prior image was v0.5.12-cu130 (off by one patch version), and (2) the pr-link points to #1561 (the precursor) rather than this PR (#1567), which actually lands the change. Both are documentation-only nits, but should be corrected before merge: update the baseline to v0.5.12-cu130 and the pr-link to #1567 so future readers can trace the actual delta and merge commit.

Extended reasoning...

What the bug is

This PR adds four new entries to perf-changelog.yaml (lines 3133-3155), one for each glm5 b200 sglang recipe whose image is being bumped. Each entry contains two pieces of stale information copied from the abandoned precursor PR #1561:

1. Wrong baseline version in description. All four entries read "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". But the actual previous image, visible in the - lines of the diff at .github/configs/nvidia-master.yaml lines 2210, 2229, 2309, 2330, was lmsysorg/sglang:v0.5.12-cu130, not v0.5.11. The PR description itself acknowledges this: it says the bump is from lmsysorg/sglang:v0.5.12-cu130, and the Cursor-bot summary embedded in the PR body explicitly calls out that "changelog text references v0.5.11-cu130 as the prior baseline".

2. Wrong pr-link. All four entries set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. But per the PR description, this PR "Mirrors #1561 (xinli-sw:glm-update) re-based on current main", so #1561 (Update glm-5 container to use SGLang latest) is the abandoned precursor and #1567 (Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762) is the actual PR that will land. The convention elsewhere in perf-changelog.yaml is that pr-link points to the PR that actually introduces the change (e.g. the immediately preceding entries at lines 3107, 3115, 3125, 3131 link to #1555 ([Klaud Cold] qwen3.5-fp8-mi355x-atom-mtp: enable --use-chat-template), #1516 ([NV] update Minimax2.5 fp8 h100 vllm), etc., matching the merge commits in the recent git log).

Why these are both stale-from-rebase artifacts

Git log confirms commit 8e0f658 (PR #1447) already bumped these four recipes from v0.5.11 to v0.5.12 prior to this PR. So the changelog text "from v0.5.11" was accurate at the time #1561 was first authored, but became stale once #1561 was rebased onto current main (where v0.5.12 was already in place) and resubmitted as #1567. The pr-link similarly carries the original PR number, not the rebase-mirror PR number.

Step-by-step proof

1. Open the PR diff for .github/configs/nvidia-master.yaml. At line 2210 the removed line is - image: lmsysorg/sglang:v0.5.12-cu130 (same at 2229, 2309, 2330). So the actual prior image is v0.5.12-cu130.

2. Open the PR diff for perf-changelog.yaml. Lines 3136, 3142, 3148, 3154 all say "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". Compare to step 1: v0.5.11 ≠ v0.5.12.

3. Lines 3137, 3143, 3149, 3155 all set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. The PR being reviewed is #1567 (Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762), which per its own description "Mirrors #1561 ... re-based on current main".

4. Scanning the immediately preceding changelog entries (lines 3107, 3115, 3125, 3131), pr-links are 1555, 1516, etc., each matching the PR that actually introduced the change (verifiable via git log against commits d4948f9 and 298d8f9).
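
The comparison in the steps above could be automated along these lines. This is only a sketch, not part of the repo: `find_stale_fields` is a hypothetical helper, the inputs are the removed diff line and one changelog entry as quoted in this review, and it assumes the description always has the form "image from X to Y".

```python
import re

# Removed line from the nvidia-master.yaml diff, as quoted in step 1.
REMOVED_IMAGE_LINE = "- image: lmsysorg/sglang:v0.5.12-cu130"

# One of the four new changelog entries, copied from the diff.
CHANGELOG_ENTRY = """\
- config-keys:
    - glm5-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
"""

CURRENT_PR = 1567


def find_stale_fields(removed_line: str, entry: str, current_pr: int) -> list[str]:
    """Return the mismatches found in one changelog entry (hypothetical helper)."""
    problems = []

    # The tag is whatever follows the last ':' of the image reference.
    prior_tag = removed_line.rsplit(":", 1)[1].strip()

    # Baseline claimed in the description: the token between "from" and "to".
    claimed = re.search(r"image from (\S+) to", entry)
    if claimed and claimed.group(1) != prior_tag:
        problems.append(f"baseline {claimed.group(1)} != {prior_tag}")

    # PR number at the end of the pr-link URL.
    linked = re.search(r"/pull/(\d+)", entry)
    if linked and int(linked.group(1)) != current_pr:
        problems.append(f"pr-link #{linked.group(1)} != #{current_pr}")

    return problems


for problem in find_stale_fields(REMOVED_IMAGE_LINE, CHANGELOG_ENTRY, CURRENT_PR):
    print(problem)
```

Run against the entry as submitted, the helper reports both mismatches; an entry with the corrected baseline and pr-link yields an empty list.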

Impact

Documentation-only. No runtime effect. The cost is purely traceability: anyone reading perf-changelog.yaml in the future to understand the v0.5.12→nightly delta will (a) see the wrong starting version, and (b) follow the pr-link into a closed, abandoned PR rather than the merged commit.

Fix

In the four new entries in perf-changelog.yaml, change:

description: v0.5.11-cu130 → v0.5.12-cu130

pr-link: /pull/1561 → /pull/1567
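
As a rough illustration, the two substitutions could be scripted as below. `fix_changelog_entries` is a hypothetical helper, and the sketch assumes that entries are separated by blank lines (as in the diff above) and that only entries naming the new nightly tag should be touched. Editing the four entries by hand is equally valid.

```python
TARGET_TAG = "nightly-dev-cu13-20260523-c112f762"

# Two of the four new entries, copied from the diff under review.
SAMPLE = """\
- config-keys:
    - glm5-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

- config-keys:
    - glm5-fp4-b200-sglang-mtp
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
"""


def fix_changelog_entries(text: str) -> str:
    """Apply the baseline and pr-link corrections (hypothetical helper)."""
    fixed_blocks = []
    for block in text.split("\n\n"):
        # Leave entries alone unless they describe this image bump, so older
        # entries that legitimately mention v0.5.11-cu130 are not rewritten.
        if TARGET_TAG in block:
            block = block.replace("from v0.5.11-cu130", "from v0.5.12-cu130")
            block = block.replace("/pull/1561", "/pull/1567")
        fixed_blocks.append(block)
    return "\n\n".join(fixed_blocks)


print(fix_changelog_entries(SAMPLE))
```

Scoping the replacement to blocks that contain the new tag is a design choice in this sketch; a plain global search-and-replace would also rewrite any earlier changelog entry that correctly references v0.5.11-cu130 or #1561.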

github-actions · 2026-05-26T18:16:03Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26465009212
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26465009212

Update glm-5 b200 sglang container image to nightly-dev-cu13-20260523-c112f762 (d06b958)

Ankur-singh requested a review from a team May 26, 2026 17:42

Ankur-singh requested review from jgangani and kedarpotdar-nv as code owners May 26, 2026 17:42

github-project-automation Bot added this to InferenceMAX Board May 26, 2026

Ankur-singh added the full-sweep-enabled label May 26, 2026

claude Bot reviewed May 26, 2026

Ankur-singh wants to merge 1 commit into main from glm-update
