
[0041] New proposal for testing-maximal-reconvergence #376

Merged
s-perron merged 6 commits into llvm:main from luciechoi:reconvergence
Feb 26, 2026

Conversation

@luciechoi
Contributor

This proposal suggests a different approach for achieving comprehensive test coverage of the maximal reconvergence feature in the Clang compiler.

@luciechoi luciechoi requested review from Keenuts and s-perron January 21, 2026 18:02
Collaborator

@s-perron s-perron left a comment


My main comments are:

  1. Use HLSL lingo, not SPIR-V.
  2. For the detailed design section, ask yourself: if you were passing this off to another developer and this is all they had, would they be able to implement it?


Graphics compilers often perform aggressive optimizations that can unexpectedly alter the convergence behavior of threads within a wave. This is a critical issue for shaders containing operations that depend on control flow, such as wave intrinsics, since invalid transformations can lead to incorrect or indeterminate results.

Maximal reconvergence is a set of compiler guarantees designed to prevent these unintended changes, ensuring that divergent threads reconverge at expected merge points and that wave operations execute in lockstep where intended.
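The rule can be pictured with a small sketch. Below is a hypothetical Python model (not part of the proposal) of the programmer's view of active lanes: a wave-wide reduction inside a branch must see only the lanes that took that branch, and all lanes must be active again after the merge point. All names here are invented for illustration.

```python
# Hypothetical model (not from the proposal) of the programmer's view
# of active lanes: a wave op inside a branch reduces over that branch's
# lanes only, and every lane is active again at the merge point.

WAVE_SIZE = 8

def wave_active_sum(values, active_lanes):
    # Model of a WaveActiveSum-style op: it sees only the active lanes.
    return sum(values[lane] for lane in active_lanes)

def run_wave(values):
    all_lanes = set(range(WAVE_SIZE))
    taken = {lane for lane in all_lanes if values[lane] % 2 == 0}  # if (v % 2 == 0)
    not_taken = all_lanes - taken                                  # else branch
    sums = {}
    for lane in taken:       # lanes in the "then" branch see only `taken`
        sums[lane] = wave_active_sum(values, taken)
    for lane in not_taken:   # lanes in the "else" branch see only `not_taken`
        sums[lane] = wave_active_sum(values, not_taken)
    reconverged = all_lanes  # after the merge point, all lanes are active again
    return sums, reconverged

sums, reconverged = run_wave(list(range(WAVE_SIZE)))
```

An optimization that moved either reduction past the merge point would change which lanes it sees, which is exactly the class of bug the tests should catch.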
Collaborator


I would not call it a "compiler guarantee". Ideally there should be a specification of where lanes diverge and reconverge, but that does not exist formally in HLSL. The best we currently have is in the wave intrinsic wiki, where it says:

In the model of this document, implementations must enforce that the number of active lanes exactly corresponds to the programmer’s view of flow control.

We should say something along the lines of:

There is an informal definition of which threads are active at any point in execution of the shader.

You can probably start with this, and then merge it with the previous paragraph. You state the rule, and then explain how it could be violated.

Contributor Author


Reworded and added an example. PTAL!

## Detailed design

### Test Generation and Simulation
The shaders will be generated when the test pipeline starts. Since each GPU has a different subgroup size, each machine will have a version for every power-of-2 wave size between 4 and 32 (i.e., 4, 8, 16, 32). Tests that do not match the subgroup size of the running GPU will be skipped (e.g., through a `# UNSUPPORTED: !SubgroupSizeX` directive).
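As a rough illustration of the per-wave-size emission described above, here is a hypothetical Python sketch; the file names and the exact spelling of the `UNSUPPORTED` directive are assumptions based on the proposal text.

```python
# Hypothetical sketch of per-wave-size test emission. File names and the
# directive spelling are assumptions, not the proposal's actual design.

WAVE_SIZES = [4, 8, 16, 32]  # every power of 2 between 4 and 32

def emit_test(body: str, wave_size: int) -> str:
    # Prefix the shader body with a directive so runners on GPUs with a
    # different subgroup size skip this variant.
    header = f"# UNSUPPORTED: !SubgroupSize{wave_size}\n"
    return header + body

def emit_all(body: str) -> dict:
    # One test file per supported wave size, all from the same shader body.
    return {f"reconvergence_w{n}.test": emit_test(body, n) for n in WAVE_SIZES}

tests = emit_all("...shader body...")
```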
Collaborator


when the test pipeline starts

Does that mean that every time I do `ninja clang-hlsl-*` they will all be generated, even if those tests will not be run? It might be a good idea to mention the cmake targets you will be adding and what their dependencies will be. That will help me better understand the workflow that you intend.

Contributor Author

@luciechoi luciechoi Jan 29, 2026


Right, when we run the test generator, it will always generate the tests for all possible wave sizes. I'm not sure if there is an easy way of checking the supported wave size in the test generator without target-specific pipelines.


Logic from [Vulkan CTS GLSL generation](https://github.com/KhronosGroup/VK-GL-CTS/blob/main/external/vulkancts/modules/vulkan/reconvergence/vktReconvergenceTests.cpp) will be ported to produce HLSL. This includes translating intrinsics such as `subgroupElect()` to `WaveIsFirstLane()` and `subgroupBallot()` to `WaveActiveBallot()`, etc.

### Execution Pipeline
Collaborator


You might want to give a more detailed flow of how the whole test suite will be run:

When target X is built,

  1. generate all of the test files
  2. execute each of the generated tests
  3. delete each of the passing tests
  4. etc.

I'm not clear on what the exact flow will be. I can piece some parts of it together, but putting it all together in this section would be useful.

Contributor Author


Thanks for pointing that out. I've added an example workflow.

Here is the new draft PR with workflow changes. llvm/offload-test-suite#685


### Reporting

Results of the reconvergence tests will be aggregated. Failing shaders will be logged separately or made available via YAML artifacts to avoid diluting logs with excessive data.
Collaborator


does this mean that the tests that pass will be deleted? where will the failing test be saved?

Contributor Author


We can delete the tests in each pipeline run. I'm not sure how easy it is to inspect the artifacts on those machines. Developers can generate the tests locally and inspect the failing tests there as well.

Comment on lines +70 to +72
- Reducing the workgroup size and/or nesting level.
- Comparing the results with other GPUs and/or backends.
- Writing a reducer for the randomly generated shaders.
Collaborator


Are you planning on writing tools to perform any of these? Do they need any design work?

Contributor Author


Yes, they certainly need design work... I'll experiment with these ideas after the overall idea of this proposal is approved.

@luciechoi luciechoi requested a review from s-perron January 29, 2026 21:20
Collaborator

@s-perron s-perron left a comment


Looking much better. There are still some details to work out. My main concern about the current approach is that we need to make sure we get the same tests generated on all platforms. We do not want the GitHub Actions runs failing but the local builds passing because they generated different shaders; that would be impossible to debug.

How will the "random" values be handled to make sure they are consistent?

[This](https://github.com/llvm/offload-test-suite/pull/685) is an example of the
proposed design.

### Test Generation and Simulation
Collaborator


You might want to fix up the order here. The test generation includes the simulation. Here is an example of how I would lay it out; fill in the text. I just have a few points on what should be included.

Test Generation

Random shaders

// Move the "Translation" section here.
Explicitly call out the intermediate form for the shader, and describe it a bit.

Expected results

// Mention how you will pick the size of the buffer, which is determined by a "dry run" in Vulkan.
// Mention that you will get the expected results for each wave size by doing a CPU simulation.
Mention that the CPU simulation will be done on the intermediate form and not the final SPIR-V shader.

Final test file

// Explain how the final test case will be generated. This can be short as it is simply taking the info from the other steps. Say where the new tests will be stored.

Contributor Author


Added a detailed section, PTAL

separated.

```yaml
# .github/workflows/build-and-test-callable.yaml
```
Collaborator


A GitHub workflow? Will I be able to run them locally? I think we should have a cmake target to be able to run the tests locally.

Contributor Author


Added a section for the cmake target and updated the sample PR.

Comment on lines +157 to +159
We don't plan to store the physical test files in the repo. Developers can still
run the tests locally by running the test generator to output the tests on their
machines.
Collaborator


You want it to be simple, and have a proper cmake target. You might need to look at how lit works.

There should be a target that will build the shaders. If something is compiler-generated, it will be placed in the build directory. You then need to add a target that runs the convergence tests using lit.

Collaborator


It might be useful to have the random SEED set as a cmake option: this way, if I see a failure in the CI, I can look at the configure step, find the SEED definition, and then locally do something like:

```shell
cmake -DOFFLOAD_TEST_SUITE_SEED=1234 [all other llvm options]
ninja -C build check-hlsl-reconvergence
```
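A seed threaded through the build this way would make generation reproducible across hosts. Here is a minimal Python sketch of the idea; every name other than the seed value is invented for illustration.

```python
# Hypothetical sketch: a build-provided seed makes shader generation
# deterministic, so CI and local runs produce byte-identical tests.

import random

def generate_shader(seed: int) -> str:
    rng = random.Random(seed)  # local RNG: no global state leaks in
    depth = rng.randint(1, 4)
    # Stand-in for real control-flow construction in the generator.
    choices = [rng.choice(["if", "loop", "ballot"]) for _ in range(depth)]
    return "-".join(choices)

# Two runs with the same seed must produce the same test on any host.
assert generate_shader(1234) == generate_shader(1234)
```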

Contributor Author


Thanks for the suggestion, mentioned in the CMake Target section.

Comment on lines +163 to +164
We may implement an environment variable `OFFLOADTEST_SUPPRESS_DIFF` to filter
out some logs since, for example, diffs will be massive for a failing test.
Collaborator


When you run the lit command, it will already give you the "CHECK" command that failed. We could carefully design the CHECK commands to be able to pinpoint a single value, the first failing one, with appropriate comments in the test file to help debugging.

Collaborator


I looked at the tests more closely. Not all tests use FileCheck. The ones you created did not.

Something else we can do is to make use of arrays of resources. See the "array-global.test" test. Instead of having a single output buffer, the output buffer is an array, where each thread writes to its own buffer.

`RWStructuredBuffer OutputB : register(u1);`

becomes

`RWStructuredBuffer OutputB[NUM_THREADS] : register(u1);`

Then

`OutputB[(outLoc++)*invocationStride + gIndex].x = 0x10002;`

becomes

`OutputB[gIndex][(outLoc++)].x = 0x10002;`

Then you can have a CHECK line for each thread. The size of each line will be much smaller. In one test it would be only 100 values per line, and if it failed it would give a diff pointing to the incorrect value.

We could try to figure out a way to modify the indices. Instead of using `outLoc++`, we can use a formula with some constants so that we easily know which line and which iteration of any containing loop was supposed to have written to that location. This last part might be a bit forward-looking. Not needed now. These are the types of things we can do to make the tests easier to debug.
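The index-formula idea might look like the following hypothetical Python sketch, where the output slot encodes the write site and loop iteration with fixed strides so a bad value can be decoded back to its origin; all names and constants are illustrative.

```python
# Hypothetical sketch of indexing by formula instead of outLoc++: pack the
# source write site and loop iteration into the output index with a fixed
# stride so a failing slot can be decoded back to the write that produced it.

MAX_ITERS = 16  # assumed upper bound on loop trip count per write site

def out_index(site: int, iteration: int) -> int:
    assert 0 <= iteration < MAX_ITERS
    return site * MAX_ITERS + iteration

def decode(index: int):
    # Inverse mapping: which write site and which iteration filled this slot.
    return divmod(index, MAX_ITERS)

site, iteration = decode(out_index(site=7, iteration=3))
```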

Contributor Author


This idea sounds good; I briefly described it in the "Reporting" section.

@luciechoi
Contributor Author

Add a section for adding XFAIL

@luciechoi
Contributor Author

Add Licensing section

Collaborator

@s-perron s-perron left a comment


This LGTM, but a couple small changes.

Comment on lines +180 to +182
We will implement a cmake target `check-hlsl-{platform}-reconvergence`, similar
to the existing targets. Running this will generate the physical tests and run
them.
Collaborator


Will we have a separate target used to generate the tests, then this target can depend on that one? That way the tests will not be regenerated every time the tests are run. Just the first time.

Collaborator


I agree, this is important. Especially because as far as I can tell the rule to make the generated tests will have few dependencies (just the test generation logic itself), so it should generally only need to run in a new build directory or clone of the repo.

Contributor Author


Thank you for bringing this to attention; that is indeed the plan, and it is now mentioned in the doc. Although I'm assuming that for each pipeline run we get a fresh container, so it does generate the tests on every run.

checks or implement an environment variable to filter out some logs.

If any test fails, it will fail the workflow, so it's noticeable in the badge.
`XFail` instructions will be added appropriately to suppress failures.
Collaborator


It is not clear how the XFail instructions will be added if the tests are generated. Some more details might be needed.

Contributor Author


Thanks for pointing that out; added some ideas.

XFail instructions will be added appropriately to suppress failures. Since it
is undesirable to change the code of the C++ random test generator every time
a failure happens, the test generator may read a structured text file that
contains a list of failing tests and their environments. This way, only this
single file will be updated upon any changes in the compilers, and the algorithm
for generating the tests remains intact.

`reconvergence-failing-tests.txt`:

```
reconvergence-test_2_16_7_13_3.test
# Some comment
# XFAIL: Clang && Vulkan

# Some comment
# XFAIL: ...

reconvergence-test_5_32_7_13_1.test
# Some comment
# XFAIL: ...
```
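A reader for such a file could be quite small. The following Python sketch parses the example format above (test name lines followed by `# XFAIL:` lines) and is purely illustrative; the real generator is C++ and its format may differ.

```python
# Illustrative parser for the proposed failing-tests file: a test name line,
# then "# XFAIL: <expr>" lines; other "#" lines are comments.

def parse_xfail_list(text: str) -> dict:
    xfails, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# XFAIL:"):
            if current is not None:
                xfails.setdefault(current, []).append(
                    line[len("# XFAIL:"):].strip())
        elif not line.startswith("#"):
            current = line  # a new test name starts a new entry
    return xfails

sample = """reconvergence-test_2_16_7_13_3.test
# Some comment
# XFAIL: Clang && Vulkan
"""
parsed = parse_xfail_list(sample)
```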

Collaborator

@bogner bogner left a comment


This is good enough shape to go in, and any further work on the proposal can go in tree. Let's get it merged.

It would be good to get issues filed for followups, including ones about going into more detail about the XFAILs and any tooling we may or may not add for debugging failures.


luciechoi and others added 5 commits February 20, 2026 22:04
Co-authored-by: Steven Perron <stevenperron@google.com>
Co-authored-by: Justin Bogner <mail@justinbogner.com>
@luciechoi luciechoi changed the title [NNNN] New proposal for testing-maximal-reconvergence [0041] New proposal for testing-maximal-reconvergence Feb 20, 2026
@luciechoi
Contributor Author

This is good enough shape to go in, and any further work on the proposal can go in tree. Let's get it merged.

It would be good to get issues filed for followups, including ones about going into more detail about the XFAILs and any tooling we may or may not add for debugging failures.

Thank you for your review!

@luciechoi luciechoi requested a review from s-perron February 23, 2026 17:33
@s-perron s-perron merged commit 732cfef into llvm:main Feb 26, 2026