api-query: Continue device initialization when a backend fails by MarijnS95 · Pull Request #1032 · llvm/offload-test-suite

MarijnS95 · 2026-03-30T13:11:18Z

Device::initialize() returns early when any backend fails to initialize, preventing subsequent backends from being discovered. For example, a Vulkan initialization failure (e.g. VK_ERROR_INCOMPATIBLE_DRIVER when MoltenVK is not installed) blocks Metal device discovery on macOS.

Collect backend initialization errors with joinErrors() and return them together instead. Also make lit.cfg.py resilient to an empty device list from api-query.
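
The collect-and-continue approach described above can be sketched without LLVM. This is a minimal, standalone illustration rather than the PR's actual code: `BackendInit`, `InitResult`, `initializeAll`, `initVulkan`, and `initMetal` are hypothetical names, and plain strings stand in for the `llvm::Error` values that the real change combines with `joinErrors()`.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one backend's initialization routine. It appends
// any devices it finds to Devices and returns std::nullopt on success, or an
// error message on failure.
using BackendInit =
    std::function<std::optional<std::string>(std::vector<std::string> &)>;

struct InitResult {
  std::vector<std::string> Devices; // devices found across all backends
  std::vector<std::string> Errors;  // one entry per backend that failed
};

// Try every backend in order. A failure is recorded and the loop moves on,
// instead of returning as soon as the first backend fails.
inline InitResult initializeAll(const std::vector<BackendInit> &Backends) {
  InitResult Result;
  for (const BackendInit &Init : Backends)
    if (std::optional<std::string> E = Init(Result.Devices))
      Result.Errors.push_back(std::move(*E));
  return Result;
}

// Illustrative backends (assumptions): Vulkan fails, Metal succeeds.
inline std::optional<std::string> initVulkan(std::vector<std::string> &) {
  return std::string("VK_ERROR_INCOMPATIBLE_DRIVER");
}

inline std::optional<std::string>
initMetal(std::vector<std::string> &Devices) {
  Devices.push_back("Metal device");
  return std::nullopt;
}
```

With the two illustrative backends, `initializeAll({initVulkan, initMetal})` still reports the Metal device even though the Vulkan stand-in fails first, and the failure is kept in `Errors` for reporting.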

Test plan

Verified api-query outputs Metal device when Vulkan init fails (no MoltenVK)
Verified check-hlsl runs Metal tests successfully with this change

🤖 Generated with Claude Code

test/lit.cfg.py

lib/API/Device.cpp

manon-traverse · 2026-04-01T11:42:50Z

LGTM

bogner · 2026-04-06T23:00:56Z

A few things:

This has conflicts due to "Restructure how devices are created and automatically cleaned up" (#1035) and needs to be updated.
Wouldn't we also need a change in api-query.cpp to not abort after calling initializeDevices? This might be something that changed since "Restructure how devices are created and automatically cleaned up" (#1035).
The commit message here could be significantly simplified, which would help make it clear what this does. Commit messages shouldn't repeat every change that a commit will make, but instead focus on pointing out why the change is being made. Here, it would suffice to point out the problem (we quit after each device) and the high level of how we're fixing it (collect the errors, report them, and continue on).

MarijnS95 · 2026-04-07T06:39:59Z

Wouldn't we also need a change in api-query.cpp to not abort after calling initializeDevices? This might be something that changed since "Restructure how devices are created and automatically cleaned up" (#1035).

Yup, the original claim that Device::initialize() doesn't bail on error no longer holds true, so the flow had to be rewritten slightly to still print errors but not return them out of initializeDevices().
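
One way to read this: a return type that holds either a value or an error cannot carry both the discovered devices and the failures, so the failures have to be reported some other way. A minimal standalone sketch of that flow follows, assuming a hypothetical `BackendOutcome` record and `collectDevices` helper (neither is from the PR), with a `std::ostream` standing in for wherever api-query prints its diagnostics:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of what one backend produced during initialization.
struct BackendOutcome {
  std::string Name;                 // e.g. "Vulkan" or "Metal"
  std::vector<std::string> Devices; // devices found, if any
  std::string Error;                // empty when initialization succeeded
};

// Print each backend failure to Log, but return only the devices, so a
// caller that aborts on a returned error still sees the working backends.
inline std::vector<std::string>
collectDevices(const std::vector<BackendOutcome> &Outcomes,
               std::ostream &Log) {
  std::vector<std::string> Devices;
  for (const BackendOutcome &O : Outcomes) {
    if (!O.Error.empty()) {
      Log << "warning: " << O.Name
          << " failed to initialize: " << O.Error << '\n';
      continue; // keep going with the remaining backends
    }
    Devices.insert(Devices.end(), O.Devices.begin(), O.Devices.end());
  }
  return Devices;
}
```

The trade-off in this sketch is that the caller can no longer react to individual failures programmatically. It only sees them in the log, which matches the "print but do not return" behaviour described in the comment.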

The commit message here could be significantly simplified, which would help make it clear what this does. Commit messages shouldn't repeat every change that a commit will make, but instead focus on pointing out why the change is being made. Here, it would suffice to point out the problem (we quit after each device) and the high level of how we're fixing it (collect the errors, report them, and continue on).

@bogner just to make sure we're talking about the same commit message:

Device::initialize() returns early when any backend fails to initialize, preventing subsequent backends from being discovered. For example, a Vulkan initialization failure (e.g. VK_ERROR_INCOMPATIBLE_DRIVER when MoltenVK is not installed) blocks Metal device discovery on macOS.

Collect backend initialization errors with joinErrors() and return them together instead. Also make lit.cfg.py resilient to an empty device list from api-query.

This explains what is wrong, with an example of when it has an effect (Vulkan init on macOS), followed by how it was fixed (using joinErrors(), plus the drive-by note about lit.cfg.py not allowing empty YAML lists initially).

Is that still too much?

bogner

LGTM

bogner · 2026-04-08T17:51:23Z

lib/API/Device.cpp

 llvm::Expected<llvm::SmallVector<std::unique_ptr<Device>>>
 offloadtest::initializeDevices(const DeviceConfig Config) {
  llvm::SmallVector<std::unique_ptr<Device>> Devices;
+  llvm::Error Result = llvm::Error::success();


I find the name Result a little confusing here, since it's just the error case of the result. Maybe Err would be a better name for this.

Err wasn't chosen because that name is already used inside the if blocks for the errors that are joined into this Result. Perhaps we can rename those to E so that this can become Err?

bogner · 2026-04-08T17:54:41Z

@bogner just to make sure we're talking about the same commit message:

Device::initialize() returns early when any backend fails to initialize, preventing subsequent backends from being discovered. For example, a Vulkan initialization failure (e.g. VK_ERROR_INCOMPATIBLE_DRIVER when MoltenVK is not installed) blocks Metal device discovery on macOS.
Collect backend initialization errors with joinErrors() and return them together instead. Also make lit.cfg.py resilient to an empty device list from api-query.

This explains what is wrong, with an example of when it has an effect (Vulkan init on macOS), followed by how it was fixed (using joinErrors(), plus the drive-by note about lit.cfg.py not allowing empty YAML lists initially).

Is that still too much?

I was looking in the wrong place (this was before our other conversation about PR descriptions matching commit messages). This does look much better. I do still find this a little verbose where it goes into listing the specific APIs that it's using, but that's mostly a matter of taste.

Device::initialize() returns early when any backend fails to initialize, preventing subsequent backends from being discovered. For example, a Vulkan initialization failure (e.g. VK_ERROR_INCOMPATIBLE_DRIVER when MoltenVK is not installed) blocks Metal device discovery on macOS.

Collect backend initialization errors with joinErrors() and return them together instead. Also make lit.cfg.py resilient to an empty device list from api-query.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

MarijnS95 · 2026-04-09T07:43:29Z

Would dropping "with joinErrors()" solve that? It doesn't seem to mention any other APIs being used, only APIs being changed (Device::initialize(), lit.cfg.py, and the example about VK_ERROR_INCOMPATIBLE_DRIVER being triggered without MoltenVK).

MarijnS95 force-pushed the api-query-continue-on-backend-init-failure branch from f4cbd80 to de099b0 Compare March 30, 2026 13:14

llvm-beanz reviewed Mar 30, 2026

MarijnS95 force-pushed the api-query-continue-on-backend-init-failure branch from de099b0 to d9b6e6a Compare March 30, 2026 16:35

Icohedron approved these changes Mar 30, 2026

MarijnS95 requested a review from llvm-beanz March 31, 2026 07:51

MarijnS95 force-pushed the api-query-continue-on-backend-init-failure branch from d9b6e6a to 0fe8047 Compare April 2, 2026 16:17

llvm-beanz approved these changes Apr 3, 2026

MarijnS95 force-pushed the api-query-continue-on-backend-init-failure branch from 0fe8047 to 1d5330c Compare April 7, 2026 06:35

MarijnS95 force-pushed the api-query-continue-on-backend-init-failure branch 2 times, most recently from 8a379ad to 68112e4 Compare April 8, 2026 14:31

bogner approved these changes Apr 8, 2026

MarijnS95 force-pushed the api-query-continue-on-backend-init-failure branch from 68112e4 to 53ead32 Compare April 9, 2026 07:44

manon-traverse approved these changes Apr 9, 2026

MarijnS95 wants to merge 1 commit into llvm:main from Traverse-Research:api-query-continue-on-backend-init-failure

5 participants
