Load-store elimination pass for OpLoad, OpCopyLogical, OpCompositeExtract code generated due to explicit layout #6611

@devshgraphicsprogramming

Description

We need an optimization pass that can hoist an OpCompositeExtract above an OpCopyLogical when the extraction operates on the ResultID of that OpCopyLogical and the OpCompositeExtract is its only consumer.

We then also need to hoist the OpCompositeExtract once more and convert it into an OpAccessChain if, after the first hoist, it has the same sole-consumer relationship with an OpLoad.
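To make the requested rewrite concrete, here is a minimal sketch over a toy instruction list. It is not SPIRV-Tools code: `Inst`, `narrow_single_use_loads`, the `_ptr` suffix, and the `%const_N` ids are all hypothetical names made up for illustration. The toy ignores types entirely, so it assumes the extracted member is a scalar or vector whose type is the same with and without layout decorations; a real pass would also have to create the pointer type and index constants, carry over memory operands, and keep an OpCopyLogical on the member when it is itself a composite.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Inst:
    """Toy stand-in for a SPIR-V instruction (hypothetical, no types)."""
    op: str           # opcode name, e.g. "OpLoad"
    result: str       # result id, e.g. "%whole"
    operands: Tuple   # operand ids (str) and literal indices (int)


def narrow_single_use_loads(insts: List[Inst]) -> List[Inst]:
    """Turn OpLoad [-> OpCopyLogical] -> OpCompositeExtract into
    OpAccessChain -> OpLoad when each link has exactly one consumer."""
    defs = {i.result: i for i in insts}
    uses = {}
    for i in insts:
        for o in i.operands:
            if isinstance(o, str) and o.startswith("%"):
                uses[o] = uses.get(o, 0) + 1

    replace_at = {}  # result id of the wide load -> replacement instructions
    dead = set()     # result ids of instructions that disappear

    for ext in insts:
        if ext.op != "OpCompositeExtract":
            continue
        composite, *indices = ext.operands
        chain = [ext]
        src = defs.get(composite)
        # Step 1: look through a single-use OpCopyLogical.
        if src is not None and src.op == "OpCopyLogical" and uses[src.result] == 1:
            chain.append(src)
            src = defs.get(src.operands[0])
        # Step 2: the value must now come from a single-use OpLoad.
        if src is None or src.op != "OpLoad" or uses[src.result] != 1:
            continue
        ptr = src.operands[0]
        chain_id = ext.result + "_ptr"
        # OpAccessChain takes constant ids, not literals; ids are assumed here.
        index_ids = tuple(f"%const_{n}" for n in indices)
        # Emit at the position of the original load so that no memory
        # access is moved past an intervening store.
        replace_at[src.result] = [
            Inst("OpAccessChain", chain_id, (ptr, *index_ids)),
            Inst("OpLoad", ext.result, (chain_id,)),
        ]
        dead.update(i.result for i in chain)

    out = []
    for i in insts:
        if i.result in replace_at:
            out.extend(replace_at[i.result])
        elif i.result not in dead:
            out.append(i)
    return out


before = [
    Inst("OpLoad", "%whole", ("%surface_ptr",)),
    Inst("OpCopyLogical", "%copy", ("%whole",)),
    Inst("OpCompositeExtract", "%field", ("%copy", 2)),
]
after = narrow_single_use_loads(before)
```

The narrow load reuses the result id of the original extract, so later consumers of `%field` are untouched. If the copy or the load has a second consumer, the sketch leaves the sequence alone, which matches the "only consumer" condition above.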

This problem wouldn't exist if all Storage Classes supported explicit layout, as we discussed at the Shading Language Symposium 2026:
https://www.khronos.org/developers/linkto/the-vulkan-spir-v-and-its-fundamental-limits

For the time being, however, structs don't support explicit layouts in every storage class, notably not in Private or Function.

An SSA ResultID also inherits its type from the OpLoad that produced it, so a value loaded from an SSBO/BDA will have one type and values loaded from elsewhere another, depending on the layout or the lack thereof.

This means that compilers generate vastly different code for

vk::BufferPointer<SurfaceData>(bda_data + surfaceDataByteOffset).Get().someParameter;

vs

SurfaceData loadSurfaceData(uint offset)
{
    return vk::BufferPointer<SurfaceData>(bda_data + offset).Get();
}

...

loadSurfaceData(surfaceDataByteOffset).someParameter;

The first one results in the correct OpAccessChain, so I'm not at the mercy of the IHV's ISA compiler to load only the field I'm using.

The second one results in an OpLoad of the entire struct, followed by an OpCompositeExtract to get the field I'm after.

The reason is simple: calling the function emits a temporary SSA ResultID.

One could argue that a "proper compiler", unlike something like DXC which generates code straight from the AST as-is, should clean this up at its highest optimization level. However, I have a feeling that many smaller compilers will default to SPIR-V Tools or LLVM for their optimizations, because otherwise they'd have to invent their own IR for this.

I've tried --eliminate-local-single-block on the output SPIR-V, and it changed nothing.

I've also tried mixtures of --eliminate-local-single-store --eliminate-local-multi-store --eliminate-local-single-block, with the same result.

--reduce-load-size doesn't work either.

So I'm pretty sure there is no pass or mixture of passes in current spirv-opt that can achieve this.
