From 47e4a816124fe0cfd3c90baae0606f35f17f300e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nathan=20Gau=C3=ABr?= <brioche@google.com>
Date: Wed, 18 Jun 2025 14:47:37 +0200
Subject: [PATCH 1/9] Proposal for input semantics

This is another proposal on how to implement semantic input in Clang,
given DXIL & SPIR-V have drastically different handling, but some parts
could be shared.

Another proposal exists: #112, which also suggests a Sema change.
---
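Reviewer note (text here is ignored by `git am`): the shadowing and
duplication rules described in the HLSL section of this patch can be sketched
in plain Python. This is only an illustration under assumptions: `Decl` and
`check_semantic` are hypothetical names modelling the proposal's
`checkSemantic` pseudo-code, not Clang APIs, and the type and shader-stage
diagnostics are left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Decl:
    """Hypothetical stand-in for a parameter or struct field declaration."""
    name: str
    semantic: Optional[str] = None
    fields: list = field(default_factory=list)  # non-empty for structs


def check_semantic(decl, used, inherited=None, path=""):
    """Return (path, semantic) pairs, raising ValueError on invalid input."""
    full = f"{path}.{decl.name}" if path else decl.name
    # An inherited semantic shadows whatever is attached to this declaration.
    semantic = inherited or decl.semantic
    if decl.fields:
        # Struct: recurse, optionally passing the semantic down.
        out = []
        for f in decl.fields:
            out.extend(check_semantic(f, used, semantic, full))
        return out
    if semantic is None:
        raise ValueError(f"missing semantic on {full}")
    # Duplication is only an error when it is explicit (not inherited).
    if inherited is None and semantic in used:
        raise ValueError(f"explicit semantic duplication on {full}")
    used.add(semantic)
    return [(full, semantic)]
```

With the proposal's example, `B b : SB` resolves both `b.b1` and `b.b2` to
`SB`, while `E e` is rejected because `EC` is repeated without being
inherited.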
 proposals/NNNN-input-semantics.md | 245 ++++++++++++++++++++++++++++++
 1 file changed, 245 insertions(+)
 create mode 100644 proposals/NNNN-input-semantics.md

diff --git a/proposals/NNNN-input-semantics.md b/proposals/NNNN-input-semantics.md
new file mode 100644
index 00000000..52ed3ebc
--- /dev/null
+++ b/proposals/NNNN-input-semantics.md
@@ -0,0 +1,245 @@
+# Input semantics
+
+* Proposal: [NNNN](http://NNNN-input-semantics.md)
+* Author(s): [Nathan Gauër](https://github.com/Keenuts)
+* Status: **Design In Progress**
+
+## Introduction
+
+HLSL shaders can read form the pipeline state using [semantics](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-semantics).
+This proposal looks into how to implement input semantics in Clang.
+Output semantics are out of scope of this proposal, but some parts will be
+similar.
+
+## Motivation
+
+HLSL input semantics are a core part of the shading language.
+
+## Behavior
+
+### HLSL
+
+Input semantics are used by the API to determine how to connect pipeline
+data to the shader.
+
+Input semantic attributes can be applied to a function parameter, or a struct
+field's declaration. They only carry meaning when used on an entry point
+parameter, or a struct type used by one of the entry point parameters.
+All other uses are simply ignored.
+
+HLSL has two kinds of semantics: System and User.
+System semantics are linked to specific parts of the pipeline, while
+user semantics are just a way for the user to link the output of a stage
+to the input of another stage.
+
+When the semantic attribute is applied to a struct (type or value), it applies
+recursively to each inner fields, shadowing any other semantic.
+
+Each scalar in the entrypoint must have a semantic attribute attached, either
+directly or inherited from the parent struct type.
+
+Example:
+
+```hlsl
+struct B {
+  int b1 : SB1;
+  int b2 : SB2;
+};
+
+struct C {
+  int c1 : SC1;
+  int c2 : SC2;
+};
+
+struct D {
+  int d1;
+};
+
+struct E {
+  int e1 : EC;
+  int e2 : EC;
+};
+
+struct F {
+  int f1 : FC;
+  int f2 : FC;
+};
+
+[[shader("pixel")]]
+void main(float a : SA, B b : SB, C c, D d, E e, F f : SF) { }
+```
+
+In this example:
+- `a` is linked to the semantic `SA`.
+- `b.b1` and `b.b2` are linked to the semantic `SB` because `SB` shadows the
+  semantics attached to each field.
+- `c.c1` has the semantic `SC1`, and `c.c2` the semantic `SC2`.
+- `d.d1`, hence `d`, is illegal: no semantic is attached to `d.d1`.
+- `e.e1` and `e.e2` are invalid: `EC` usage is duplicated without being inherited.
+- `f.f1` and `f.f2` semantic is `SF`, shadowing the duplicated `FC` semantic.
+
+**Note**: HLSL forbids explicit **non-shadowed** semantic duplication. In this
+sample, the parameter `e` uses `E`, which explicitly declares two fields with
+the same semantic. This is illegal. \
+`b` has the semantic `SB` applied on the whole struct. Meaning all its fields
+share the same semantic `SB`. This is legal because the duplication comes
+from inheritance.
+Lastly, `f` explicitly duplicates the semantic `FC`. But because those are
+shadowed by the semantic `SF`, this is valid HLSL.
+
+**Note**: Implicit semantic duplication is allowed for user semantics, but
+always forbidden for system semantics.
+
+### SPIR-V
+
+On the SPIR-V side, user semantics are translated into `Location`
+decorated `Input` variables. The `Location` decoration takes an index.
+System semantics are either translated to `Location` decorated `Input`
+variables, or `BuiltIn` decorated `Input` variables.
+
+In the example above, there are no system semantics, meaning every
+parameter would get a `Location` decorated variable associated.
+Each scalar field/parameter is associated with a unique index starting at 0,
+from the first parameter's field to the last parameter's field.
+
+In the sample above:
+
+- `a` would have the `Location 0`.
+- `b.b1` would have the `Location 1`.
+- `b.b2` would have the `Location 2`.
+- `c.c1` would have the `Location 3`.
+- ...
+
+It is also possible to explicitly set the index, using the
+`[[vk::location(/* Index */)]]` attribute. \
+Mixing implicit and explicit location assignment is **not legal**.
+Hence interaction between both mechanisms is out of scope.
+
+### DXIL
+
+On the DXIL side, some system semantics are translated to builtin function
+calls. But most are visible alongside user semantics in the `input signature`.
+
+To pass data between stages, DirectX provides a fixed list of <4 x 32bit>
+registers. E.g: 32 x <4 x 32 bit> for VS in D3D11.
+
+The input signature assigns to each semantic a `Name`, `Index`, `Mask`,
+`Register`, `SysValue`, and `Format`.
+
+- `Name`: the semantic attribute name.
+- `Index`: used to differentiate two inputs sharing the same `Name`.
+- `Register`: determines which 16-byte register the value shall be read from.
+- `Mask`: which 32-bit components are used for this value, e.g. `xyz` or `x`.
+- `Format`: how to interpret the data, e.g. `float` or `int`.
+- `SysValue`: `NONE` for user semantic, a known string for system semantics.
+
+Unlike in SPIR-V, the `Register` and `Mask` cannot be simply deduced from the
+iteration order. Those values depend on the [packing rules](https://github.com/Microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst#signature-packing)
+of the inputs.
+
+## Proposal
+
+SPIR-V and DXIL semantic lowering is very different. SPIR-V respects the
+parameters/field order, while DXIL packs by following an extensive set of
+rules. The System vs User semantics handling is also divergent.
+
+This means we will be able to share the Sema checks, but will have to build
+two distinct paths during codegen.
+
+### Sema
+
+Type check for semantics is currently handled in `SemaDeclAttr.cpp`. \
+Iteration is done on each declaration, and if a semantic attribute is present,
+the type is checked.
+Each declaration being handled independently, this method does not support
+inherited/shadowed semantic attributes.
+
+Sema checks are divided into three parts:
+ - check for type compatibility between the variable and the semantic.
+ - check for semantic duplication.
+ - check for invalid system semantic usage depending on shader stage.
+
+Proposition is to remove the type validation from the `SemaDeclAttr` and
+move it later into SemaHLSL, with the other checks.
+Idea is we need to have the inheritance rules to check types, so we should
+avoid duplicating this logic in two places.
+
+The pseudo-code for this check should be:
+
+```cpp
+  void checkSemantic(std::unordered_set<HLSLAnnotationAttr> &UsedSemantics,
+                DeclaratorDecl *Decl,
+                HLSLAnnotationAttr *InheritedSemantic = nullptr) {
+
+    HLSLAnnotationAttr *Semantic = InheritedSemantic ? InheritedSemantic : Decl->get<HLSLAnnotationAttr>();
+    RecordDecl *RD = dyn_cast<RecordDecl>(Decl);
+
+    // Case 1: type is a scalar, and we have a semantic. End case.
+    if (Semantic && !RD) {
+      if (UsedSemantics.contains(Semantic) && !InheritedSemantic)
+        Fail("Explicit semantic duplication", Decl->getLocation())
+
+      UsedSemantics.insert(Semantic)
+      diagnoseSemanticType(Decl, Semantic);
+      diagnoseSemanticEnvironment(Decl, Context.ShaderEnv);
+      return;
+    }
+
+    // Case 2: type is scalar, but we have no semantic: error
+    if (!RD)
+      Fail("Missing semantic", Decl->getLocation());
+
+    // Case 3: it's a struct. Simply recurse, optionnally inherit semantic.
+    if (RecordDecl *RD = dyn_cast<RecordDecl>(Decl)) {
+      for (FieldDecl *FD : Decl->asRecordDecl()->getFields())
+        checkSemantic(UsedSemantics, FD, Semantic);
+    }
+  }
+
+  ...
+
+  std::unordered_set<clang::HLSLAnnotationAttr> UsedSemantics;
+  for (ParmVarDecl Decl : entrypoint->getParams()) {
+    checkSemantic(UsedSemantics, Decl, /* InheritedSemantic= */ nullptr);
+  }
+```
+
+At this point, we are guaranteed to have only valid and unique semantics, as
+well as valid types.
+
+### CodeGen
+
+DXIL and SPIR-V codegen will be very different, but the flattening/inheritance
+bit can be shared.
+
+The proposal is to provide a sorted list in `CGHLSLRuntime`:
+
+```cpp
+struct SemanticIO {
+  // The active semantic for this scalar/field.
+  HLSLAnnotationAttr *Semantic;
+
+  // Info about this field/scalar.
+  DeclaratorDecl *Decl;
+  llvm::Type *Type;
+
+  // The loaded value in the wrapper for this scalar/field. Each target
+  // must provide this value.
+  llvm::Value *Value = nullptr;
+};
+
+Vector<SemanticIO> InputSemantics;
+```
+
+The proposal is to let each target implement the logic in CGHLSLRuntime to
+load the semantics, and set the `Value` field of the `SemanticIO` struct.
+The order of the vector represents the flattening order.
+
+Providing the full list should allow DXIL to easily implement packing rules.
+
+At this stage, `CGHLSLRuntime` will have an ordered list of `SemanticIO`
+structs. The order is from the first parameter/field to the last
+parameter/field (DFS), and each will point to an `llvm::Value`.
+
+The common code will then build the arguments list for the entrypoint call
+using the provided `llvm::Value`. This is shared across targets.

From af9b43957decfde6e79486702c7180201e326ae6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nathan=20Gau=C3=ABr?= <brioche@google.com>
Date: Mon, 30 Jun 2025 18:22:20 +0200
Subject: [PATCH 2/9] rewrite wip

---
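Reviewer note (text here is ignored by `git am`): the name/index splitting
rule introduced by this patch (`semANtIC12` means name `SEMANTIC`, index 12)
can be sketched as below. `parse_semantic` is a hypothetical helper used only
for illustration; it is not the signature of the proposed
`Parser::ParseHLSLSemantic`, and it assumes the index is simply the longest
run of trailing decimal digits.

```python
import re


def parse_semantic(spelling):
    """Split a semantic spelling into (NAME, index, explicit_index)."""
    # Lazy name + greedy digits: the trailing digit run becomes the index.
    match = re.fullmatch(r"(.*?)(\d*)", spelling)
    name, digits = match.group(1), match.group(2)
    index = int(digits) if digits else 0
    # Names are case insensitive, so normalize to upper case.
    return (name.upper(), index, bool(digits))
```

Under these assumptions `Semantic` and `SEMANTIC0` map to the same name and
index, and only differ in whether the index was written explicitly.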
 proposals/NNNN-input-semantics.md | 393 ++++++++++++++++++++++--------
 1 file changed, 285 insertions(+), 108 deletions(-)

diff --git a/proposals/NNNN-input-semantics.md b/proposals/NNNN-input-semantics.md
index 52ed3ebc..81d4dc8e 100644
--- a/proposals/NNNN-input-semantics.md
+++ b/proposals/NNNN-input-semantics.md
@@ -1,4 +1,4 @@
-# Input semantics
+# HLSL shader semantics
 
 * Proposal: [NNNN](http://NNNN-input-semantics.md)
 * Author(s): [Nathan Gauër](https://github.com/Keenuts)
@@ -6,109 +6,228 @@
 
 ## Introduction
 
-HLSL shaders can read form the pipeline state using [semantics](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-semantics).
-This proposal looks into how to implement input semantics in Clang.
-Output semantics are out of scope of this proposal, but some parts will be
-similar.
+HLSL shaders can read from and write to the pipeline state using [semantics](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-semantics).
+This proposal looks into how to implement semantics in Clang.
 
 ## Motivation
 
-HLSL input semantics are a core part of the shading language.
+HLSL shader semantics are a core part of the shading language.
 
 ## Behavior
 
 ### HLSL
 
-Input semantics are used by the API to determine how to connect pipeline
-data to the shader.
-
-Input semantic attributes can be applied to a function parameter, or a struct
-field's declaration. They only carry meaning when used on an entry point
-parameter, or a struct type used by one of the entry point parameters.
-All other uses are simply ignored.
-
-HLSL has two kinds of semantics: System and User.
+Shader semantics are used by the API to determine how to connect pipeline
+data to the shader. HLSL has two kinds of semantics: System and User.
 System semantics are linked to specific parts of the pipeline, while
 user semantics are just a way for the user to link the output of a stage
 to the input of another stage.
 
-When the semantic attribute is applied to a struct (type or value), it applies
-recursively to each inner fields, shadowing any other semantic.
+
+Shader semantic attributes can be applied to:
+  - a function parameter
+  - a function return value
+  - a struct field's declaration
+
+They only carry meaning when used on:
+  - an entry point parameter
+  - an entry point return value
+  - a struct type used by an entry point parameter or return value.
+All other uses are simply ignored.
+
+When a semantic is applied to both a parameter/return value, and its type,
+the parameter/return value semantic applies, and the type's semantics are
+ignored.
 
 Each scalar in the entrypoint must have a semantic attribute attached, either
 directly or inherited from the parent struct type.
 
-Example:
+When a semantic applies to an array type, each element of the array
+is considered to take one index space in the semantic.
+When a semantic applies to a struct, each scalar field (recursively) takes
+one index in the semantic.
+
+Shader semantics on a function return value are output semantics.
+When applied to an entry point parameter, the `in`/`out`/`inout` qualifiers
+determine whether this is an input or output semantic.
+
+Each semantic is *usually* composed of two elements:
+ - a case insensitive name
+ - an index
+
+Any semantic starting with the `SV_` prefix is considered to be a system
+semantic. Some system semantics do not have indices, while others do. All
+user semantics have an index (implicit or explicit).
+
+The index can be either implicit (equal to 0), or explicit:
+
+ - `Semantic`, Name = SEMANTIC, Index = 0
+ - `SEMANTIC0`, Name = SEMANTIC, Index = 0
+ - `semANtIC12`, Name = SEMANTIC, Index = 12
+
+
+The HLSL language imposes no restriction on indices other than that they
+must be non-negative integers. Target APIs can apply additional restrictions.
+The same semantic (name+index) must only appear once on the inputs, and once
+on the outputs.
+
+Each stage has a fixed list of allowed system semantics for its
+inputs or outputs. Whether user semantics are allowed as input or
+output semantics depends on the shader stage being
+targeted.
+
+Examples:
 
 ```hlsl
-struct B {
-  int b1 : SB1;
-  int b2 : SB2;
+float main(float a : A) : B {}
+// a : A0
+// main() : B0
+```
+
+```hlsl
+struct S {
+  int f1 : A;
+  int f2 : B;
 };
 
+void main(S s) {}
+// s.f1 : A0
+// s.f2 : B0
+```
+
+```hlsl
+struct S {
+  int f1;
+  int f2;
+};
+
+void main(S s : A) {}
+// s.f1 : A0
+// s.f2 : A1
+```
+
+```hlsl
+struct S {
+  int f1 : B0;
+  int f2 : C0;
+};
+
+void main(S s : A) {}
+// s.f1 : A0
+// s.f2 : A1
+```
+
+```hlsl
 struct C {
-  int c1 : SC1;
-  int c2 : SC2;
+  int c1;
+  int c2;
 };
 
-struct D {
-  int d1;
+struct S {
+  int f1;
+  C   f2;
+  int f3;
 };
 
-struct E {
-  int e1 : EC;
-  int e2 : EC;
+void main(S s : A) {}
+// s.f1    : A0
+// s.f2.c1 : A1
+// s.f2.c2 : A2
+// s.f3    : A3
+```
+
+```hlsl
+struct C {
+  int c1;
+  int c2;
 };
 
-struct F {
-  int f1 : FC;
-  int f2 : FC;
+struct S {
+  int f1[2];
+  C   f2;
+  int f3;
 };
 
-[[shader("pixel")]]
-void main(float a : SA, B b : SB, C c, D d, E e, F f : SF) { }
+void main(S s : A) {}
+// s.f1[0]    : A0
+// s.f1[1]    : A1 (index taken by the second element of s.f1[])
+// s.f2.c1    : A2
+// s.f2.c2    : A3
+// s.f3       : A4
 ```
 
-In this example:
-- `a` is linked to the semantic `SA`.
-- `b.b1` and `b.b2` are linked to the semantic `SB` because `SB` shadows the
-  semantics attached to each field.
-- `c.c1` has the semantic `SC1`, and `c.c2` the semantic `SC2`.
-- `d.d1`, hence `d`, is illegal: no semantic is attached to `d.d1`.
-- `e.e1` and `e.e2` are invalid: `EC` usage is duplicated without being inherited.
-- `f.f1` and `f.f2` semantic is `SF`, shadowing the duplicated `FC` semantic.
+```hlsl
+void main(int a : A0, int b : A) {}
+// Illegal: Semantic A0 is used twice (A0 and A, implicit A0).
+```
 
-**Note**: HLSL forbids explicit **non-shadowed** semantic duplication. In this
-sample, the parameter `e` uses `E`, which explicitly declares two fields with
-the same semantic. This is illegal. \
-`b` has the semantic `SB` applied on the whole struct. Meaning all its fields
-share the same semantic `SB`. This is legal because the duplication comes
-from inheritance.
-Lastly, `f` explicitly duplicates the semantic `FC`. But because those are
-shadowed by the semantic `SF`, this is valid HLSL.
+```hlsl
+struct S {
+  int f1 : A0;
+};
+
+void main(S s, int b : A0) {}
+// Illegal: A0 is used twice, S.f1 and b.
+```
 
-**Note**: Implicit semantic duplication is allowed for user semantics, but
-always forbidden for system semantics.
+```hlsl
+struct S {
+  int f1 : A0;
+};
+
+void main(S s : A1, int b : A0) {}
+// s.f1 : A1, b : A0, this is legal.
+```
+
+```hlsl
+struct S {
+  int f1[2] : A0;
+};
+
+void main(S s, int b : A1) {}
+// Illegal: s.f1[] is an array of 2 elements. Semantic is A0, but A1 is taken
+// by the second element. Meaning there is a semantic overlap for A1.
+```
+
+```hlsl
+void main(float4 a : POSITION0, out float4 b : POSITION0);
+// a : POSITION0
+// b : POSITION0
+// Legal: The semantic appears only once among inputs, and once among outputs.
+```
 
 ### SPIR-V
 
 On the SPIR-V side, user semantics are translated into `Location`
-decorated `Input` variables. The `Location` decoration takes an index.
-System semantics are either translated to `Location` decorated `Input`
-variables, or `BuiltIn` decorated `Input` variables.
+decorated `Input` or `Output` variables. The `Location` decoration takes an
+index.
+System semantics are either translated to `Location` or `BuiltIn` decorated
+variables depending on the stage and semantic.
 
 In the example above, there are no system semantics, meaning every
 parameter would get a `Location` decorated variable associated.
 Each scalar field/parameter is associated with a unique index starting at 0,
 from the first parameter's field to the last parameter's field.
+Each scalar takes one index, and arrays of size N take N indices.
+The semantic index does not impact the Location assignment.
+Indices are independent between `Input` and `Output` semantics.
+
+Example:
 
-In the sample above:
+
+```hlsl
+struct S {
+  int f1[2] : A5;
+};
 
-- `a` would have the `Location 0`.
-- `b.b1` would have the `Location 1`.
-- `b.b2` would have the `Location 2`.
-- `c.c1` would have the `Location 3`.
-- ...
+void main(S s, out int c : A2, int b : A0, out int d : A0) {}
+// s.f1 : A5 -> Location 0
+// b    : A0 -> Location 2
+// c    : A2 -> Location 0
+// d    : A0 -> Location 1
+// Semantic index does not contribute, only the parameter ordering does.
+// Inputs and outputs are numbered independently.
+```
 
 It is also possible to explicitly set the index, using the
 `[[vk::location(/* Index */)]]` attribute. \
@@ -146,73 +265,94 @@ rules. The System vs User semantics handling is also divergent.
 This means we will be able to share the Sema checks, but will have to build
 two distinct paths during codegen.
 
-### Sema
+Also, the validity of the semantic attribute depends on its usage, meaning
+that, to facilitate code reuse, some validation will have to be deferred to
+CodeGen.
 
-Type check for semantics is currently handled in `SemaDeclAttr.cpp`. \
-Iteration is done on each declaration, and if a semantic attribute is present,
-the type is checked.
-Each declaration being handled independently, this method does not support
-inherited/shadowed semantic attributes.
+### Parser
 
-Sema checks are divided into three parts:
- - check for type compatibility between the variable and the semantic.
- - check for semantic duplication.
- - check for invalid system semantic usage depending on shader stage.
+Clang has a built-in mechanism for attribute parsing using a `.td` file.
+But this requires enumerating the list of known semantics, which is not
+possible for user semantics.
 
-Proposition is to remove the type validation from the `SemaDeclAttr` and
-move it later into SemaHLSL, with the other checks.
-Idea is we need to have the inheritance rules to check types, so we should
-avoid duplicating this logic in two places.
+The attribute `.td` file must be changed to allow syntax-free semantics
+to be parsed.
+**NOTE**: All Spellings will be removed on already checked-in semantics.
 
-The pseudo-code for this check should be:
+```
+class HLSLSemanticAttr : HLSLAnnotationAttr;
+
+def HLSLUnparsedSemantic : HLSLSemanticAttr {
+  let Spellings = [];
+  let Args = [];
+  let Subjects = SubjectList<[ParmVar, Field, Function]>;
+  let LangOpts = [HLSL];
+  let Documentation = [InternalOnly];
+}
+
+def HLSLUserSemantic : HLSLSemanticAttr {
+  let Spellings = [];
+  let Args = [DefaultIntArgument<"Location", 0>, DefaultBoolArgument<"Implicit", 0>];
+  let Subjects = SubjectList<[ParmVar, Field, Function]>;
+  let LangOpts = [HLSL];
+  let Documentation = [InternalOnly];
+}
+
+def HLSLSV_GroupThreadID: HLSLSemanticAttr {
+  let Spellings = [];
+```
 
-```cpp
-  void checkSemantic(std::unordered_set<HLSLAnnotationAttr> &UsedSemantics,
-                DeclaratorDecl *Decl,
-                HLSLAnnotationAttr *InheritedSemantic = nullptr) {
+During an attribute parsing, we first assign the `HLSLUnparsedSemantic` kind
+to any HLSL semantic-like notation.
+
+Then, when this attribute kind is parsed, we rely on Sema to emit the correct
+`HLSLUserSemanticAttr` or `HLSLSV_*Attr`, etc.
 
-    HLSLAnnotationAttr *Semantic = InheritedSemantic ? InheritedSemantic : Decl->get<HLSLAnnotationAttr>();
-    RecordDecl *RD = dyn_cast<RecordDecl>(Decl);
+Sema will do stateless checks like:
+  - Is this system semantic compatible with this shader stage?
+  - Is this system semantic compatible with the type it's associated with?
 
-    // Case 1: type is a scalar, and we have a semantic. End case.
-    if (Semantic && !RD) {
-      if (UsedSemantics.contains(Semantic) && !InheritedSemantic)
-        Fail("Explicit semantic duplication", Decl->getLocation())
+We must also consider `MY_SEMANTIC0` to be equal to `MY_semantic`.
+The solution is to modify the `ParseHLSLAnnotations` function to add a custom
+attribute parsing function.
 
-      UsedSemantics.insert(Semantic)
-      diagnoseSemanticType(Decl, Semantic);
-      diagnoseSemanticEnvironment(Decl, Context.ShaderEnv);
-      return;
-    }
+```cpp
+struct ParsedSemantic {
+  // The normalized name of the semantic without index.
+  StringRef Name;
+  // The index of the semantic. 0 if implicit.
+  unsigned Index;
+  // Was the index explicit in the name or not.
+  bool Implicit;
+};
 
-    // Case 2: type is scalar, but we have no semantic: error
-    if (!RD)
-      Fail("Missing semantic", Decl->getLocation());
+Parser::ParsedSemantic Parser::ParseHLSLSemantic();
+```
 
-    // Case 3: it's a struct. Simply recurse, optionnally inherit semantic.
-    if (RecordDecl *RD = dyn_cast<RecordDecl>(Decl)) {
-      for (FieldDecl *FD : Decl->asRecordDecl()->getFields())
-        checkSemantic(UsedSemantics, FD, Semantic);
-    }
-  }
+### Sema
 
-  ...
+Sema checks are stateless and done during parsing.
+The parser first emits an `HLSLUnparsedSemantic` attribute, which is passed
+down to Sema.
+Sema's goal is to:
+ - convert this class into a valid semantic class like `HLSLUserSemantic` or `HLSLSV_*`.
+ - check shader stage compatibility with the system semantic.
+ - check type compatibility with the system semantic.
 
-  std::unordered_set<clang::HLSLAnnotationAttr> UsedSemantics;
-  for (ParmVarDecl Decl : entrypoint->getParams()) {
-    checkSemantic(UsedSemantics, Decl, /* InheritedSemantic= */ nullptr);
-  }
-```
+At this stage, we have either `HLSLUserSemanticAttr` attributes, or known
+compatible `HLSLSV_*Attr` attributes.
+Any `HLSLUnparsedSemantic` that could not be converted will have raised a
+diagnostic saying `unknown HLSL system semantic X`.
 
-At this point, we are guaranteed to have only valid and unique semantics, as
-well as valid types.
+No further checking is done as we must wait for codegen to move forward.
 
 ### CodeGen
 
 DXIL and SPIR-V codegen will be very different, but the flattening/inheritance
 bit can be shared.
 
-The proposal is to provide a sorted list in `CGHLSLRuntime`:
+The proposal is to provide a sorted list of valid semantics in `CGHLSLRuntime`,
+and then we can have a per-backend implementation for index assignment.
 
 ```cpp
 struct SemanticIO {
@@ -231,6 +371,36 @@ struct SemanticIO {
 Vector<SemanticIO> InputSemantics;
 ```
 
+The pseudo code would be as follows
+
+```python
+
+  def emitEntryFunction():
+    # Current emitEntryFunction code in CGHLSLRuntime.cpp.
+    ...
+
+    semantics = {}
+
+    foreach (Parameter, ReturnValue) in FunctionDecl:
+      if item is output:
+        # output parameters.
+        var = createLocalVar(item)
+        semantics[item.semantic_name] = getPointerTo(var)
+      elif item is byval:
+        # Semantic structs passed as input.
+        var = createLocalVar(item)
+        loadSemanticRecursivelyToVariable(item, var)
+        semantics[item.semantic_name] = getPointerTo(var)
+      elif item is input:
+        semantics[item.semantic_name] = loadSemanticRecursively(item)
+
+
+    Args = [ semantics[x->semantic_name] for x in AllParameters ]
+    call = createCall(MainFunction, Args)
+    if not call->isVoid():
+      StoreOutputSemanticRecursively(call, semantics)
+```
+
 The proposal is to let each target implement the logic in CGHLSLRuntime to
 load the semantics, and set the `Value` field of the `SemanticIO` struct.
 The order of the vector represents the flattening order.
@@ -243,3 +413,10 @@ parameter/field (DFS), and each will point to an `llvm::Value`.
 
 The common code will then build the arguments list for the entrypoint call
 using the provided `llvm::Value`. This is shared across targets.
+
+The pseudo-code for this check should be:
+
+
+At this point, we are guaranteed to have only valid and unique semantics, as
+well as valid types.
+

From 2c92b36c04120523eae4c17641d901afde51dedc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nathan=20Gau=C3=ABr?= <brioche@google.com>
Date: Wed, 2 Jul 2025 15:15:43 +0200
Subject: [PATCH 3/9] rewrite proposal

---
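Reviewer note (text here is ignored by `git am`): the recursive walk behind
`loadSemanticRecursively` has to apply the inheritance and index rules from
the HLSL section. The sketch below illustrates one way to do that under
assumptions: `Decl`, `flatten` and `check_overlap` are hypothetical names, not
Clang APIs, semantic names are assumed to be already normalized, and the
running index is shared across fields (the pseudo-code in this patch copies
the applied semantic per field instead).

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class Decl:
    """Hypothetical stand-in for a parameter or struct field declaration."""
    name: str
    semantic: Optional[Tuple[str, int]] = None  # (name, index), e.g. ("A", 0)
    array_size: int = 1                          # 1 for non-array scalars
    fields: list = field(default_factory=list)   # non-empty for structs


def flatten(decl, applied=None, path=""):
    """Return (path, name, first_index, count) for each scalar or array."""
    full = f"{path}.{decl.name}" if path else decl.name
    # A semantic applied by an enclosing declaration shadows this one.
    if applied is None and decl.semantic is not None:
        applied = list(decl.semantic)  # mutable [name, next_index]
    if decl.fields:
        out = []
        for f in decl.fields:
            out.extend(flatten(f, applied, full))
        return out
    if applied is None:
        raise ValueError(f"semantic is required on {full}")
    name, first = applied
    applied[1] += decl.array_size  # each array element takes one index
    return [(full, name, first, decl.array_size)]


def check_overlap(entries):
    """Raise ValueError if the same (name, index) pair is used twice."""
    used = set()
    for path, name, first, count in entries:
        for i in range(first, first + count):
            if (name, i) in used:
                raise ValueError(f"semantic {name}{i} used twice (at {path})")
            used.add((name, i))
```

Inputs and outputs would each be flattened and checked separately, since the
same semantic may appear once on each side.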
 ...N-input-semantics.md => NNNN-semantics.md} | 193 ++++++++++++------
 1 file changed, 130 insertions(+), 63 deletions(-)
 rename proposals/{NNNN-input-semantics.md => NNNN-semantics.md} (65%)

diff --git a/proposals/NNNN-input-semantics.md b/proposals/NNNN-semantics.md
similarity index 65%
rename from proposals/NNNN-input-semantics.md
rename to proposals/NNNN-semantics.md
index 81d4dc8e..02190825 100644
--- a/proposals/NNNN-input-semantics.md
+++ b/proposals/NNNN-semantics.md
@@ -279,27 +279,41 @@ The attribute `.td` file must be changed to allow syntax-free semantics
 to be parsed.
 **NOTE**: All Spellings will be removed on already checked-in semantics.
 
-```
-class HLSLSemanticAttr : HLSLAnnotationAttr;
+```python
+class HLSLSemanticAttr<bit Indexable> : HLSLAnnotationAttr {
+  # For example, SV_GroupID cannot be indexed.
+  bit SemanticIndexable = Indexable;
+  # The index and whether it is explicit or not, e.g. `USER0` vs `USER`.
+  int SemanticIndex = 0;
+  bit SemanticExplicitIndex = 0;
 
-def HLSLUnparsedSemantic : HLSLSemanticAttr {
   let Spellings = [];
-  let Args = [];
   let Subjects = SubjectList<[ParmVar, Field, Function]>;
   let LangOpts = [HLSL];
-  let Documentation = [InternalOnly];
 }
 
-def HLSLUserSemantic : HLSLSemanticAttr {
+# This is used by the first parsing stage: all semantics are initially
+# set to HLSLUnparsedSemantic. Sema will then convert them to the final
+# form.
+def HLSLUnparsedSemantic : HLSLAnnotationAttr {
   let Spellings = [];
-  let Args = [DefaultIntArgument<"Location", 0>, DefaultBoolArgument<"Implicit", 0>];
+  let Args = [DefaultIntArgument<"Index", 0>,
+              DefaultBoolArgument<"ExplicitIndex", 0>];
   let Subjects = SubjectList<[ParmVar, Field, Function]>;
   let LangOpts = [HLSL];
   let Documentation = [InternalOnly];
 }
 
-def HLSLSV_GroupThreadID: HLSLSemanticAttr {
-  let Spellings = [];
+# User semantics will use this class.
+def HLSLUserSemantic : HLSLSemanticAttr</* Indexable= */ 1> {
+  let Documentation = [HLSLUserSemanticDocs];
+}
+
+# Known system semantics will have their own class with documentation.
+# Note: here indexable is set to false.
+def HLSLSV_GroupThreadID: HLSLSemanticAttr</* Indexable= */ 0> {
+  let Documentation = [HLSLSV_GroupThreadIDDocs];
+}
 ```
 
 During an attribute parsing, we first assign the `HLSLUnparsedSemantic` kind
@@ -311,22 +325,20 @@ When, when this attribute kind is parsed, we rely on Sema to emit the correct
 Sema will do stateless checks like:
   - Is this system semantic compatible with this shader stage?
   - Is this system semantic compatible with the type it's associated with?
+  - Is this system semantic indexable if an explicit index is used?
 
 We must also consider `MY_SEMANTIC0` to be equal to `MY_semantic`.
 The solution is to modify the `ParseHLSLAnnotations` function to add a custom
 attribute parsing function.
 
-```cpp
-struct ParsedSemantic {
-  // The normalized name of the semantic without index.
-  StringRef Name;
-  // The index of the semantic. 0 if implicit.
-  unsigned Index;
-  // Was the index explicit in the name or not.
-  bool Implicit;
-};
+The base class `HLSLSemanticAttr` will expose two methods. During parsing,
+the index is parsed from the name, and stored in each attribute.
+The `SemanticExplicitIndex` bit is useful for reflection and variable
+name regeneration.
 
-Parser::ParsedSemantic Parser::ParseHLSLSemantic();
+```cpp
+  bool isSemanticIndexable() const;
+  unsigned getSemanticIndex() const;
 ```
 
 ### Sema
@@ -351,27 +363,11 @@ No further checking is done as we must wait for codegen to move forward.
 DXIL and SPIR-V codegen will be very different, but the flattening/inheritance
 bit can be shared.
 
-The proposal is to provide a sorted list of valid semantics in `CGHLSLRuntime`,
-and then we can have a per-backend implementation for index assignment.
+The proposal is to have the whole semantic inheritance & validation shared,
+and at the very end allow each target to emit the BuiltIn/Location codegen.
 
-```cpp
-struct SemanticIO {
-  // The active semantic for this scalar/field.
-  HLSLAnnotationAttr *Semantic;
 
-  // Info about this field/scalar.
-  DeclaratorDecl *Decl;
-  llvm::Type *Type;
-
-  // The loaded value in the wrapper for this scalar/field. Each target
-  // must provide this value.
-  llvm::Value *Value = nullptr;
-};
-
-Vector<SemanticIO> InputSemantics;
-```
-
-The pseudo code would be as follows
+The pseudo code for the `emitEntryFunction` would be as follows:
 
 ```python
 
@@ -379,44 +375,115 @@ The pseudo code would be as follows
     # Current emitEntryFunction code in CGHLSLRuntime.cpp.
     ...
 
-    semantics = {}
+    args = []
+    outputs = []
 
-    foreach (Parameter, ReturnValue) in FunctionDecl:
-      if item is output:
-        # output parameters.
-        var = createLocalVar(item)
-        semantics[item.semantic_name] = getPointerTo(var)
-      elif item is byval:
+    foreach item in FunctionDecl.GetParamDecl():
+      if item is sret_output:
+        # Struct return values are using sret mechanism.
+        var = createLocalVar(item.type)
+        outputs.append({ item, var })
+      elif item is byval_input:
         # Semantic structs passed as input.
         var = createLocalVar(item)
-        loadSemanticRecursivelyToVariable(item, var)
-        semantics[item.semantic_name] = getPointerTo(var)
+        value = loadSemanticRecursively(item, var)
+        store(value, var)
+        args.append(var.getPointer())
       elif item is input:
-        semantics[item.semantic_name] = loadSemanticRecursively(item)
+        # Values passed by copy
+        args.append(loadSemanticRecursively(item))
 
+    call = createCall(MainFunction, args)
 
-    Args = [ semantics[x->semantic_name] for x in AllParameters ]
-    call = createCall(MainFunction, Args)
     if not call->isVoid():
-      StoreOutputSemanticRecursively(call, semantics)
+      outputs.append({ FunctionDecl, call })
+
+    for [decl, item] in outputs:
+      value = load(item)
+      storeSemanticRecursively(decl, value)
 ```
 
-The proposal is to let each target implement the logic in CGHLSLRuntime to
-load the semantics, and set the `Value` field of the `SemanticIO` struct.
-The order of the vector represents the flattening order.
+In this code, two main functions remain to be written:
+ - `storeSemanticRecursively`
+ - `loadSemanticRecursively`
+
+Both will be quite similar, since both follow the same semantic
+indexing/inheritance rules.
+
+Pseudo code would be:
+
+```python
+
+def loadSemanticRecursively(decl, appliedSemantic = None):
+  if decl->isStruct():
+    return loadSemanticStructRecursively(decl, appliedSemantic)
+  return loadSemanticScalarRecursively(decl, appliedSemantic)
+
+def loadSemanticStructRecursively(decl, &appliedSemantic):
+  # A semantic on the enclosing declaration takes precedence over the fields.
+  if appliedSemantic is None:
+    appliedSemantic = decl->getSemantic()
+
+  output = createEmptyStruct(decl->getType())
+  for field in decl->structDecl():
+    tmp = copy(appliedSemantic)
+    val = loadSemanticRecursively(field, tmp)
+    output.insert(val, field.index)
+  return output
 
-Providing the full list should allow DXIL to easily implement packing rules.
+def loadSemanticScalarRecursively(decl, &appliedSemantic):
+  if appliedSemantic is None:
+    appliedSemantic = decl->getSemantic()
 
-At this stage, `CGHLSLRuntime` will have an ordered list of `SemanticIO`
-structs. The order is from the first parameter/field to the last
-parameter/field (DFS), and each will point to an `llvm::Value`.
+  if appliedSemantic is None:
+    raise "semantic is required"
 
-The common code will then build the arguments list for the entrypoint call
-using the provided `llvm::Value`. This is shared across targets.
+  if appliedSemantic->isUserSemantic():
+    return emitUserSemanticLoad(decl, appliedSemantic)
+  return emitSystemSemanticLoad(decl, appliedSemantic)
 
-The pseudo-code for this check should be:
+def emitSystemSemanticLoad(decl, &appliedSemantic):
+  if appliedSemantic == SV_Position:
+    # For SPIR-V, for example, the logic is the same as for user semantics.
+    return emitUserSemanticLoad(decl, appliedSemantic)
 
+  # But compute semantics based on builtins are different.
+  if not this->ActiveInputSemantic.insert(appliedSemantic.Name):
+    raise "duplicate semantic"
 
-At this point, we are guaranteed to have only valid and unique semantics, as
-well as valid types.
+  if appliedSemantic == SV_GroupID:
+    return emitSystemSemanticLoad_TARGET(decl, appliedSemantic)
+
+  raise "Unknown system semantic"
+
+def emitUserSemanticLoad(decl, &appliedSemantic):
+  Length = decl->isArray() ? decl->getArraySize() : 1
+
+  # Mark each index as busy. Some system semantics also require this;
+  # the example above shows a compute semantic, which has no index.
+  for I in range(Length):
+    SemanticName = appliedSemantic.Name + str(appliedSemantic.Index)
+    if not this->ActiveInputSemantic.insert(SemanticName):
+      raise "Duplicate semantic index"
+    appliedSemantic.Index += 1
+
+  # For SPIR-V, emit a global with a Location ID.
+  return emitUserSemanticLoad_TARGET(decl, appliedSemantic)
+
+def emitSystemSemanticLoad_TARGET(decl, &appliedSemantic):
+  # Each target will emit the required code. This lives in CGHLSLRuntime,
+  # meaning we can have state to determine packing rules, etc.
+
+```
+
+The proposal is to let each target implement the logic in CGHLSLRuntime to
+load the semantics after all checks have passed. Each target-specific load is
+expected to return a single value holding the loaded input semantic.
+Stores follow the same scheme: the target-specific code takes an `llvm::Value`
+holding a non-aggregate value and stores it to a semantic.
+Index collisions, semantic inheritance, and invalid system semantics are
+handled by the shared code.
+
+A demo branch can be found here:
+https://github.com/Keenuts/llvm-project/tree/hlsl-semantics
 

From eda3e0f552b5449422aad1c4fb1aaabf1daa8544 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nathan=20Gau=C3=ABr?= <brioche@google.com>
Date: Fri, 18 Jul 2025 14:02:09 +0200
Subject: [PATCH 4/9] pr-feedback

---
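Note: the naming rules added by this patch (case-insensitive name, optional
trailing index defaulting to `0`, `SV_` prefix marking system semantics) can be
sketched as follows. This is a minimal illustration only; `parse_semantic` and
its return shape are hypothetical and are not part of Clang or of the proposal.

```python
import re

# Hypothetical helper: splits a semantic attribute into (name, index, is_system).
# The trailing digits, if any, form the starting index; the rest is the name.
_SEMANTIC_RE = re.compile(r"([A-Za-z_][A-Za-z0-9_]*?)(\d*)")

def parse_semantic(text):
    match = _SEMANTIC_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid semantic: {text!r}")
    name = match.group(1).upper()          # names are case-insensitive
    index = int(match.group(2) or "0")     # no number means index 0
    return name, index, name.startswith("SV_")

for attr in ["SEMANTIC0", "Semantic", "semANtIC12", "SV_Position", "SV_POSITION2"]:
    print(attr, parse_semantic(attr))
```

The five inputs are the examples listed in the proposal text.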
 proposals/NNNN-semantics.md | 93 ++++++++++++++++++++++++-------------
 1 file changed, 60 insertions(+), 33 deletions(-)
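
Note: the "row" rules added by this patch (no packing, depth-first index
assignment, N x M rows for arrays) can be sketched as follows. The type model
(`Vector`, `Array`, `Struct`) and both functions are hypothetical, and the
sketch assumes the row count of a scalar or vector is its bit size divided by
the 4 x 32-bit row capacity, rounded up. Matrices are left out.

```python
import math
from dataclasses import dataclass

ROW_BITS = 4 * 32  # one row holds at most a 4-component 32-bit value

@dataclass
class Vector:          # a scalar is a Vector with count == 1
    bits: int
    count: int = 1

@dataclass
class Array:
    element: object
    size: int

@dataclass
class Struct:
    fields: list

def row_count(ty):
    """Number of rows (semantic indices) a type occupies."""
    if isinstance(ty, Vector):
        return math.ceil(ty.bits * ty.count / ROW_BITS)
    if isinstance(ty, Array):
        return ty.size * row_count(ty.element)
    return sum(row_count(f) for f in ty.fields)

def assign_rows(ty, start, path):
    """Return ([(leaf path, [indices])], next free index), depth-first."""
    if isinstance(ty, Vector):
        rows = row_count(ty)
        return [(path, list(range(start, start + rows)))], start + rows
    if isinstance(ty, Array):
        children = [(ty.element, f"{path}[{i}]") for i in range(ty.size)]
    else:
        children = [(f, f"{path}.f{i}") for i, f in enumerate(ty.fields)]
    leaves = []
    for child, child_path in children:
        sub, start = assign_rows(child, start, child_path)
        leaves.extend(sub)
    return leaves, start

# `double3 arr[2] : MY_SEM1` from the proposal text.
print(assign_rows(Array(Vector(64, 3), 2), 1, "arr"))
```

Under these assumptions, `arr[0]` gets indices `1` and `2`, and `arr[1]` gets
`3` and `4`, as in the example in the proposal text.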

diff --git a/proposals/NNNN-semantics.md b/proposals/NNNN-semantics.md
index 02190825..a126d462 100644
--- a/proposals/NNNN-semantics.md
+++ b/proposals/NNNN-semantics.md
@@ -23,58 +23,85 @@ System semantics are linked to specific parts of the pipeline, while
 user semantics are just a way for the user to link the output of a stage
 to the input of another stage.
 
-
 Shader semantic attributes can be applied to:
   - a function parameter
   - function return value
   - a struct field's declaration
 
-They only carry meaning when used on:
+The full semantic as used by the pipeline is composed of a case-insensitive
+name and an index.
+When assigning a semantic attribute, a number may be appended to the name to
+indicate the starting index for the semantic assignment. If no number is
+specified, the starting index is assumed to be `0`.
+Any semantic starting with the `SV_` prefix is considered to be a system
+semantic.
+
+Examples:
+ - `SEMANTIC0`, Name = SEMANTIC, Index = 0, user semantic
+ - `Semantic`, Name = SEMANTIC, Index = 0, user semantic
+ - `semANtIC12`, Name = SEMANTIC, Index = 12, user semantic
+ - `SV_Position`, Name = SV_POSITION, Index = 0, system semantic
+ - `SV_POSITION2`, Name = SV_POSITION, Index = 2, system semantic
+
+The HLSL language does not restrict the indices, except that each must be a
+non-negative integer. Target APIs can apply additional restrictions.
+The same semantic (name+index) must only appear once on the inputs, and once
+on the outputs.
+
+Semantic attributes only carry meaning when used on:
   - an entry point parameter
   - entrypoint return value
-  - a struct type used by an entry point parameter or return value.
+  - fields of a struct used by an entry point parameter or return value.
 All other uses are simply ignored.
 
-When a semantic is applied to both a parameter/return value, and its type,
-the parameter/return value semantic applies, and the type's semantics are
-ignored.
-
-Each scalar in the entrypoint must have a semantic attribute attached, either
-directly or inherited from the parent struct type.
+A semantic attribute applied to a field, parameter, or function declaration
+will override all inner semantics on any fields contained in that
+declaration's type.
 
-When a semantic applies to an array type, each element of the array
-is considered to take one index space in the semantic.
-When a semantic applies to a struct, each scalar field (recursively) takes
-one index in the semantic.
+For entry functions, every parameter and non-void return value must have an
+assigned semantic. This semantic must come from either:
+ - a semantic attribute on the parameter
+ - a semantic attribute on all structure fields in the type's declaration.
+ - for return values, a semantic attribute on the function
 
 Shader semantics on function return value are output semantics.
 When applying to an entrypoint parameter, the `in`/`out`/`inout` qualifiers
 will determine if this is an input or output semantic.
 
-Each semantic is *usually* composed of two elements:
- - a case insensitive name
- - an index
-
-Any semantic starting with the `SV_` prefix is considered to be a system
-semantic. Some system semantics do not have indices, while other have. All
-user semantics have an index (implicit or explicit).
-
-The index can be either implicit (equal to 0), or explicit:
+Each stage has a fixed list of allowed system semantics for its
+inputs or outputs. Whether user semantics are allowed as input or output
+semantics depends on the shader stage being targeted.
 
- - `Semantic`, Name = SEMANTIC, Index = 0
- - `SEMANTIC0`, Name = SEMANTIC, Index = 0
- - `semANtIC12`, Name = SEMANTIC, Index = 12
+The semantic index corresponds to a storage "row" in the pipeline.
+Each "row" can store at most a 4-component 32-bit numeric value.
+This implies a struct with multiple `float4` fields will be stored over
+multiple "rows", and hence will span multiple semantic indices.
 
+Example:
+ - `float   s : MY_SEMANTIC` takes one row, implicitly set to `0`.
+ - `float   s : MY_SEMANTIC0` takes one row, explicitly set to `0`.
+ - `float4  s : MY_SEMANTIC0` takes one row, explicitly set to `0`.
+ - `double4 s : MY_SEMANTIC0` takes two rows, `0` and `1`.
+
+- Struct fields, arrays and matrices may require more than one "row" depending
+  on their dimensions.
+- Each array element, struct field, or matrix row starts on a new "row",
+  there is no packing.
+- Indices are assigned from the first element/field to the last, recursively
+  in a depth-first order.
+- An array of size N with elements taking M rows will take a total of
+  N x M rows.
+- Each semantic+index pair can appear once on the inputs, and once on the
+  outputs.
 
-The HLSL language does not impose restriction on the indices except it has to
-be a positive integer or zero. Target APIs can apply additional restrictions.
-The same semantic (name+index) must only appear once on the inputs, and once
-on the outputs.
+Example:
+ - `float arr[2] : MY_SEM` takes 2 rows, `0` and `1`.
+   Even if a row could store multiple floats, each array element starts on a
+   new row.
 
-Each stage has a fixed list of allowed system semantics for its
-inputs or outputs. If user semantics are allowed as input or output semantics
-depends on the shader stage being
-targeted.
+ - `double3 arr[2] : MY_SEM1` takes 4 rows, `1`, `2`, `3` and `4`.
+   Each `double3` requires 2 rows, and thus the array requires 4 rows.
+   `arr[0]` will take `1` and `2`, and `arr[1]` will take `3` and `4`.
 
 Examples:
 

From f80869edd38fc4dcad45261802ecb8b47f3b808e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nathan=20Gau=C3=ABr?= <brioche@google.com>
Date: Fri, 18 Jul 2025 14:03:28 +0200
Subject: [PATCH 5/9] pr-feedback

---
 proposals/NNNN-semantics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/NNNN-semantics.md b/proposals/NNNN-semantics.md
index a126d462..d45f3c23 100644
--- a/proposals/NNNN-semantics.md
+++ b/proposals/NNNN-semantics.md
@@ -50,7 +50,7 @@ on the outputs.
 
 Semantic attributes only carry meaning when used on:
   - an entry point parameter
-  - entrypoint return value
+  - an entry point function or its return value
   - fields of a struct used by an entry point parameter or return value.
 All other uses are simply ignored.
 

From 1bd357f29b4e42956335fb5985ef7bdf5e8679a7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nathan=20Gau=C3=ABr?= <brioche@google.com>
Date: Wed, 30 Jul 2025 16:09:12 +0200
Subject: [PATCH 6/9] add mention of a warning on semantic override

---
 proposals/NNNN-semantics.md | 2 ++
 1 file changed, 2 insertions(+)
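
Note: the override rule and its warning can be sketched as follows. This is a
minimal model, not the Clang implementation; `Decl`, `resolve`, and the
warning wording are hypothetical, and index assignment is left out.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Decl:                      # hypothetical stand-in for a parameter or field
    name: str
    semantic: Optional[str] = None
    fields: list = field(default_factory=list)

def resolve(decl, inherited=None, warnings=None):
    """Return ([(leaf name, active semantic)], warnings) for a declaration."""
    warnings = [] if warnings is None else warnings
    if inherited is not None and decl.semantic is not None:
        # An enclosing declaration's semantic wins over the inner one.
        warnings.append(
            f"semantic '{decl.semantic}' on '{decl.name}' overridden by '{inherited}'")
    active = inherited if inherited is not None else decl.semantic
    if not decl.fields:
        if active is None:
            raise ValueError(f"'{decl.name}' has no semantic")
        return [(decl.name, active)], warnings
    leaves = []
    for inner in decl.fields:
        sub, _ = resolve(inner, active, warnings)
        leaves.extend(sub)
    return leaves, warnings

param = Decl("input", "OUTER", [Decl("a", "A"), Decl("b")])
print(resolve(param))
```

In this model, field `a` ends up with `OUTER` and one warning is recorded for
the overridden `A`; field `b` inherits `OUTER` silently.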

diff --git a/proposals/NNNN-semantics.md b/proposals/NNNN-semantics.md
index d45f3c23..e797eb27 100644
--- a/proposals/NNNN-semantics.md
+++ b/proposals/NNNN-semantics.md
@@ -57,6 +57,8 @@ All other uses are simply ignored.
 A semantic attribute applied to a field, parameter, or function declaration
 will override all inner semantics on any fields contained in that
 declaration's type.
+When a semantic is overridden, the compiler shall emit a warning stating
+which semantic was overridden by the enclosing type or the function.
 
 For entry functions, every parameter and non-void return value must have an
 assigned semantic. This semantic must come from either:

From 0e1fafad655933119af0232a2fea92b1b1ccb801 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nathan=20Gau=C3=ABr?= <brioche@google.com>
Date: Thu, 31 Jul 2025 13:49:26 +0200
Subject: [PATCH 7/9] assign number to proposal

---
 proposals/{NNNN-semantics.md => 0028-semantics.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename proposals/{NNNN-semantics.md => 0028-semantics.md} (99%)

diff --git a/proposals/NNNN-semantics.md b/proposals/0028-semantics.md
similarity index 99%
rename from proposals/NNNN-semantics.md
rename to proposals/0028-semantics.md
index e797eb27..56ec60ce 100644
--- a/proposals/NNNN-semantics.md
+++ b/proposals/0028-semantics.md
@@ -1,6 +1,6 @@
 # HLSL shader semantics
 
-* Proposal: [NNNN](http://NNNN-input-semantics.md)
+* Proposal: [0028](http://0028-input-semantics.md)
 * Author(s): [Nathan Gauër](https://github.com/Keenuts)
 * Status: **Design In Progress**
 

From fb5002a14c3cc47e6d6b5abe209301ee9643fd1e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nathan=20Gau=C3=ABr?= <brioche@google.com>
Date: Thu, 31 Jul 2025 13:50:52 +0200
Subject: [PATCH 8/9] change allocated number

---
 proposals/{0028-semantics.md => 0031-semantics.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename proposals/{0028-semantics.md => 0031-semantics.md} (99%)

diff --git a/proposals/0028-semantics.md b/proposals/0031-semantics.md
similarity index 99%
rename from proposals/0028-semantics.md
rename to proposals/0031-semantics.md
index 56ec60ce..fe46e317 100644
--- a/proposals/0028-semantics.md
+++ b/proposals/0031-semantics.md
@@ -1,6 +1,6 @@
 # HLSL shader semantics
 
-* Proposal: [0028](http://0028-input-semantics.md)
+* Proposal: [0031](http://0031-semantics.md)
 * Author(s): [Nathan Gauër](https://github.com/Keenuts)
 * Status: **Design In Progress**
 

From a0e951357fb51306ebba6e64329ffd913bbfdf3d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nathan=20Gau=C3=ABr?= <brioche@google.com>
Date: Thu, 31 Jul 2025 13:52:26 +0200
Subject: [PATCH 9/9] fix url

---
 proposals/0031-semantics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/0031-semantics.md b/proposals/0031-semantics.md
index fe46e317..82e66979 100644
--- a/proposals/0031-semantics.md
+++ b/proposals/0031-semantics.md
@@ -1,6 +1,6 @@
 # HLSL shader semantics
 
-* Proposal: [0031](http://0031-semantics.md)
+* Proposal: [0031](0031-semantics.md)
 * Author(s): [Nathan Gauër](https://github.com/Keenuts)
 * Status: **Design In Progress**