Conversation
kshyatt
left a comment
There was a problem hiding this comment.
Apart from minor comment LGTM, this will be very helpful for the Enzyme PR
|
Your PR no longer requires formatting changes. Thank you for your contribution! |
|
Ope, GPU failures look related... |
|
Needs #390 first, but should then be ready. |
Codecov Report✅ All modified and coverable lines are covered by tests.
... and 2 files with indirect coverage changes 🚀 New features to boost your workflow:
|
ab125b5 to
e3ce090
Compare
| return (Vtr, Vℤ₃, VU₁, VfU₁, VCU₁, VSU₂, VfSU₂, VIB_diag, VIB_M) | ||
| end | ||
|
|
||
| # Gauge-fixing tangents for AD factorization tests |
There was a problem hiding this comment.
Clearly we should have these functions in the main MAK module?
There was a problem hiding this comment.
I think this is in line with the current MAK approach. I'm happy to migrate this, but probably we should then also do this for MAK itself so we can reuse the functions.
|
This generally looks good; I left a few small comments and questions. But clearly, this is too much change for a detailed review. Is there a convenient way to review such code reorganization, i.e. to separate between what has just moved to other files and what are actual changes. I could probably ask some agent, but I don't feel like doing that. |
|
I don't think there is, but I did try and not actually alter anything except for the organization of the tests.
In principle there is no reason to review the actual contents of the test files, since these are unchanged, which also explains some of the things you commented on. |
|
Ok, I think I made the changes that I wanted here. I also synchronized the CUDA tests to be more in line with the CPU tensor and factorization tests in terms of tensors (with a little help from my friends However, without being able to run these locally, nor seeing the detailed progress report, this is very hard to debug. Do you know how to get more details out of logs of the aborted buildkite runs @kshyatt ? Or do you want to take a look. If we do not want to special case the multifusion cases, the only tensors that are valid are: "square" tensors (same domain or codomain), where the domain is either or "rectangular" tensors which are some "cyclic" permutations (i.e. transpositions) of So things like |
|
Will definitely take a look though it may have to wait until after RVV ;)
…On Sun, Apr 5, 2026 at 1:32 PM Jutho ***@***.***> wrote:
*Jutho* left a comment (QuantumKitHub/TensorKit.jl#389)
<#389 (comment)>
Ok, I think I made the changes that I wanted here. I also synchronized the
CUDA tests to be more in line with the CPU tensor and factorization tests
in terms of tensors (with a little help from my friends [image:
|
|
|
||
| # Sector utility | ||
| # -------------- | ||
| smallset(::Type{I}) where {I <: Sector} = take(values(I), 5) |
There was a problem hiding this comment.
Do we want to import the TensorKitSectors testsuite to have the same helper functions, or just manually carry them over here? Either way, I think it's useful to be using the same smallsets.
|
I was able to run the CUDA tests locally, I'll try to fix them tomorrow. Here are the outputs: Looks like the new A_4 tests are absolutely wrecking everything on time... |
|
Additional problem is the |
Oh I removed that one because it wasn't working correctly. You want to ensure that, if you use The old @test abs(dim(domain(d)) - nvals) ≤ maximum(c -> blockdim(domain(t), c), blocksectors(t); init = 1)I thought I replaced them all, but apparently not. |
|
A4Irrep is added to have |
If that's the case, then would it make sense to try to get #393 in before this? Or not support the Also, I am somewhat dubious of the mixed cpu-gpu explanation as that should rather throw a There are only 4 cases in the norm preservation situation, let me see if I can narrow down where the problem is. |
|
@Jutho some timing is showing that, for the norm preservation case, the vast vast majority of time is spent in the tensor product step. Since this is the first time the @timedtestset "Tensor product: test via norm preservation" begin
for T in (ComplexF64, ) # Float32 case broken because of cuTENSOR
@time "Construction" begin
if UnitStyle(I) isa SimpleUnit || !isempty(blocksectors(V2 ⊗ V1))
t1 = CUDA.rand(T, V2 ⊗ V3 ⊗ V1, V1 ⊗ V2)
t2 = CUDA.rand(T, V2 ⊗ V1 ⊗ V3, V1 ⊗ V1)
else
t1 = CUDA.rand(T, V3 ⊗ V4 ⊗ V5, (V1 ⊗ V2)')
t2 = CUDA.rand(T, (V3 ⊗ V4 ⊗ V5)', V1 ⊗ V2)
end
end
@time "Product" begin
t = @constinferred (t1 ⊗ t2)
end
@time "Norm" begin
@test norm(t) ≈ norm(t1) * norm(t2)
end
end
end
symmetricbraiding && @timedtestset "Tensor product: test via conversion" begin
for T in (Float32, ComplexF64)
t1 = CUDA.rand(T, V2 ⊗ V3 ⊗ V1, V1)
t2 = CUDA.rand(T, V2 ⊗ V1 ⊗ V3, V2)
d1 = dim(codomain(t1))
d2 = dim(codomain(t2))
d3 = dim(domain(t1))
d4 = dim(domain(t2))
t = @constinferred (t1 ⊗ t2)
At = ad(t)
@test ad(t) ≈ ad(t1) ⊗ ad(t2)
end
end For the top case, we have |
This is something I noticed in the codecov PR, and I tried fixing with something like this. The spaces have shifted since, but it was something I saw locally already, and probably carries over to the GPU tests. |
|
Indeed cutting the spaces a bit helped tremendously, however we're still being hit hard by the permutation tests which I will now look at. More and more I think getting the changes in #393 will help, if anyone wants to review it 🥺 |
|
Not sure what happened with this: if UnitStyle(I) isa SimpleUnit || !isempty(blocksectors(V2 ⊗ V1))
t1 = CUDA.rand(T, V2 ⊗ V3 ⊗ V1, V1 ⊗ V2)
t2 = CUDA.rand(T, V2 ⊗ V1 ⊗ V3, V1 ⊗ V1)
else
t1 = CUDA.rand(T, V3 ⊗ V4 ⊗ V5, (V1 ⊗ V2)')
t2 = CUDA.rand(T, (V3 ⊗ V4 ⊗ V5)', V1 ⊗ V2)
end If this is something I have inserted, it might have been an oversight from something copilot generated. I was aiming to avoid all special casing, and so the only universally valid case is the |
|
Copilot flew the tests directly into the side of the mountain! 😭 |
|
Oh wait, I now see the same kind of constructions exist in the CPU tests. I did try to carefully check all copilot changes, so the reason I do not remember this is probably because it was like this from the before. I mainly focussed on the factorization tests, as all the tensor tests seemed fine. |
|
I'll push a commit with some smaller tests. I did not run them locally, so let's hope for the best. |
|
Buildkite now succeeding!!! |
| @tensor t′[1 2 3; 4 5] := t1[1; 4] * t2[2 3; 5] | ||
| CUDA.@allowscalar begin | ||
| @tensor t′[1 2 3; 4 5] := t1[1; 4] * t2[2 3; 5] | ||
| end |
There was a problem hiding this comment.
Why does this need an allowscalar? (just to understand what is still missing).
There was a problem hiding this comment.
It's missing the changes in the BraidingTensor PR -- a BraidingTensor is arising in this contraction for the Irrep[CU₁] case. Once we get the BraidingTensor changes in this can be removed
That is great news. The chainrules test still take quite long and time out on Windows. The tensor contraction AD test is repeated 5 times over with random contraction patterns, so that cost can easily be cut down by lowering the number of repetitions. Also thanks to @borisdevos ; your commits contain useful improvements. |
|
This is the output of a local run of chainrules/tensoroperations.jl : ---------------------------------------
Auto-diff with symmetry: Trivial
---------------------------------------
Test Summary: | Pass Total Time
ChainRules for tensor operations with symmetry Trivial | 1434 1434 1m30.8s
scalartype Float64 | 717 717 45.9s
scalartype ComplexF64 | 717 717 44.9s
---------------------------------------
Auto-diff with symmetry: Irrep[ℤ₂]
---------------------------------------
Test Summary: | Pass Total Time
ChainRules for tensor operations with symmetry Irrep[ℤ₂] | 1582 1582 1m26.8s
scalartype Float64 | 811 811 44.9s
scalartype ComplexF64 | 771 771 41.9s
---------------------------------------
Auto-diff with symmetry: Irrep[CU₁]
---------------------------------------
Test Summary: | Pass Total Time
ChainRules for tensor operations with symmetry Irrep[CU₁] | 1746 1746 31m46.5s
scalartype Float64 | 885 885 11m54.9s
scalartype ComplexF64 | 861 861 19m51.5s
---------------------------------------
Auto-diff with symmetry: (FermionParity ⊠ Irrep[SU₂] ⊠ Irrep[U₁])
---------------------------------------
Test Summary: | Pass Total Time
ChainRules for tensor operations with symmetry (FermionParity ⊠ Irrep[SU₂] ⊠ Irrep[U₁]) | 1806 1806 202m42.4s
scalartype Float64 | 905 905 2m55.2s
scalartype ComplexF64 | 901 901 199m47.2sClearly the last two symmetry types are the culprit, but I would need to add more detailed timing statements to know the true origin. I checked that tensors with spaces which are related to What is very strange is that for |
Summary
test/Project.toml, separating them from the main package manifest--fastmode that skips AD test groups and reduces sector/scalar-type coverage for quick iterationtest/README.mddocumenting how to run tests, available groups, fast mode, and how to add new test files