Stricter clippy by ur4t · Pull Request #19 · NVlabs/cutile-rs

ur4t · 2026-03-17T17:08:24Z

No description provided.

copy-pr-bot · 2026-03-17T17:08:28Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

elibol · 2026-03-18T23:53:34Z

@ur4t I think this is probably good to go, but I'm going to review this once we have clippy up and running as part of the CI. Thanks for the contribution!

ur4t · 2026-03-19T04:49:56Z

@elibol Understood. And I will rebase once CI clippy is merged.

elibol · 2026-03-21T21:03:04Z

The first merge removes nightly dependency -- it was quite simple to resolve conflicts. The second merge was not -- I accepted all of your changes (they seemed reasonable).

Please make sure all tests are actually passing by running bash ./scripts/run_all_tests.sh. We're still in the process of getting CI to execute GPU code.

elibol · 2026-03-21T21:03:24Z

/ok to test 7dcca41

copy-pr-bot · 2026-03-21T21:03:29Z

/ok to test 7dcca41

@elibol, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

elibol · 2026-03-21T21:04:46Z

/ok to test b7383a4

elibol · 2026-03-22T03:32:11Z

@ur4t I had a few other higher priority items I needed to get in place and unfortunately it broke your PR.

If you want to continue working on this, I would revert the last merge and resolve conflicts a bit more carefully. I'll prioritize merging this if the checks are passing. I'm also happy to take over if you prefer.

ur4t · 2026-03-22T15:39:43Z

I had a few other higher priority items I needed to get in place and unfortunately it broke your PR.

Never mind, it is not that painful with less reformatting.

I'm also happy to take over if you prefer.

Thanks. This PR is a bit oversized. I try to keep each commit clear and focused to ease reviewing, though some commits are not split ideally.

Please make sure all tests are actually passing by running bash ./scripts/run_all_tests.sh.

It seems that I introduced some breakages, I will locate and fix them soon.

Besides, I have find some benchmarks such as FMHA are broken on my machine even in main branch.

ur4t · 2026-03-23T12:27:22Z

@elibol Now this PR is ready to go. All tests, examples and benchmarks work as expected.

Besides, I have find some benchmarks such as FMHA are broken on my machine even in main branch.

load_from_view_latency emits optimization_hints with latency specified, which is not officially supported on all architectures yet. Using load_from_view instead or patched cuda-tile, both work well.

elibol · 2026-03-23T18:51:23Z

@ur4t can you file a bug for the FMHA issue? I'll update our docs on how to file issues but I think for now including the following information would help:

Your OS.
Your GPU and GPU architecture.
The error you're seeing.

Edit: We should fix it in a separate PR. This PR will be easier to review if most changes are just formatting / semantics preserving changes.

elibol · 2026-03-23T20:18:57Z

/ok to test fc3f472

ur4t · 2026-03-24T03:11:42Z

@elibol I did not notice that CI can not run tests currently. It is fixed now.

can you file a bug for the FMHA issue?

Personally I do not think it is cutile-rs's bug, and I find that cuda-tile has just release 13.2.0 which should fix this issue perfectly in theory. However cuda-tile prefers latest llvm/mlir and melior/mlir-sys do not.

elibol · 2026-03-24T03:22:45Z

I did not notice that CI can not run tests currently. It is fixed now.

Sounds good. Thanks again for working on this. I'll review your PR more closely tomorrow morning.

Personally I do not think it is cutile-rs's bug, and I find that cuda-tile has just release 13.2.0 which should fix this issue perfectly in theory. However cuda-tile prefers latest llvm/mlir and melior/mlir-sys do not.

This is something we're trying to figure out right now. We might need to fork melior to make the necessary changes. One of our primary goals is a smooth install experience, and that's just not possible with the split in LLVM dependencies between cuda-tile and melior. I'm hoping both are happy with LLVM-22 so we can stick to that for a bit while we sort this out. We have a longer term solution, but it's a bit more invasive. I don't have the time to do it right now.

ur4t · 2026-03-24T06:13:58Z

I'm hoping both are happy with LLVM-22 so we can stick to that for a bit while we sort this out.

mlir-sys has moved to LLVM-22 and melior is still going.

cuda-tile 13.2.0 contains a type renaming which is not even included in LLVM-23 branch yet. That is easy to add porting patch, but we do not know whether cuda-tile team will accept such compatibility maintenance burden.

elibol · 2026-03-24T17:38:35Z

@ur4t I'm making a lot of changes to cuda-async internally. Can we create smaller PRs which introduce the clippy fixes, one for each crate? That will make the process a bit more manageable on my end.

In fact, I would hold off on clippy for cuda-async until we get the changes for that crate in.

ur4t · 2026-03-25T03:44:19Z

@elibol cuda-async done in #28, other crates on the way.

ur4t · 2026-03-26T10:33:19Z

@elibol cutile-macro in #46, cutile-compiler in #47, cutile in #48, examples and benchmarks in #49, further readability change excluded. I will split other modifications into standalone PRs and leave this PR(#19) to enable stricter clippy in CI.

elibol · 2026-03-31T19:08:36Z

Sounds good. I'm reviewing and merging them as I have time.

ur4t force-pushed the cleanup branch from 1a611a2 to 7d749b1 Compare March 18, 2026 02:33

ur4t mentioned this pull request Mar 21, 2026

Remove dependence on nightly. #25

Merged

elibol mentioned this pull request Mar 21, 2026

Add fmt and clippy checks. #22

Merged

ur4t force-pushed the cleanup branch from b7383a4 to debc4d7 Compare March 22, 2026 14:50

ur4t force-pushed the cleanup branch 2 times, most recently from 606bf6c to fc3f472 Compare March 23, 2026 12:06

ur4t force-pushed the cleanup branch from fc3f472 to 956606b Compare March 24, 2026 02:22

ur4t force-pushed the cleanup branch from ce3daa3 to 3ef6e32 Compare March 26, 2026 10:35

ur4t changed the title ~~Cleanup and improve readability~~ Stricter clippy Mar 26, 2026

ur4t force-pushed the cleanup branch 2 times, most recently from 3f5999f to dc64ceb Compare April 22, 2026 02:55

jollylili added the P2 label Apr 24, 2026

ur4t added 5 commits April 27, 2026 08:01

Cleanup update cuda-core and cuda-async

456aefc

Cleanup updated cutile

8e50650

Cleanup updated cutile-macro

631054e

Cleanup cutile-ir

9b35b97

Update workflows

79b4f11

ur4t force-pushed the cleanup branch from dc64ceb to 79b4f11 Compare April 28, 2026 02:43

Conversation

ur4t commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

copy-pr-bot Bot commented Mar 17, 2026

Uh oh!

elibol commented Mar 18, 2026

Uh oh!

ur4t commented Mar 19, 2026

Uh oh!

elibol commented Mar 21, 2026

Uh oh!

elibol commented Mar 21, 2026

Uh oh!

copy-pr-bot Bot commented Mar 21, 2026

Uh oh!

elibol commented Mar 21, 2026

Uh oh!

elibol commented Mar 22, 2026

Uh oh!

ur4t commented Mar 22, 2026

Uh oh!

ur4t commented Mar 23, 2026

Uh oh!

elibol commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

elibol commented Mar 23, 2026

Uh oh!

ur4t commented Mar 24, 2026

Uh oh!

elibol commented Mar 24, 2026

Uh oh!

ur4t commented Mar 24, 2026

Uh oh!

elibol commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ur4t commented Mar 25, 2026

Uh oh!

ur4t commented Mar 26, 2026

Uh oh!

elibol commented Mar 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ur4t commented Mar 17, 2026 •

edited

Loading

elibol commented Mar 23, 2026 •

edited

Loading

elibol commented Mar 24, 2026 •

edited

Loading