Skip to content

Stricter clippy#19

Open
ur4t wants to merge 5 commits intoNVlabs:mainfrom
ur4t:cleanup
Open

Stricter clippy#19
ur4t wants to merge 5 commits intoNVlabs:mainfrom
ur4t:cleanup

Conversation

@ur4t
Copy link
Copy Markdown
Contributor

@ur4t ur4t commented Mar 17, 2026

No description provided.

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Mar 17, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 18, 2026

@ur4t I think this is probably good to go, but I'm going to review this once we have clippy up and running as part of the CI. Thanks for the contribution!

@ur4t
Copy link
Copy Markdown
Contributor Author

ur4t commented Mar 19, 2026

@elibol Understood. And I will rebase once CI clippy is merged.

@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 21, 2026

The first merge removes nightly dependency -- it was quite simple to resolve conflicts. The second merge was not -- I accepted all of your changes (they seemed reasonable).

Please make sure all tests are actually passing by running bash ./scripts/run_all_tests.sh. We're still in the process of getting CI to execute GPU code.

@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 21, 2026

/ok to test 7dcca41

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Mar 21, 2026

/ok to test 7dcca41

@elibol, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 21, 2026

/ok to test b7383a4

@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 22, 2026

@ur4t I had a few other higher priority items I needed to get in place and unfortunately it broke your PR.

If you want to continue working on this, I would revert the last merge and resolve conflicts a bit more carefully. I'll prioritize merging this if the checks are passing. I'm also happy to take over if you prefer.

@ur4t
Copy link
Copy Markdown
Contributor Author

ur4t commented Mar 22, 2026

I had a few other higher priority items I needed to get in place and unfortunately it broke your PR.

Never mind, it is not that painful with less reformatting.

I'm also happy to take over if you prefer.

Thanks. This PR is a bit oversized. I try to keep each commit clear and focused to ease reviewing, though some commits are not split ideally.

Please make sure all tests are actually passing by running bash ./scripts/run_all_tests.sh.

It seems that I introduced some breakages, I will locate and fix them soon.

Besides, I have find some benchmarks such as FMHA are broken on my machine even in main branch.

@ur4t ur4t force-pushed the cleanup branch 2 times, most recently from 606bf6c to fc3f472 Compare March 23, 2026 12:06
@ur4t
Copy link
Copy Markdown
Contributor Author

ur4t commented Mar 23, 2026

@elibol Now this PR is ready to go. All tests, examples and benchmarks work as expected.

Besides, I have find some benchmarks such as FMHA are broken on my machine even in main branch.

load_from_view_latency emits optimization_hints with latency specified, which is not officially supported on all architectures yet. Using load_from_view instead or patched cuda-tile, both work well.

@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 23, 2026

@ur4t can you file a bug for the FMHA issue? I'll update our docs on how to file issues but I think for now including the following information would help:

  • Your OS.
  • Your GPU and GPU architecture.
  • The error you're seeing.

Edit: We should fix it in a separate PR. This PR will be easier to review if most changes are just formatting / semantics preserving changes.

@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 23, 2026

/ok to test fc3f472

@ur4t
Copy link
Copy Markdown
Contributor Author

ur4t commented Mar 24, 2026

@elibol I did not notice that CI can not run tests currently. It is fixed now.

can you file a bug for the FMHA issue?

Personally I do not think it is cutile-rs's bug, and I find that cuda-tile has just release 13.2.0 which should fix this issue perfectly in theory. However cuda-tile prefers latest llvm/mlir and melior/mlir-sys do not.

@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 24, 2026

I did not notice that CI can not run tests currently. It is fixed now.

Sounds good. Thanks again for working on this. I'll review your PR more closely tomorrow morning.

Personally I do not think it is cutile-rs's bug, and I find that cuda-tile has just release 13.2.0 which should fix this issue perfectly in theory. However cuda-tile prefers latest llvm/mlir and melior/mlir-sys do not.

This is something we're trying to figure out right now. We might need to fork melior to make the necessary changes. One of our primary goals is a smooth install experience, and that's just not possible with the split in LLVM dependencies between cuda-tile and melior. I'm hoping both are happy with LLVM-22 so we can stick to that for a bit while we sort this out. We have a longer term solution, but it's a bit more invasive. I don't have the time to do it right now.

@ur4t
Copy link
Copy Markdown
Contributor Author

ur4t commented Mar 24, 2026

I'm hoping both are happy with LLVM-22 so we can stick to that for a bit while we sort this out.

mlir-sys has moved to LLVM-22 and melior is still going.

cuda-tile 13.2.0 contains a type renaming which is not even included in LLVM-23 branch yet. That is easy to add porting patch, but we do not know whether cuda-tile team will accept such compatibility maintenance burden.

@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 24, 2026

@ur4t I'm making a lot of changes to cuda-async internally. Can we create smaller PRs which introduce the clippy fixes, one for each crate? That will make the process a bit more manageable on my end.

In fact, I would hold off on clippy for cuda-async until we get the changes for that crate in.

@ur4t
Copy link
Copy Markdown
Contributor Author

ur4t commented Mar 25, 2026

@elibol cuda-async done in #28, other crates on the way.

@ur4t
Copy link
Copy Markdown
Contributor Author

ur4t commented Mar 26, 2026

@elibol cutile-macro in #46, cutile-compiler in #47, cutile in #48, examples and benchmarks in #49, further readability change excluded. I will split other modifications into standalone PRs and leave this PR(#19) to enable stricter clippy in CI.

@ur4t ur4t changed the title Cleanup and improve readability Stricter clippy Mar 26, 2026
@elibol
Copy link
Copy Markdown
Collaborator

elibol commented Mar 31, 2026

Sounds good. I'm reviewing and merging them as I have time.

@ur4t ur4t force-pushed the cleanup branch 2 times, most recently from 3f5999f to dc64ceb Compare April 22, 2026 02:55
@jollylili jollylili added the P2 label Apr 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants