Tr/cudaconvert #2491
Merged
imreddyTeja merged 2 commits into main on May 4, 2026
This better reflects the launch latency from the ClimaCore side.
Many of the functions that launch kernels convert the args twice: once to get the launch config, and once for the actual launch. This is an expensive operation, so it makes sense to do it just once.
Use config_via_occupancy in cudaext data_layouts_copyto [perf]
Edits the CUDA ext to cudaconvert the kernel args only once. Previously, this happened once to determine the launch config, and then once again during the actual launch.
Expands usage of `config_via_occupancy` for better launch configs in low-res cases. This provides a minor speedup in ClimaLand.
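
The convert-once idea can be sketched in a language-agnostic way. The Python below is only an illustration and is not the ClimaCore or CUDA.jl code: `convert_args`, `pick_config`, and both `launch_*` helpers are hypothetical stand-ins, and the 256-thread cap is an assumed value. In practice the occupancy query would supply the thread limit.

```python
from dataclasses import dataclass


@dataclass
class ConversionCounter:
    """Counts how often the (assumed expensive) conversion runs."""
    count: int = 0


def convert_args(args, counter):
    # Hypothetical stand-in for converting host arguments to their
    # device-side representation; only the call count matters here.
    counter.count += 1
    return tuple(("device", a) for a in args)


def pick_config(converted_args, n_items, max_threads=256):
    # Hypothetical stand-in for an occupancy-based config: cap the threads
    # at the number of work items so small (low-res) problems do not launch
    # mostly idle blocks. max_threads=256 is an assumed limit.
    threads = max(1, min(n_items, max_threads))
    blocks = -(-n_items // threads)  # ceiling division
    return threads, blocks


def launch_converting_twice(args, n_items, counter):
    # Old pattern: convert for the config query, then convert again to launch.
    config = pick_config(convert_args(args, counter), n_items)
    converted = convert_args(args, counter)
    return config, converted


def launch_converting_once(args, n_items, counter):
    # New pattern: convert once and reuse the result for both steps.
    converted = convert_args(args, counter)
    config = pick_config(converted, n_items)
    return config, converted


if __name__ == "__main__":
    old, new = ConversionCounter(), ConversionCounter()
    launch_converting_twice((1.0, 2.0), 1000, old)
    launch_converting_once((1.0, 2.0), 1000, new)
    print(old.count, new.count)
```

Both helpers return the same launch config and converted arguments; the only difference is how many times the conversion runs, which is the cost the PR description says it removes.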