Add pointer to integer conversion for alignment checking. #768

Merged
maleadt merged 2 commits into main from tb/ptr_to_int
Apr 20, 2026

@maleadt (Member) commented Apr 17, 2026

No description provided.

maleadt and others added 2 commits April 17, 2026 15:10
The GPU virtual address is not guaranteed to be page-aligned, even though
the CPU-side buffer contents are. CI showed `gpuAddress` landing on
256 B-aligned but non-4096 B-aligned offsets, which is still plenty for
SIMD use cases. Tighten the test to the alignment that actually matters.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
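
To make the alignment reasoning in the commit message concrete, here is a minimal Python sketch of the kind of check that a pointer-to-integer conversion makes possible. The address is hypothetical (it is not a value from CI), and `is_aligned` is an illustrative helper, not a function from this repository.

```python
def is_aligned(address: int, alignment: int) -> bool:
    """Return True if `address` is a multiple of `alignment` (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1) != 0:
        raise ValueError("alignment must be a positive power of two")
    # For a power-of-two alignment, an aligned address has all low bits zero.
    return address & (alignment - 1) == 0


# Hypothetical GPU virtual address: a multiple of 256 but not of 4096.
gpu_address = 0x1_0000_0100

for alignment in (16, 256, 4096):
    print(alignment, is_aligned(gpu_address, alignment))
# prints:
# 16 True
# 256 True
# 4096 False
```

Because 256 is a multiple of 16, any 256 B-aligned address also passes a 16-byte check, so a test on the smaller alignment still holds when page alignment does not.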

@github-actions Bot (Contributor) left a comment

Metal Benchmarks

| Benchmark suite | Current: 3a6dc63 | Previous: 65fac52 | Ratio |
| --- | --- | --- | --- |
| array/accumulate/Float32/1d | 1125500 ns | 1128208 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 1568166.5 ns | 1568833 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 9867375 ns | 9858729 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 1885792 ns | 1892646.5 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 7240167 ns | 7250166.5 ns | 1.00 |
| array/accumulate/Int64/1d | 1233458 ns | 1259896 ns | 0.98 |
| array/accumulate/Int64/dims=1 | 1856854 ns | 1847062 ns | 1.01 |
| array/accumulate/Int64/dims=1L | 11753833 ns | 11726000 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 2183417 ns | 2318000 ns | 0.94 |
| array/accumulate/Int64/dims=2L | 9830584 ns | 9830125 ns | 1.00 |
| array/broadcast | 606792 ns | 598708 ns | 1.01 |
| array/construct | 5875 ns | 6542 ns | 0.90 |
| array/permutedims/2d | 1167709 ns | 1176541 ns | 0.99 |
| array/permutedims/3d | 1679417 ns | 1688520.5 ns | 0.99 |
| array/permutedims/4d | 2413812.5 ns | 2396000 ns | 1.01 |
| array/private/copy | 562250 ns | 579729 ns | 0.97 |
| array/private/copyto!/cpu_to_gpu | 793000 ns | 796042 ns | 1.00 |
| array/private/copyto!/gpu_to_cpu | 791604.5 ns | 795417 ns | 1.00 |
| array/private/copyto!/gpu_to_gpu | 641021 ns | 636875 ns | 1.01 |
| array/private/iteration/findall/bool | 1414458 ns | 1414604 ns | 1.00 |
| array/private/iteration/findall/int | 1570875 ns | 1591166 ns | 0.99 |
| array/private/iteration/findfirst/bool | 2045375 ns | 2054000 ns | 1.00 |
| array/private/iteration/findfirst/int | 2091854.5 ns | 2060375 ns | 1.02 |
| array/private/iteration/findmin/1d | 2501458 ns | 2517666 ns | 0.99 |
| array/private/iteration/findmin/2d | 1785667 ns | 1813000 ns | 0.98 |
| array/private/iteration/logical | 2631187.5 ns | 2548812.5 ns | 1.03 |
| array/private/iteration/scalar | 5608666 ns | 4792583 ns | 1.17 |
| array/random/rand/Float32 | 1175291.5 ns | 1121562.5 ns | 1.05 |
| array/random/rand/Int64 | 1322521 ns | 1323083 ns | 1.00 |
| array/random/rand!/Float32 | 921042 ns | 920541.5 ns | 1.00 |
| array/random/rand!/Int64 | 869771 ns | 863333 ns | 1.01 |
| array/random/randn/Float32 | 1059500 ns | 1076000 ns | 0.98 |
| array/random/randn!/Float32 | 812292 ns | 823750 ns | 0.99 |
| array/reductions/mapreduce/Float32/1d | 1038834 ns | 1056312.5 ns | 0.98 |
| array/reductions/mapreduce/Float32/dims=1 | 837042 ns | 830958 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=1L | 1330625 ns | 1339562.5 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=2 | 861500 ns | 851417 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=2L | 1814875 ns | 1815500 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 1518584 ns | 1363250 ns | 1.11 |
| array/reductions/mapreduce/Int64/dims=1 | 1108395.5 ns | 1094750 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1L | 2019584 ns | 2059500 ns | 0.98 |
| array/reductions/mapreduce/Int64/dims=2 | 1156625 ns | 1137917 ns | 1.02 |
| array/reductions/mapreduce/Int64/dims=2L | 3626084 ns | 3627875 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 1036375 ns | 1046000 ns | 0.99 |
| array/reductions/reduce/Float32/dims=1 | 834979.5 ns | 831959 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1L | 1336875 ns | 1343541.5 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 851917 ns | 848083 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 1813229 ns | 1824875 ns | 0.99 |
| array/reductions/reduce/Int64/1d | 1506229.5 ns | 1358416.5 ns | 1.11 |
| array/reductions/reduce/Int64/dims=1 | 1102000 ns | 1091083.5 ns | 1.01 |
| array/reductions/reduce/Int64/dims=1L | 2021416 ns | 2034021 ns | 0.99 |
| array/reductions/reduce/Int64/dims=2 | 1156250 ns | 1136375 ns | 1.02 |
| array/reductions/reduce/Int64/dims=2L | 4242791.5 ns | 4226792 ns | 1.00 |
| array/shared/copy | 238208.5 ns | 244395.5 ns | 0.97 |
| array/shared/copyto!/cpu_to_gpu | 82333 ns | 84292 ns | 0.98 |
| array/shared/copyto!/gpu_to_cpu | 83208 ns | 83417 ns | 1.00 |
| array/shared/copyto!/gpu_to_gpu | 83833 ns | 84812.5 ns | 0.99 |
| array/shared/iteration/findall/bool | 1428208 ns | 1431709 ns | 1.00 |
| array/shared/iteration/findall/int | 1566125 ns | 1582208 ns | 0.99 |
| array/shared/iteration/findfirst/bool | 1632500 ns | 1644667 ns | 0.99 |
| array/shared/iteration/findfirst/int | 1654208 ns | 1654500 ns | 1.00 |
| array/shared/iteration/findmin/1d | 2119416.5 ns | 2133875 ns | 0.99 |
| array/shared/iteration/findmin/2d | 1779709 ns | 1812875 ns | 0.98 |
| array/shared/iteration/logical | 2387770.5 ns | 2292812.5 ns | 1.04 |
| array/shared/iteration/scalar | 207458 ns | 212958 ns | 0.97 |
| integration/byval/reference | 1581583 ns | 1580375 ns | 1.00 |
| integration/byval/slices=1 | 1595958 ns | 1592916.5 ns | 1.00 |
| integration/byval/slices=2 | 2610271 ns | 2616688 ns | 1.00 |
| integration/byval/slices=3 | 7741208 ns | 7765750.5 ns | 1.00 |
| integration/metaldevrt | 868334 ns | 878583 ns | 0.99 |
| kernel/indexing | 637354.5 ns | 628917 ns | 1.01 |
| kernel/indexing_checked | 642500 ns | 638312.5 ns | 1.01 |
| kernel/launch | 11458 ns | 12375 ns | 0.93 |
| kernel/rand | 572459 ns | 569750 ns | 1.00 |
| latency/import | 1428755583 ns | 1418064812.5 ns | 1.01 |
| latency/precompile | 25474981958 ns | 25582527375 ns | 1.00 |
| latency/ttfp | 2344123250 ns | 2338611062.5 ns | 1.00 |
| metal/synchronization/context | 20125 ns | 20042 ns | 1.00 |
| metal/synchronization/stream | 19354.5 ns | 19041 ns | 1.02 |

This comment was automatically generated by workflow using github-action-benchmark.
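
As a reading aid for the Ratio column, the short Python sketch below assumes the ratio is the current time divided by the previous time, rounded to two decimals. That assumption is consistent with the rows shown but is not stated in the comment itself. The `ratio` helper is illustrative, and the three rows are copied from the benchmark results above.

```python
def ratio(current_ns: float, previous_ns: float) -> float:
    """Current time divided by previous time, rounded to two decimals."""
    return round(current_ns / previous_ns, 2)


# (benchmark name, current ns, previous ns), copied from the results above.
rows = [
    ("array/private/iteration/scalar", 5608666, 4792583),
    ("array/construct", 5875, 6542),
    ("kernel/launch", 11458, 12375),
]

for name, current, previous in rows:
    r = ratio(current, previous)
    # Under the stated assumption, a ratio above 1 means the current commit took longer.
    trend = "slower" if r > 1 else "faster" if r < 1 else "unchanged"
    print(f"{name}: {r:.2f} ({trend})")
# prints:
# array/private/iteration/scalar: 1.17 (slower)
# array/construct: 0.90 (faster)
# kernel/launch: 0.93 (faster)
```

A single run cannot separate a real change from measurement noise, so ratios close to 1.00 are best read as "no visible difference".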

@codecov Bot commented Apr 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.68%. Comparing base (65fac52) to head (3a6dc63).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #768      +/-   ##
==========================================
+ Coverage   80.45%   80.68%   +0.23%     
==========================================
  Files          61       61              
  Lines        2855     2848       -7     
==========================================
+ Hits         2297     2298       +1     
+ Misses        558      550       -8     


@christiangnrd (Member) left a comment

I’m away from my computer so I can’t test locally, but the tests seem good and 16-byte alignment seems like a reasonable assumption, so LGTM!

@maleadt maleadt merged commit acbbd34 into main Apr 20, 2026
16 checks passed
@maleadt maleadt deleted the tb/ptr_to_int branch April 20, 2026 13:09