Implemented MPSGraphs.graph_conv! #745
Metal Benchmarks
| Benchmark suite | Current: 79afbf7 | Previous: 1d2f000 | Ratio |
|---|---|---|---|
| latency/precompile | 26423261083 ns | 25549419083 ns | 1.03 |
| latency/ttfp | 2446354000 ns | 2346831687.5 ns | 1.04 |
| latency/import | 1499449000 ns | 1427666042 ns | 1.05 |
| integration/metaldevrt | 520209 ns | 877750 ns | 0.59 |
| integration/byval/slices=1 | 1193708 ns | 1568625 ns | 0.76 |
| integration/byval/slices=3 | 7839459 ns | 8402792 ns | 0.93 |
| integration/byval/reference | 1191042 ns | 1559958 ns | 0.76 |
| integration/byval/slices=2 | 2116542 ns | 2629875 ns | 0.80 |
| kernel/indexing | 261958 ns | 627417 ns | 0.42 |
| kernel/indexing_checked | 274416 ns | 608750 ns | 0.45 |
| kernel/launch | 12834 ns | 12667 ns | 1.01 |
| kernel/rand | 294459 ns | 576167 ns | 0.51 |
| array/construct | 6833 ns | 6500 ns | 1.05 |
| array/broadcast | 286500 ns | 606708 ns | 0.47 |
| array/random/randn/Float32 | 494125 ns | 1011104 ns | 0.49 |
| array/random/randn!/Float32 | 420292 ns | 753875 ns | 0.56 |
| array/random/rand!/Int64 | 317875 ns | 548708 ns | 0.58 |
| array/random/rand!/Float32 | 278416 ns | 586208.5 ns | 0.47 |
| array/random/rand/Int64 | 480541 ns | 789709 ns | 0.61 |
| array/random/rand/Float32 | 353833.5 ns | 645000 ns | 0.55 |
| array/accumulate/Int64/1d | 964083 ns | 1260667 ns | 0.76 |
| array/accumulate/Int64/dims=1 | 1044271 ns | 1859104.5 ns | 0.56 |
| array/accumulate/Int64/dims=2 | 1375875 ns | 2179083 ns | 0.63 |
| array/accumulate/Int64/dims=1L | 9593146 ns | 11673271 ns | 0.82 |
| array/accumulate/Int64/dims=2L | 7797584 ns | 9628146 ns | 0.81 |
| array/accumulate/Float32/1d | 801083 ns | 1121395.5 ns | 0.71 |
| array/accumulate/Float32/dims=1 | 905541.5 ns | 1571667 ns | 0.58 |
| array/accumulate/Float32/dims=2 | 1185459 ns | 1889459 ns | 0.63 |
| array/accumulate/Float32/dims=1L | 8556479 ns | 9834209 ns | 0.87 |
| array/accumulate/Float32/dims=2L | 4416624.5 ns | 7249666.5 ns | 0.61 |
| array/reductions/reduce/Int64/1d | 751583 ns | 1386875 ns | 0.54 |
| array/reductions/reduce/Int64/dims=1 | 690667 ns | 1117250 ns | 0.62 |
| array/reductions/reduce/Int64/dims=2 | 726250 ns | 1152958 ns | 0.63 |
| array/reductions/reduce/Int64/dims=1L | 1162854 ns | 2013209 ns | 0.58 |
| array/reductions/reduce/Int64/dims=2L | 2227750 ns | 4244083 ns | 0.52 |
| array/reductions/reduce/Float32/1d | 564250 ns | 988750 ns | 0.57 |
| array/reductions/reduce/Float32/dims=1 | 403750 ns | 843520.5 ns | 0.48 |
| array/reductions/reduce/Float32/dims=2 | 434041 ns | 857917 ns | 0.51 |
| array/reductions/reduce/Float32/dims=1L | 682417 ns | 1326625 ns | 0.51 |
| array/reductions/reduce/Float32/dims=2L | 1245770.5 ns | 1810667 ns | 0.69 |
| array/reductions/mapreduce/Int64/1d | 754167 ns | 1356437.5 ns | 0.56 |
| array/reductions/mapreduce/Int64/dims=1 | 690459 ns | 1102166.5 ns | 0.63 |
| array/reductions/mapreduce/Int64/dims=2 | 726583 ns | 1149750 ns | 0.63 |
| array/reductions/mapreduce/Int64/dims=1L | 1145334 ns | 1988375 ns | 0.58 |
| array/reductions/mapreduce/Int64/dims=2L | 1856375 ns | 3626916 ns | 0.51 |
| array/reductions/mapreduce/Float32/1d | 580583 ns | 1055917 ns | 0.55 |
| array/reductions/mapreduce/Float32/dims=1 | 404292 ns | 847396 ns | 0.48 |
| array/reductions/mapreduce/Float32/dims=2 | 430667 ns | 860979.5 ns | 0.50 |
| array/reductions/mapreduce/Float32/dims=1L | 681208 ns | 1333042 ns | 0.51 |
| array/reductions/mapreduce/Float32/dims=2L | 1250500 ns | 1898125 ns | 0.66 |
| array/private/copyto!/gpu_to_gpu | 233041 ns | 633020.5 ns | 0.37 |
| array/private/copyto!/cpu_to_gpu | 257458 ns | 804354.5 ns | 0.32 |
| array/private/copyto!/gpu_to_cpu | 257000 ns | 816000 ns | 0.31 |
| array/private/iteration/findall/int | 1208812.5 ns | 1581312.5 ns | 0.76 |
| array/private/iteration/findall/bool | 1074583 ns | 1404916.5 ns | 0.76 |
| array/private/iteration/findfirst/int | 1196167 ns | 2075167 ns | 0.58 |
| array/private/iteration/findfirst/bool | 1188417 ns | 2048750 ns | 0.58 |
| array/private/iteration/scalar | 1725729.5 ns | 4526479 ns | 0.38 |
| array/private/iteration/logical | 1650375 ns | 2693625 ns | 0.61 |
| array/private/iteration/findmin/1d | 1422145.5 ns | 2518041 ns | 0.56 |
| array/private/iteration/findmin/2d | 1225250 ns | 1820229.5 ns | 0.67 |
| array/private/copy | 333042 ns | 568854 ns | 0.59 |
| array/shared/copyto!/gpu_to_gpu | 84042 ns | 84291 ns | 1.00 |
| array/shared/copyto!/cpu_to_gpu | 79875 ns | 82875 ns | 0.96 |
| array/shared/copyto!/gpu_to_cpu | 78479.5 ns | 83000 ns | 0.95 |
| array/shared/iteration/findall/int | 1215500 ns | 1585854.5 ns | 0.77 |
| array/shared/iteration/findall/bool | 1081833 ns | 1421875 ns | 0.76 |
| array/shared/iteration/findfirst/int | 1006125 ns | 1654709 ns | 0.61 |
| array/shared/iteration/findfirst/bool | 998000 ns | 1643542 ns | 0.61 |
| array/shared/iteration/scalar | 192000 ns | 210375 ns | 0.91 |
| array/shared/iteration/logical | 1447708 ns | 2297959 ns | 0.63 |
| array/shared/iteration/findmin/1d | 1229333 ns | 2134229 ns | 0.58 |
| array/shared/iteration/findmin/2d | 1201334 ns | 1806042 ns | 0.67 |
| array/shared/copy | 241792 ns | 241812 ns | 1.00 |
| array/permutedims/4d | 1726708 ns | 2395583 ns | 0.72 |
| array/permutedims/2d | 561417 ns | 1158833 ns | 0.48 |
| array/permutedims/3d | 1141042 ns | 1686541 ns | 0.68 |
| metal/synchronization/stream | 17334 ns | 19583 ns | 0.89 |
| metal/synchronization/context | 17958 ns | 20291 ns | 0.89 |
This comment was automatically generated by workflow using github-action-benchmark.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #745      +/-   ##
==========================================
+ Coverage   82.01%   82.48%   +0.47%
==========================================
  Files          62       63       +1
  Lines        2874     2929      +55
==========================================
+ Hits         2357     2416      +59
+ Misses        517      513       -4
☔ View full report in Codecov by Sentry.
Thanks, I'm very happy to see this being tackled properly! Just as a curiosity: I was missing this functionality, but unfortunately it is far outside my expertise. So, in my desperation, I asked Claude for help, and surprisingly it got it to work! I'm sure it's much less elegant than this solution, but in case someone is interested: I tried to do it carefully, so it should not be just pure AI slop. But given my lack of understanding, I cannot guarantee anything.
This should allow closing #210.