Validate total_num_blocks divisibility by my_size in block_bucketize (#5646) by q10 · Pull Request #5649 · pytorch/FBGEMM

q10 · 2026-04-16T21:19:11Z

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/2596

Add input validation for total_num_blocks divisibility by my_size in block_bucketize_sparse_features to prevent silent buffer overflows.
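As a rough illustration of the kind of check described above, here is a minimal Python sketch. It is not the code from this PR: the helper name `check_total_num_blocks_divisible`, the use of plain integer sequences instead of tensors, and the `ValueError` are all assumptions made for illustration. The idea is that if each rank is given `total_num_blocks[t] / my_size` blocks using integer division, a non-divisible value would be silently truncated, so rejecting it up front is safer.

```python
from typing import Optional, Sequence


def check_total_num_blocks_divisible(
    total_num_blocks: Optional[Sequence[int]], my_size: int
) -> None:
    """Hypothetical helper (not the FBGEMM API): require every per-feature
    block count to be an exact multiple of my_size."""
    if my_size <= 0:
        raise ValueError(f"my_size must be positive, got {my_size}")
    if total_num_blocks is None:
        # Callers that do not pass total_num_blocks skip the check entirely.
        return
    for feature, num_blocks in enumerate(total_num_blocks):
        if num_blocks % my_size != 0:
            raise ValueError(
                f"total_num_blocks[{feature}] = {num_blocks} is not "
                f"divisible by my_size = {my_size}"
            )


# Usage: a divisible input passes, a non-divisible one is rejected.
check_total_num_blocks_divisible([4, 8], my_size=4)
try:
    check_total_num_blocks_divisible([4, 6], my_size=4)
except ValueError as err:
    print(err)
```

Returning early when `total_num_blocks` is `None` mirrors the note in the test plan that callers who do not pass `total_num_blocks` should see no extra overhead.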

Test Plan:

Full Suites

Requires: Any CUDA GPU (has CPU fallback)

buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize
buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize_2d_weights

Individual Tests (new in this diff)

buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize -- -k test_block_bucketize_sparse_features_total_num_blocks_not_divisible
buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize_2d_weights -- -k test_block_bucketize_sparse_features_2d_weights_total_num_blocks_not_divisible

Related Tests (validates existing behavior not broken)

buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize -- -k total_num_blocks

Benchmark (does NOT pass total_num_blocks, zero overhead on common path)

buck2 run @//mode/opt fbcode//deeplearning/fbgemm/fbgemm_gpu/bench:sparse_ops -- block-bucketize-sparse-features-bench

Reviewed By: henrylhtsang

Differential Revision: D101141810

Pulled By: q10


meta-cla Bot added the cla signed label Apr 16, 2026

meta-codesync Bot added fb-exported meta-exported labels Apr 16, 2026

q10 wants to merge 1 commit into pytorch:main from q10:export-D101141810

2 participants