Validate total_num_blocks divisibility by my_size in block_bucketize (#5646) #5649

Open
q10 wants to merge 1 commit into pytorch:main from q10:export-D101141810
q10 (Contributor) commented Apr 16, 2026

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/2596

Add input validation that `total_num_blocks` is divisible by `my_size` in `block_bucketize_sparse_features`, to prevent silent buffer overflows.
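
For illustration only, here is a minimal Python sketch of the kind of check described above. It is not the code in this diff: the helper names `check_total_num_blocks` and `bucket_of_block` are hypothetical, and the block-to-bucket mapping is an assumption (each rank owning `total_num_blocks // my_size` consecutive blocks). Under that assumption, a remainder in the division lets the last blocks map to a bucket index outside `[0, my_size)`, which is how an out-of-bounds write could occur.

```python
from typing import Optional, Sequence


def check_total_num_blocks(
    total_num_blocks: Optional[Sequence[int]], my_size: int
) -> None:
    """Hypothetical validator: reject block counts not divisible by my_size."""
    if total_num_blocks is None:
        # Common path: the argument is not passed, so nothing is checked.
        return
    if my_size <= 0:
        raise ValueError(f"my_size must be positive, got {my_size}")
    for feature, num_blocks in enumerate(total_num_blocks):
        if num_blocks % my_size != 0:
            raise ValueError(
                f"total_num_blocks[{feature}]={num_blocks} is not "
                f"divisible by my_size={my_size}"
            )


def bucket_of_block(block_id: int, num_blocks: int, my_size: int) -> int:
    """Assumed mapping: each rank owns num_blocks // my_size consecutive blocks."""
    blocks_per_rank = num_blocks // my_size  # integer division truncates
    return block_id // blocks_per_rank


# Divisible case: 8 blocks over 4 ranks, last block lands in bucket 3 (valid).
assert bucket_of_block(7, 8, 4) == 3

# Non-divisible case: 10 blocks over 4 ranks, last block lands in bucket 4,
# which is outside the valid range [0, 4).
assert bucket_of_block(9, 10, 4) == 4
```

Calling `check_total_num_blocks([10], 4)` raises `ValueError` before any bucket index is computed, while `check_total_num_blocks(None, 4)` returns immediately, matching the note in the test plan that callers who do not pass `total_num_blocks` are unaffected.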

Test Plan:

Full Suites

Requires: Any CUDA GPU (has CPU fallback)

buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize
buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize_2d_weights

Individual Tests (new in this diff)

buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize -- -k test_block_bucketize_sparse_features_total_num_blocks_not_divisible
buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize_2d_weights -- -k test_block_bucketize_sparse_features_2d_weights_total_num_blocks_not_divisible

Related Tests (validates existing behavior not broken)

buck2 test fbcode//deeplearning/fbgemm/fbgemm_gpu/test/sparse:block_bucketize -- -k total_num_blocks

Benchmark (does NOT pass total_num_blocks, zero overhead on common path)

buck2 run @//mode/opt fbcode//deeplearning/fbgemm/fbgemm_gpu/bench:sparse_ops -- block-bucketize-sparse-features-bench

Reviewed By: henrylhtsang

Differential Revision: D101141810

Pulled By: q10
