Atomics on 16 bits: prevent reading 4 bytes for 2-byte locations. #3005
+37 −7
ROCm Repo Management API / Tests / Tests / Test Distributed / Run pytorch_distributed_2
failed
Feb 27, 2026 in 0s
failed: 2, skipped: 161, passed: 290
Details
TestDistBackendWithSpawn.test_ddp_apply_optim_in_backward_ignored_params
RuntimeError: Process 0 terminated or timed out after 305.0975272655487 seconds
Stack trace
Traceback (most recent call last):
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 775, in wrapper
self._join_processes(fn)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1045, in _join_processes
self._check_return_codes(fn, elapsed_time)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1090, in _check_return_codes
raise RuntimeError(
RuntimeError: Process 0 terminated or timed out after 305.0975272655487 seconds
Standard error
/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/backends/cudnn/__init__.py:175: UserWarning: cuDNN Benchmark limit is not supported in MIOpen and will have no effect. (Triggered internally at /var/lib/jenkins/pytorch/torch/csrc/cuda/Module.cpp:1973.)
torch._C._cuda_set_cudnn_benchmark_limit(_benchmark_limit)
Standard out
Timing out after 300 seconds and killing subprocesses.
TestDistBackendWithSpawn.test_ddp_apply_optim_in_backward_ignored_params
AssertionError: Scalars are not equal!
Expected 0 but got -6.
Absolute difference: 6
Relative difference: inf
Expected exit code 0 but got -6 for pid: 1112544
Stack trace
Traceback (most recent call last):
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 775, in wrapper
self._join_processes(fn)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1045, in _join_processes
self._check_return_codes(fn, elapsed_time)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1121, in _check_return_codes
self.assertEqual(
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 4359, in assertEqual
raise error_metas.pop()[0].to_error( # type: ignore[index]
AssertionError: Scalars are not equal!
Expected 0 but got -6.
Absolute difference: 6
Relative difference: inf
Expected exit code 0 but got -6 for pid: 1112544
Standard error
/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/backends/cudnn/__init__.py:175: UserWarning: cuDNN Benchmark limit is not supported in MIOpen and will have no effect. (Triggered internally at /var/lib/jenkins/pytorch/torch/csrc/cuda/Module.cpp:1973.)
torch._C._cuda_set_cudnn_benchmark_limit(_benchmark_limit)