[IFU] Add automated issue creation workflow #3067
+138
−52
Open
ROCm Repo Management API / Tests / Tests / Test Distributed / Run pytorch_distributed_2
failed
Mar 13, 2026 in 0s
failed: 2, skipped: 31, passed: 180
TestDistBackendWithSpawn.test_ddp_apply_optim_in_backward
RuntimeError: Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 942, in run_test
getattr(self, test_name)()
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 789, in wrapper
fn()
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3364, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 235, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 4880, in test_ddp_apply_optim_in_backward
self._test_ddp_apply_optim_in_backward(
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 4862, in _test_ddp_apply_optim_in_backward
self.assertEqual(
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 4359, in assertEqual
raise error_metas.pop()[0].to_error( # type: ignore[index]
AssertionError: Tensor-likes are not close!
Mismatched elements: 1 / 1024 (0.1%)
Greatest absolute difference: 2.193450927734375e-05 at index (858,) (up to 1e-05 allowed)
Greatest relative difference: 3.6764354263141286e-06 at index (858,) (up to 1.3e-06 allowed)
Params not equal at iteration 3
To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 python test/distributed/test_distributed_spawn.py TestDistBackendWithSpawn.test_ddp_apply_optim_in_backward
This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
Stack trace
Traceback (most recent call last):
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 787, in wrapper
self._join_processes(fn)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1057, in _join_processes
self._check_return_codes(fn, elapsed_time)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1097, in _check_return_codes
raise RuntimeError(error)
RuntimeError: Process 1 exited with error code 10 and exception:
Traceback (most recent call last):
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 942, in run_test
getattr(self, test_name)()
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 789, in wrapper
fn()
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3364, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 235, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 4880, in test_ddp_apply_optim_in_backward
self._test_ddp_apply_optim_in_backward(
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 4862, in _test_ddp_apply_optim_in_backward
self.assertEqual(
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 4359, in assertEqual
raise error_metas.pop()[0].to_error( # type: ignore[index]
AssertionError: Tensor-likes are not close!
Mismatched elements: 1 / 1024 (0.1%)
Greatest absolute difference: 2.193450927734375e-05 at index (858,) (up to 1e-05 allowed)
Greatest relative difference: 3.6764354263141286e-06 at index (858,) (up to 1.3e-06 allowed)
Params not equal at iteration 3
To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 python test/distributed/test_distributed_spawn.py TestDistBackendWithSpawn.test_ddp_apply_optim_in_backward
This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
Standard error
/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/backends/cudnn/__init__.py:175: UserWarning: cuDNN Benchmark limit is not supported in MIOpen and will have no effect. (Triggered internally at /var/lib/jenkins/pytorch/torch/csrc/cuda/Module.cpp:1985.)
torch._C._cuda_set_cudnn_benchmark_limit(_benchmark_limit)
Standard out
Process 1 terminated with exit code 10, terminating remaining processes.
TestDistBackendWithSpawn.test_ddp_apply_optim_in_backward
RuntimeError: Process 0 exited with error code 10 and exception:
Traceback (most recent call last):
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 942, in run_test
getattr(self, test_name)()
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 789, in wrapper
fn()
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3364, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 235, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 4880, in test_ddp_apply_optim_in_backward
self._test_ddp_apply_optim_in_backward(
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 4862, in _test_ddp_apply_optim_in_backward
self.assertEqual(
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 4359, in assertEqual
raise error_metas.pop()[0].to_error( # type: ignore[index]
AssertionError: Tensor-likes are not close!
Mismatched elements: 6 / 3072 (0.2%)
Greatest absolute difference: 2.384185791015625e-05 at index (0, 512) (up to 1e-05 allowed)
Greatest relative difference: 2.2767704649595544e-05 at index (1, 448) (up to 1.3e-06 allowed)
Params not equal at iteration 3
To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 python test/distributed/test_distributed_spawn.py TestDistBackendWithSpawn.test_ddp_apply_optim_in_backward
This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
Stack trace
Traceback (most recent call last):
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 787, in wrapper
self._join_processes(fn)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1057, in _join_processes
self._check_return_codes(fn, elapsed_time)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1097, in _check_return_codes
raise RuntimeError(error)
RuntimeError: Process 0 exited with error code 10 and exception:
Traceback (most recent call last):
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 942, in run_test
getattr(self, test_name)()
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 789, in wrapper
fn()
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3364, in wrapper
method(*args, **kwargs)
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 235, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 4880, in test_ddp_apply_optim_in_backward
self._test_ddp_apply_optim_in_backward(
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 4862, in _test_ddp_apply_optim_in_backward
self.assertEqual(
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 4359, in assertEqual
raise error_metas.pop()[0].to_error( # type: ignore[index]
AssertionError: Tensor-likes are not close!
Mismatched elements: 6 / 3072 (0.2%)
Greatest absolute difference: 2.384185791015625e-05 at index (0, 512) (up to 1e-05 allowed)
Greatest relative difference: 2.2767704649595544e-05 at index (1, 448) (up to 1.3e-06 allowed)
Params not equal at iteration 3
To execute this test, run the following from the base repo dir:
PYTORCH_TEST_WITH_ROCM=1 python test/distributed/test_distributed_spawn.py TestDistBackendWithSpawn.test_ddp_apply_optim_in_backward
This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
Standard error
/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/backends/cudnn/__init__.py:175: UserWarning: cuDNN Benchmark limit is not supported in MIOpen and will have no effect. (Triggered internally at /var/lib/jenkins/pytorch/torch/csrc/cuda/Module.cpp:1985.)
torch._C._cuda_set_cudnn_benchmark_limit(_benchmark_limit)
Standard out
Process 0 terminated with exit code 10, terminating remaining processes.
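
Note on the tolerance figures reported above: the sketch below is a plain-Python approximation of the element-wise closeness rule, |actual - expected| <= atol + rtol * |expected|, using the atol (1e-05) and rtol (1.3e-06) values printed in the log. It assumes the reported relative difference is the absolute difference divided by |expected|, which lets the magnitude of the mismatching element be estimated for the Process 1 failure, where both maxima occur at index (858,). The helper name `is_close` and the variable names are hypothetical; this is not the code in `common_utils.py`.

```python
# Sketch of the closeness rule, assuming the tolerances printed in the log.
ATOL = 1e-5    # "up to 1e-05 allowed"
RTOL = 1.3e-6  # "up to 1.3e-06 allowed"


def is_close(actual: float, expected: float,
             rtol: float = RTOL, atol: float = ATOL) -> bool:
    """Return True when the pair passes |a - e| <= atol + rtol * |e|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Values copied from the Process 1 failure, both at index (858,).
abs_diff = 2.193450927734375e-05
rel_diff = 3.6764354263141286e-06

# Assumption: rel_diff == abs_diff / |expected|, so |expected| can be estimated.
expected_mag = abs_diff / rel_diff

# Largest difference the rule would accept for an element of that magnitude.
allowed = ATOL + RTOL * expected_mag

print(f"estimated |expected|: {expected_mag:.4f}")
print(f"allowed difference:   {allowed:.3e}")
print(f"observed difference:  {abs_diff:.3e}")
print(f"passes: {is_close(expected_mag + abs_diff, expected_mag)}")
```

Under these assumptions the observed difference is larger than the combined tolerance for that element, which is consistent with the single mismatch reported out of 1024. The same estimate cannot be made for the Process 0 failure, because its greatest absolute and relative differences occur at different indices.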