Fix CUDA error 700 when docking with flexible residues by caic99 · Pull Request #176 · dptech-corp/Uni-Dock

caic99 · 2026-04-10T06:25:39Z

Summary

Fixes #159 — CUDA error 700 (cudaErrorIllegalAddress) when using --flex with --gpu_batch.

The CUDA kernel supports at most 1 flex torsion (MAX_NUM_OF_FLEX_TORSION = 1), but exceeding this limit crashed with an opaque CUDA error instead of a clear message. The crash had three root causes:

1. Wrong Config grouping (`main.cpp`): Ligand classification ignored receptor flex atoms/torsions. A combined model with 165 atoms was placed into SmallConfig (max 40), causing out-of-bounds GPU memory access → CUDA error 700.
2. Dead assert (`monte_carlo.cu`): `assert(m.num_other_pairs() == 0)` was compiled out by `NDEBUG` in Release builds, providing zero protection.
3. Missing flex conf in results (`vina.h`): `cuda_to_vina` didn't populate `c.flex`, so CPU-side pose refinement segfaulted on `model.set(c)`.
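To illustrate the first root cause, here is a minimal sketch of size-based grouping that counts receptor flex atoms together with ligand atoms. It is not the code in `main.cpp`: `Tier`, `kTiers`, and `pick_tier` are hypothetical names, and only the SmallConfig limit of 40 atoms comes from this PR. The other tier names and limits are placeholders.

```cpp
#include <cstddef>
#include <string>

// Illustrative size tiers. Only the SmallConfig limit (40 atoms) is taken
// from the PR description; the remaining names and limits are assumptions.
struct Tier {
    const char* name;
    std::size_t max_atoms;
};

constexpr Tier kTiers[] = {
    {"SmallConfig", 40},
    {"MediumConfig", 80},   // assumed limit
    {"LargeConfig", 200},   // assumed limit
};

// Pick the first tier whose capacity covers ligand atoms PLUS receptor flex
// atoms. Returns an empty string when no tier is large enough.
inline std::string pick_tier(std::size_t ligand_atoms, std::size_t flex_atoms) {
    const std::size_t total = ligand_atoms + flex_atoms;
    for (const Tier& tier : kTiers) {
        if (total <= tier.max_atoms) {
            return tier.name;
        }
    }
    return "";
}
```

With a made-up split of 30 ligand atoms and 135 flex atoms (165 in total), `pick_tier(30, 135)` no longer selects SmallConfig, whereas classifying by ligand atoms alone (`pick_tier(30, 0)`) would.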

Changes

| File | Change |
| --- | --- |
| `main.cpp` | Fix grouping to include receptor flex atoms/torsions; add early exit with clear error when flex torsions exceed GPU kernel limit |
| `monte_carlo.cu` | Replace 4 dead asserts with comments noting the limitation |
| `vina.h` | Initialize flex conf from model before pose refinement |
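For context on why the asserts gave no protection: `assert` expands to nothing when `NDEBUG` is defined, which is typical for Release builds. This PR replaces the asserts with comments and relies on the early exit in `main.cpp`. As an alternative, a check that survives Release builds could look like the sketch below. `require` and `check_no_other_pairs` are hypothetical helpers, not functions in Uni-Dock.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: unlike assert(), this check is evaluated regardless
// of whether NDEBUG is defined.
inline void require(bool condition, const std::string& message) {
    if (!condition) {
        throw std::runtime_error(message);
    }
}

// Stand-in for the condition guarded by the dead assert. The argument plays
// the role of m.num_other_pairs() from the PR description.
inline void check_no_other_pairs(std::size_t num_other_pairs) {
    require(num_other_pairs == 0,
            "GPU path does not support other pairs (flex residues)");
}
```

Host-side code can catch the exception and report it before any kernel launch. Device code would need a different mechanism, since exceptions are not available inside CUDA kernels.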

Behavior

| Scenario | Before | After |
| --- | --- | --- |
| 21 flex residues (51 torsions) | `CUDA error code=700 cudaErrorIllegalAddress` | Clean error: "51 torsions, supports at most 1" |
| 1 flex residue (1 torsion) | `CUDA error code=700` | Works correctly, docking completes |
| No flex | Works | Works (unchanged) |
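The behavior in the table can be sketched as a host-side guard that runs before any GPU work. This is an illustration, not the code added to `main.cpp`: `flex_torsion_error` is a hypothetical name and the message wording is assumed. The limit value of 1 is the one quoted in this PR.

```cpp
#include <cstddef>
#include <string>

// Limit quoted in the PR description.
constexpr std::size_t MAX_NUM_OF_FLEX_TORSION = 1;

// Returns an empty string when the torsion count is supported by the GPU
// kernel, otherwise a human-readable error message (wording is assumed).
inline std::string flex_torsion_error(std::size_t num_flex_torsions) {
    if (num_flex_torsions <= MAX_NUM_OF_FLEX_TORSION) {
        return "";
    }
    return "GPU kernel: " + std::to_string(num_flex_torsions) +
           " flex torsions requested, supports at most " +
           std::to_string(MAX_NUM_OF_FLEX_TORSION);
}
```

A caller would print the message and exit with a nonzero status when the returned string is non-empty, which matches the `EXIT=1, no crash` outcome listed in the test plan.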

Test plan

- Reproduced original CUDA error 700 with issue reporter's test files (2am9, 21 flex residues)
- Verified fix produces clean error message for unsupported configs (EXIT=1, no crash)
- Verified 1 flex residue (1 torsion) works end-to-end on GPU (SmallConfig, EXIT=0)
- Verified normal non-flex docking is unchanged

🤖 Generated with Claude Code

The CUDA kernel supports at most MAX_NUM_OF_FLEX_TORSION (1) flex torsion, but using --flex with --gpu_batch crashed with CUDA error 700 instead of reporting the limitation.

Root causes:

1. Ligand Config grouping (main.cpp) ignored receptor flex atoms and torsions, placing combined models (e.g. 165 atoms) into undersized Config groups (e.g. SmallConfig with max 40), causing out-of-bounds GPU memory access.
2. assert(m.num_other_pairs() == 0) was compiled out in Release builds (NDEBUG), providing no protection at all.
3. cuda_to_vina result conversion did not populate c.flex, causing segfault during CPU-side pose refinement when model.set(c) was called.

This commit:

- Fixes grouping to account for receptor flex atoms and torsions so that the correct Config is selected
- Replaces the ineffective assert with a comment noting the limitation
- Initializes flex conf from the model before pose refinement
- Adds an early check that exits cleanly with a clear error message when flex torsions exceed the kernel's supported limit

Single flex residue docking (1 torsion) now works correctly on GPU. Configurations exceeding the limit get a clear error instead of a CUDA crash.

Refs dptech-corp#159

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

caic99 mentioned this pull request Apr 10, 2026

CUDA error 700 when docking with flexible residues #159

Open

Merge branch 'main' into fix/flex-residue-guard-159

e0294c6

caic99 wants to merge 2 commits into dptech-corp:main from caic99:fix/flex-residue-guard-159
