
fix(cc): handle nloc==0 in DeepSpinPTExpt with phantom-atom padding #5485

Open
wanghan-iapcm wants to merge 2 commits into deepmodeling:master from wanghan-iapcm:fix-spin-pt2-empty-rank-fpe

Conversation

@wanghan-iapcm
Collaborator

@wanghan-iapcm wanghan-iapcm commented Jun 2, 2026

Problem

Multi-rank spin MD can leave a rank with zero real local atoms (nloc_real == 0) when atoms migrate to other subdomains. The with-comm AOTI artifact hits an intermittent SIGFPE (integer divide by zero) at runtime in inductor-generated shape arithmetic that uses nloc as a divisor.

Reproduced on master CI run 26667802665:

Caught signal 8 (Floating point exception: integer divide by zero)
4  forward_lower_with_comm/.../wrapper.so(AOTInductorModel::run_impl+0xf482)

Root cause:

  • The graph was traced with nloc_min=1 (serialization.py:362) and inductor lowered an even stricter nloc >= 2 runtime-check (visible in the generated wrapper.cpp's check_input_3).
  • That runtime-check is gated by env var AOTI_RUNTIME_CHECK_INPUTS (default OFF), so with nloc = 0 the check is silently bypassed and the compiled graph runs through its own divide-by-zero on shape arithmetic.
  • Whether the offending divide is actually emitted depends on inductor's code-gen choices, which vary across compiles — hence the intermittent nature.
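
The interaction between a disabled lower-bound check and a divide that uses nloc can be sketched as follows. This is a toy model, not the generated wrapper.cpp: `check_nloc`, `rows_per_atom`, and the `enabled` flag are hypothetical names standing in for the gated runtime check and the inductor-generated shape arithmetic.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical lower-bound guard that is enforced only when `enabled` is true,
// standing in for an input check that is switched off by default.
inline void check_nloc(std::int64_t nloc, std::int64_t min_nloc, bool enabled) {
  if (enabled && nloc < min_nloc) {
    throw std::invalid_argument("nloc is below the traced lower bound");
  }
}

// Hypothetical shape arithmetic that uses nloc as a divisor. Integer division
// by zero is undefined behaviour in C++; on x86-64 Linux it usually raises
// SIGFPE, the signal seen in the CI log.
inline std::int64_t rows_per_atom(std::int64_t numel, std::int64_t nloc) {
  assert(nloc != 0);  // debug-only precondition: callers must never pass 0
  return numel / nloc;
}
```

With `enabled == false`, `check_nloc(0, 2, false)` returns silently, so nothing stops a later `rows_per_atom(numel, 0)` from reaching the divide.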

Fix

Prepend two phantom atoms with empty neighbour lists when nloc_real == 0 so the AOTI graph runs with nloc == 2 and never reaches the integer-divide-by-zero path. Phantoms have no neighbours so they contribute zero atomic energy / force / virial, preserving the physically-correct "this rank has no real atoms" result.

Key details (all in source/api_cc/src/DeepSpinPTExpt.cc):

  • dcoord / datype / dspin get two zero-valued rows prepended.
  • firstneigh_tensor gets two -1 rows prepended (no neighbours).
  • mapping_tensor gets two identity entries prepended.
  • comm_dict.nlocal is set to 2 (not the LAMMPS-reported 0) so border_op writes received ghost features past the phantom slots.
  • Output arrays (dforce, dforce_mag, datom_energy, datom_virial) get the phantom prefix stripped before being scattered back to LAMMPS via select_map.
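
The input-side padding in the list above can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code in DeepSpinPTExpt.cc: `PaddedInputs`, `choose_phantom_n`, and `pad_with_phantoms` are hypothetical names, plain `std::vector` stands in for the tensors, and only the identity-mapping case is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// All names here are illustrative; they are not the identifiers used in
// DeepSpinPTExpt.cc.
struct PaddedInputs {
  int phantom_n = 0;
  int nloc = 0;
  std::vector<double> coord;                    // 3 values per atom
  std::vector<int> atype;                       // 1 value per atom
  std::vector<double> spin;                     // 3 values per atom
  std::vector<std::vector<int>> phantom_neigh;  // one all -1 row per phantom
  std::vector<int> mapping;                     // identity over padded atoms
};

// Two phantoms only when the rank has ghosts but no real local atoms.
inline int choose_phantom_n(int nloc_real, int nall_real) {
  return (nloc_real == 0 && nall_real > 0) ? 2 : 0;
}

inline PaddedInputs pad_with_phantoms(std::vector<double> coord,
                                      std::vector<int> atype,
                                      std::vector<double> spin,
                                      int nloc_real, int nnei) {
  const std::size_t nall_real = atype.size();
  assert(coord.size() == 3 * nall_real && spin.size() == 3 * nall_real);
  assert(nnei >= 0);

  PaddedInputs out;
  out.phantom_n = choose_phantom_n(nloc_real, static_cast<int>(nall_real));
  out.nloc = nloc_real + out.phantom_n;
  const std::size_t pn = static_cast<std::size_t>(out.phantom_n);

  // Prepend zero-valued rows for the phantoms (a no-op when pn == 0).
  coord.insert(coord.begin(), pn * 3, 0.0);
  atype.insert(atype.begin(), pn, 0);
  spin.insert(spin.begin(), pn * 3, 0.0);

  // Phantom neighbour rows contain only -1, i.e. no neighbours.
  out.phantom_neigh.assign(
      pn, std::vector<int>(static_cast<std::size_t>(nnei), -1));

  // Identity mapping over the padded atom count.
  out.mapping.resize(atype.size());
  std::iota(out.mapping.begin(), out.mapping.end(), 0);

  out.coord = std::move(coord);
  out.atype = std::move(atype);
  out.spin = std::move(spin);
  return out;
}
```

For a rank with three ghost atoms and `nloc_real == 0`, the sketch yields five atoms with `nloc == 2`; for any rank with `nloc_real > 0` it leaves the inputs unchanged.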

Why phantoms rather than Dim(min=0) re-export

Bumping the trace constraint to min=0 would require:

  1. auditing every nloc-dependent divide in deepmd/dpmodel/{descriptor,fitting,model}/ and protecting with xp.maximum(nloc, 1);
  2. torch.export re-emitting compatible guards (currently fails because spin-side shape relationships require nloc >= 1 to be inferable);
  3. inductor cooperating with the relaxed bound (it makes independent specialization choices downstream);
  4. re-exporting every .pt2 archive in source/tests/infer/.

The phantom approach is a strict superset of correctness and self-contained in one C++ file. The two approaches aren't mutually exclusive — the min=0 route can land as a follow-up once the dpmodel audit is done.
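
For comparison, the kind of guard that item 1 of the min=0 route would require at every nloc-dependent divide looks roughly like this. It is a C++ analogue of `xp.maximum(nloc, 1)` written only for illustration; `mean_over_local` is a hypothetical helper, not a function in deepmd/dpmodel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp the divisor to at least 1 so that an empty rank (nloc == 0) divides by
// one instead of zero. Illustrative only.
inline std::int64_t mean_over_local(std::int64_t total, std::int64_t nloc) {
  assert(nloc >= 0);
  return total / std::max<std::int64_t>(nloc, 1);
}
```

Every divide would need this treatment individually, which is why the audit is listed as a separate piece of work.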

Test plan

  • Local CPU rebuild + runUnitTests_cc --gtest_filter='*Spin*': 42 / 42 spin C++ regression tests pass (12 TF-backend tests skipped, as expected in the PT-only venv).
  • CI: the multi-rank LAMMPS test test_pair_deepmd_mpi_dpa3_spin_empty_subdomain should now pass deterministically. Local Python LAMMPS-MPI verification is blocked by a pre-existing OpenMPI/MPICH ABI mismatch in my local venv (the plugin's ompi_mpi_* symbols can't resolve against MPICH's libmpi.so.12), so end-to-end verification falls to CI.

Known limitations

  • The phantom path is structurally inert for nloc_real > 0 (the if (phantom_n > 0) branch never fires), so the common path is unchanged.
  • If a future inductor version bumps the nloc lower-bound to >2, phantom_n will need to track that minimum.
  • This fix is in DeepSpinPTExpt only. The corresponding non-spin path in DeepPotPTExpt has the same code shape; non-spin DPA3 empty-subdomain currently passes in CI but could regress similarly with a future inductor change. Deferred to a follow-up if observed.
  • Supersedes test(lammps): skip spin DPA3 empty-subdomain pt2 case #5478 (which proposed skipping the test); this PR fixes the underlying bug instead.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced stability and robustness for distributed molecular dynamics simulations with uneven atom allocation across processors. Resolved critical crashes occurring when certain processors contain no locally-assigned atoms but must still participate in force field calculations, neighbor list construction, and energy computations. Fixed output result inconsistencies across processor ranks.

Multi-rank spin MD can leave a rank with zero real local atoms when
all atoms migrate to other subdomains.  The with-comm AOTI artifact
hits an intermittent SIGFPE (integer divide by zero) at runtime in
inductor-generated shape arithmetic that uses nloc as a divisor.

The graph was traced with nloc_min=1 and inductor lowered an even
stricter nloc>=2 runtime-check which is silently bypassed because
AOTI_RUNTIME_CHECK_INPUTS is unset by default.  Whether the offending
divide is actually emitted depends on inductor's code-gen choices,
which vary across compiles -- hence the random nature of the failure
(reproduced on CI run 26667802665).

Fix: prepend two phantom atoms with empty neighbour lists ahead of
the real atoms when nloc_real==0.  The AOTI graph then runs with
nloc==2, satisfying the inductor specialisation.  Phantoms have no
neighbours so they contribute zero atomic energy / force / virial,
preserving the physically-correct 'this rank has no real atoms'
result.  comm_dict's nlocal is set to 2 so border_op writes received
ghost features past the phantom slots; outputs are stripped of the
phantom prefix before being scattered back to LAMMPS via select_map.
@dosubot dosubot Bot added the bug label Jun 2, 2026
@github-actions github-actions Bot added C++ and removed bug labels Jun 2, 2026
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 47f15b41a6


// received ghost features past the phantom slots; outputs are stripped
// of the phantom prefix before being scattered back to LAMMPS atoms
// via ``select_map``.
const int phantom_n = (nloc_real == 0 && nall_real > 0) ? 2 : 0;

P1 Badge Subtract phantom atoms before returning energy

When an empty-subdomain rank takes this new path, the two inserted type-0 phantom atoms are included in the model's energy_redu, and that reduced energy is assigned directly before any phantom prefix is stripped. DeepMD energy fitting nets can add per-type biases / nonzero zero-neighbor outputs, so phantoms are not guaranteed to have zero atomic energy; in those models every rank with nloc_real == 0 will contribute extra phantom energy to the MPI-reduced LAMMPS total. Please either remove/subtract the phantom atomic energies from energy_redu or make the model see them through a mask that excludes them from reductions.


Collaborator Author

Addressed in commit c43468fb7 (force-pushed). I went one step beyond a phantom-only subtract: a rank with no real local atoms contributes zero to the total energy by definition, so I just zero ener directly when phantom_n > 0.

The simpler-subtract approach would have been incomplete here — the spin path doubles atoms internally (real + spin half), so both halves carry the per-type-bias and zero-neighbour MLP output into energy_redu. output_map["energy"] only exposes the real half after the SpinModel's [:, :nloc] slice, so subtracting that alone would leave the spin half leaking through. (Confirmed empirically: an earlier attempt to subtract the real half showed mpi-2=-2.45 vs mpi-1=-1.49 in CI run 26796476553. The zero-out has no such residual.)

Forces / force_mag / virial don't need an analogous correction because phantom atomic outputs are coord-independent (no neighbours) so their derivatives are zero.
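
The difference between subtracting the visible half and zeroing outright can be illustrated with a toy model. The functions and the per-atom numbers below are hypothetical and chosen only to make the arithmetic exact; they are not the values from the CI run.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Reduced energy as this toy model returns it: both halves are summed.
inline double reduce_energy(const std::vector<double>& real_half,
                            const std::vector<double>& spin_half) {
  return std::accumulate(real_half.begin(), real_half.end(), 0.0) +
         std::accumulate(spin_half.begin(), spin_half.end(), 0.0);
}

// Subtracting only the per-atom energies that are visible to the caller
// leaves the hidden half in the result.
inline double subtract_visible(double energy_redu,
                               const std::vector<double>& visible) {
  return energy_redu - std::accumulate(visible.begin(), visible.end(), 0.0);
}

// Zeroing: a rank that needed phantoms has no real local atoms, so its
// contribution to the total is zero by definition.
inline double zero_if_phantom(double energy_redu, int phantom_n) {
  assert(phantom_n >= 0);
  return phantom_n > 0 ? 0.0 : energy_redu;
}
```

With two phantoms contributing 0.5 each in the real half and 0.25 each in the spin half, the reduced energy is 1.5; subtracting the visible half leaves 0.5, while zeroing leaves 0.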

@coderabbitai
Contributor

coderabbitai Bot commented Jun 2, 2026

Walkthrough

This PR modifies DeepSpinPTExpt::compute to handle ranks with zero real local atoms by prepending phantom atoms for model inputs, adjusting mapping and neighbor-list tensors so phantoms are isolated, zeroing reduced energy when needed, and stripping phantom entries from force and atomic output tensors before returning to LAMMPS.

Changes

Phantom-atom padding for zero-local-atoms corner case

| Layer | File(s) | Summary |
|---|---|---|
| Phantom-atom padding initialization | source/api_cc/src/DeepSpinPTExpt.cc | When nloc_real == 0 and nall_real > 0, prepend two phantom atoms with zero values; update nall_real, nloc_real, and nloc; reconstruct spin tensor so phantom spins remain zero and real atoms are populated via bkw_map with an offset. |
| Tensor mappings and neighbor-list preparation | source/api_cc/src/DeepSpinPTExpt.cc | Rebuild mapping_tensor so phantom slots use identity mappings and real slots are adjusted for the phantom prefix; prepend phantom_n neighbor-list rows filled with -1 to firstneigh_tensor so the model sees padded nloc while phantom atoms have no neighbors. |
| Post-inference energy zeroing | source/api_cc/src/DeepSpinPTExpt.cc | When phantom padding was applied, explicitly zero the reduced energy array ener after inference to prevent phantom contributions from affecting MPI-reduced totals. |
| Output post-processing and phantom removal | source/api_cc/src/DeepSpinPTExpt.cc | After inference, strip phantom-prefixed entries from dforce and dforce_mag and decrement nall_real; when atomic outputs are enabled, remove phantom-prefixed entries from datom_energy and datom_virial (including correct virial scaling) so per-atom outputs correspond only to real atoms. |
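
The output-side stripping summarised in the last row can be sketched with one small helper. `strip_phantom_prefix` is a hypothetical name, and the strides in the usage note (3 per atom for forces, 1 for atomic energy, 9 for atomic virial) are assumptions about the flat array layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Drop the leading phantom_n * stride entries of a flat per-atom array so that
// index 0 refers to the first real atom again. Illustrative only.
template <typename T>
void strip_phantom_prefix(std::vector<T>& values, int phantom_n,
                          std::size_t stride) {
  assert(phantom_n >= 0);
  const std::size_t drop = static_cast<std::size_t>(phantom_n) * stride;
  assert(values.size() >= drop);
  values.erase(values.begin(),
               values.begin() + static_cast<std::ptrdiff_t>(drop));
}
```

A caller would apply it as `strip_phantom_prefix(force, phantom_n, 3)` before handing the array back; with `phantom_n == 0` the call leaves the array unchanged.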

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main fix: handling the edge case where nloc==0 in DeepSpinPTExpt using phantom-atom padding, which directly addresses the SIGFPE issue described in the objectives. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@source/api_cc/src/DeepSpinPTExpt.cc`:
- Around line 487-489: The mapping assignment for real atoms fails to account
for phantom padding: when phantom_n > 0 the indices returned by fwd_map(...) are
in the pre-padding space and must be shifted by phantom_n. Update the loop that
sets mapping[ii] (which uses fwd_map[lmp_list.mapping[bkw_map[ii - phantom_n]]])
to add phantom_n to the fwd_map result so real-atom targets point into the
post-padding index range; keep all other indexing (ii, bkw_map,
lmp_list.mapping) unchanged.

📥 Commits

Reviewing files that changed from the base of the PR and between c67b7f7 and 47f15b4.

📒 Files selected for processing (1)
  • source/api_cc/src/DeepSpinPTExpt.cc

Comment on lines +487 to 489
for (int ii = phantom_n; ii < nall_real; ii++) {
mapping[ii] = fwd_map[lmp_list.mapping[bkw_map[ii - phantom_n]]];
}
Contributor

@coderabbitai coderabbitai Bot Jun 2, 2026

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Missing phantom_n offset in mapping values for real atoms.

When phantom_n > 0, fwd_map[...] returns indices in the pre-padding coordinate space [0, nall_real_orig). After phantom padding, real atoms are shifted to indices [phantom_n, nall_real), so the mapping values must be offset accordingly.

For example, with phantom_n=2: if fwd_map[...] returns 0 (first real ghost in pre-padding space), the mapping should point to index 2 (first real ghost in post-padding space), not 0 (which is now a phantom slot).

🐛 Proposed fix
     for (int ii = phantom_n; ii < nall_real; ii++) {
-      mapping[ii] = fwd_map[lmp_list.mapping[bkw_map[ii - phantom_n]]];
+      mapping[ii] = fwd_map[lmp_list.mapping[bkw_map[ii - phantom_n]]] + phantom_n;
     }
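
The index-space argument in this comment can be made concrete with a small sketch. `pad_mapping` is a hypothetical helper, not a function in the PR; it takes mapping values expressed in the pre-padding index space and returns them in the post-padding space.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Prepend identity entries for the phantoms and shift every real-atom target
// by phantom_n so that it still points at the same atom after padding.
inline std::vector<int> pad_mapping(const std::vector<int>& mapping_real,
                                    int phantom_n) {
  assert(phantom_n >= 0);
  std::vector<int> out;
  out.reserve(mapping_real.size() + static_cast<std::size_t>(phantom_n));
  for (int i = 0; i < phantom_n; ++i) {
    out.push_back(i);  // phantom slots map to themselves
  }
  for (int target : mapping_real) {
    out.push_back(target + phantom_n);  // shift into post-padding space
  }
  return out;
}
```

For a pre-padding mapping `{0, 1, 0}` and `phantom_n = 2`, the padded mapping is `{0, 1, 2, 3, 2}`: the old target 0 becomes 2, which is the first real slot rather than a phantom.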

Collaborator Author

Not reachable in current code, so I'm not applying the suggestion. The reasoning:

  1. The if (lmp_list.mapping) branch is single-rank-only. Both pair_deepmd.cpp:194 and pair_deepspin.cpp:204-257 gate lmp_list.set_mapping(...) on comm->nprocs == 1. So in multi-rank, lmp_list.mapping == nullptr and we take the else (identity-fallback) branch instead.

  2. phantom_n > 0 is multi-rank-only — it only fires when nloc_real == 0 && nall_real > 0, which can only happen in a multi-rank LAMMPS run with an empty subdomain.

  3. Combining: phantom_n > 0 implies multi-rank, which implies lmp_list.mapping == nullptr, which means the branch you're flagging is never entered. The identity-fallback branch handles phantoms correctly because mapping[ii] = ii is self-consistent regardless of phantom offsets, and the with-comm AOTI artifact ignores mapping for ghost feature exchange anyway (border_op does that).

If a future change ever populates lmp_list.mapping in multi-rank, the fix would become applicable. Until then, applying it would be defensive against an unreachable scenario.
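
The statement that `mapping[ii] = ii` is self-consistent regardless of phantom offsets can be checked with a short sketch. Both helpers are hypothetical and exist only for this illustration: prepending identity entries and shifting an identity mapping by `phantom_n` yields the identity mapping over the padded atom count.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Identity mapping over nall atoms: every atom maps to itself.
inline std::vector<int> identity_mapping(std::size_t nall) {
  std::vector<int> m(nall);
  std::iota(m.begin(), m.end(), 0);
  return m;
}

// Prepend phantom identity entries and shift the real targets by phantom_n.
inline std::vector<int> prepend_and_shift(const std::vector<int>& mapping_real,
                                          int phantom_n) {
  assert(phantom_n >= 0);
  std::vector<int> out = identity_mapping(static_cast<std::size_t>(phantom_n));
  for (int target : mapping_real) {
    out.push_back(target + phantom_n);
  }
  return out;
}
```

Because the shifted identity equals the identity over the larger range, the fallback branch needs no explicit offset handling.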

Contributor

@wanghan-iapcm, that's a clear and well-documented analysis — thank you. The mutual exclusivity is airtight:

  • lmp_list.mapping != nullptr ↔ single-rank (nprocs == 1)
  • phantom_n > 0 ↔ multi-rank empty subdomain (nloc_real == 0 && nall_real > 0)

So the two conditions can never hold simultaneously in the current LAMMPS pair-style code, and the branch is indeed unreachable. I'll withdraw the suggestion and record this constraint to avoid flagging it again.

(ᵒ ᵕ ᵒ)ᵇ


✏️ Learnings added
Learnt from: wanghan-iapcm
Repo: deepmodeling/deepmd-kit PR: 5485
File: source/api_cc/src/DeepSpinPTExpt.cc:487-489
Timestamp: 2026-06-02T06:34:42.158Z
Learning: In `source/api_cc/src/DeepSpinPTExpt.cc`, the combination `(phantom_n > 0) && (lmp_list.mapping != nullptr)` is structurally unreachable:
- `lmp_list.mapping` is populated only when `comm->nprocs == 1` (single-rank), gated in `source/lmp/pair_deepmd.cpp:194` and `source/lmp/pair_deepspin.cpp:204-257`.
- `phantom_n > 0` only fires when `nloc_real == 0 && nall_real > 0`, which can only occur in a multi-rank LAMMPS run with an empty subdomain.
Therefore the `if (lmp_list.mapping)` block inside the `ago == 0` section is never entered when phantom padding is active; the identity-fallback `else` branch handles that case correctly. Do not flag the missing `+ phantom_n` offset in the `lmp_list.mapping` branch as a bug.



🧠 Learnings used
Learnt from: wanghan-iapcm
Repo: deepmodeling/deepmd-kit PR: 5407
File: source/api_cc/src/DeepSpinPTExpt.cc:135-151
Timestamp: 2026-04-20T13:36:43.594Z
Learning: In deepmd-kit, `.pt2` model files do not include any version/schema field. When reviewing code in `*PTExpt.cc` that reads `.pt2` metadata, do not add or require a version/schema lookup/expectation. Ensure backward compatibility is implemented via metadata fallback defaults—e.g., if `do_atomic_virial` is missing, default it to `true`, and if `nnei` metadata is missing, default `nnei` to `sum(sel)`—so older exported `.pt2` models continue to work.

@codecov

codecov Bot commented Jun 2, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.36%. Comparing base (c67b7f7) to head (c43468f).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| source/api_cc/src/DeepSpinPTExpt.cc | 75.00% | 2 Missing and 6 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5485      +/-   ##
==========================================
- Coverage   81.36%   81.36%   -0.01%     
==========================================
  Files         868      868              
  Lines       96437    96463      +26     
  Branches     4233     4241       +8     
==========================================
+ Hits        78463    78484      +21     
+ Misses      16674    16673       -1     
- Partials     1300     1306       +6     


Codex flagged on PR deepmodeling#5485: phantoms have constant atomic-energy
outputs that flow into 'energy_redu'.  On the spin path the SpinModel
doubles atoms internally, so both real and spin phantom halves
contribute -- and 'output_map["energy"]' only exposes the real half
after the '[:, :nloc]' slice.  Subtracting only that real half (a
first attempt) left the spin half leaking into the MPI-reduced LAMMPS
total: CI run 26796476553 showed mpi-2 = -2.45 vs mpi-1 ref = -1.49.

Simpler exact fix: a rank with no real local atoms contributes zero
to the total energy by definition.  The phantoms are pure scaffolding
to satisfy inductor's nloc>=2 specialisation; their fitting output is
a numerical artifact, not physics.  Zero 'ener' directly when
phantom_n > 0.

Forces / force_mag / virial are unaffected because phantom outputs
are coord-independent (no neighbours) so their derivatives are zero --
no analogous correction is needed there.

Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
source/api_cc/src/DeepSpinPTExpt.cc (1)

393-401: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pad aparam_ when you synthesize phantom locals.

Lines 398-400 change the local-atom count from 0 to 2, but aparam_ stays in the pre-padding state. On an empty-local rank with daparam > 0, Lines 551-560 still send an empty tensor, so this path can still break for spin models that were exported with atomic parameters.

🧩 Suggested fix
   if (phantom_n > 0) {
     dcoord.insert(dcoord.begin(), static_cast<size_t>(phantom_n) * 3,
                   static_cast<VALUETYPE>(0));
     datype.insert(datype.begin(), static_cast<size_t>(phantom_n), 0);
+    if (daparam > 0) {
+      aparam_.insert(aparam_.begin(),
+                     static_cast<size_t>(phantom_n) * daparam,
+                     static_cast<VALUETYPE>(0));
+    }
     nall_real += phantom_n;
     nloc_real = phantom_n;
     nloc = nall_real - nghost_real;
   }

Also applies to: 550-560

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/api_cc/src/DeepSpinPTExpt.cc` around lines 393 - 401, When
synthesizing phantom locals (phantom_n > 0) you must pad the per-atom parameter
array aparam_ to match the new phantom atoms: insert
static_cast<size_t>(phantom_n) * static_cast<size_t>(daparam) default-valued
entries at the front of aparam_ (similar to how dcoord and datype are padded) so
subsequent send logic (the path around the existing tensor sends) sees the
correct size; perform this insert in the same block that updates dcoord, datype,
nall_real, nloc_real, and nloc.
♻️ Duplicate comments (1)
source/api_cc/src/DeepSpinPTExpt.cc (1)

487-489: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Offset real-atom mapping targets into the post-padding index space.

This is still using fwd_map[...] from the pre-padding layout. After Lines 395-400 prepend two phantom rows, every real/ghost row has moved by phantom_n, so the current mapping can resolve to a phantom slot instead of the intended atom row.

🐛 Minimal fix
       for (int ii = phantom_n; ii < nall_real; ii++) {
-        mapping[ii] = fwd_map[lmp_list.mapping[bkw_map[ii - phantom_n]]];
+        mapping[ii] =
+            fwd_map[lmp_list.mapping[bkw_map[ii - phantom_n]]] + phantom_n;
       }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/api_cc/src/DeepSpinPTExpt.cc` around lines 487 - 489, The mapping loop
uses fwd_map from the pre-padding layout so indices can point into phantom rows;
update the resolved index by offsetting into the post-padding layout by
phantom_n. Concretely, in the loop that sets mapping[ii] (which references
fwd_map, lmp_list.mapping and bkw_map), add phantom_n to the index into fwd_map
(e.g. use fwd_map[ lmp_list.mapping[bkw_map[ii - phantom_n]] + phantom_n ] or
otherwise shift the resolved value by phantom_n) so every real/ghost row maps
into the post-padding index space.


📥 Commits

Reviewing files that changed from the base of the PR and between 47f15b4 and c43468f.

📒 Files selected for processing (1)
  • source/api_cc/src/DeepSpinPTExpt.cc

@njzjz-bot
Contributor

Thanks for the update. CI is green now, but I think two small correctness issues still need to be addressed before approval:

  1. The mapping tensor still uses pre-padding indices for real/ghost atoms:
mapping[ii] = fwd_map[lmp_list.mapping[bkw_map[ii - phantom_n]]];

When phantom_n > 0, the real/ghost rows have been shifted by the phantom prefix, so the resolved target should be shifted into the post-padding index space, e.g. by adding phantom_n to the resolved fwd_map[...] value.

  2. If daparam > 0, aparam_ also needs phantom padding. dcoord and datype are prepended with phantom local atoms, but aparam_ remains in the pre-padding layout; the later atomic-parameter tensor path can therefore still see an empty/mismatched per-atom parameter array on empty-local ranks.

Could you please add a small follow-up commit for these two cases? After that this should be good to approve.

— OpenClaw 2026.5.28 (model: gpt-5.5)
