fix(blueprint): chown copied files to sandbox user after restore (#1229) by ColinM-sys · Pull Request #1667 · NVIDIA/NemoClaw

ColinM-sys · 2026-04-09T04:28:40Z

Summary

- After `restoreIntoSandbox()` calls `openshell sandbox cp`, run a best-effort `chown -R sandbox:sandbox /sandbox/.openclaw-data` so the writable side of the symlink tree is owned by the sandbox user.
- Failure of the chown does not fail the restore (best-effort hardening).

Why

openshell sandbox cp runs as root inside the pod, so files copied into /sandbox/.openclaw via restoreIntoSandbox land as root:root. The symlinks under /sandbox/.openclaw point at the writable /sandbox/.openclaw-data tree, so without an explicit chown the agent workspace and per-agent runtime dirs end up unwritable by the sandbox user. The reporter saw:

EACCES: permission denied, open ".../agents/maggie_agent/agent/models.json"
Write: to ~/.openclaw-data/workspace-maggie_agent/MEETINGS.md failed

Test plan

Manual review of the changed call site to confirm the cp success path is preserved exactly and that chown failure cannot flip the result.
Note on automated tests: nemoclaw/src/blueprint/snapshot.test.ts has pre-existing vitest 4.x mocking breakage on main — 8 of its 16 tests fail before this change because the vi.mock("node:fs", ...) harness doesn't return what the production code expects under vitest 4.x. The failure surfaces as expected null not to be null from createSnapshot() and identical patterns in restoreIntoSandbox. This change adds zero new failures (8/16 fail before, still 8/16 fail after — same tests). I held back regression tests for the chown call shape because they would have hit the same mock-fs issue. Happy to add them in a follow-up PR once the snapshot test harness is repaired (which is itself worth tracking as a separate issue).

Summary by CodeRabbit

Bug Fixes
- Restore now fails immediately if the initial sandbox copy encounters an error, preventing partial restores.
- After a successful copy, sandbox data ownership is corrected in a best-effort step; ownership-fix failures no longer block the restore.
- Improved reliability of sandbox file handling and post-restore permissions.

Signed-off-by: ColinM-sys <cmcdonough@50words.com>

coderabbitai · 2026-04-09T04:28:57Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 3d2da528-fbae-44a8-9437-9f08cd019425

📥 Commits

Reviewing files that changed from the base of the PR and between 58c8278 and 80f59f6.

📒 Files selected for processing (1)

nemoclaw/src/blueprint/snapshot.ts

📝 Walkthrough

restoreIntoSandbox() now returns false immediately when the initial openshell sandbox cp exits with a non-zero exitCode. If the copy succeeds, it runs a best-effort openshell sandbox exec <sandboxName> -- chown -R sandbox:sandbox /sandbox/.openclaw-data with { reject: false }, then returns true regardless of the chown outcome.
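
The control flow described above can be sketched roughly as follows. This is an illustration only, not the code in `snapshot.ts`: the `Runner` type and the `restoreSketch` name are hypothetical stand-ins so the example runs without `execa` or the `openshell` CLI, and the `sandbox cp` arguments are assumed from the sequence diagram rather than taken from the real call site.

```typescript
// Hypothetical stand-in for execa(..., { reject: false }):
// resolves with an exit code instead of throwing on failure.
type RunResult = { exitCode: number };
type Runner = (cmd: string, args: string[]) => Promise<RunResult>;

async function restoreSketch(
  run: Runner,
  bundlePath: string,
  sandboxName: string,
): Promise<boolean> {
  // Step 1: copy the bundle into the sandbox. A non-zero exit fails the restore.
  const cp = await run("openshell", [
    "sandbox",
    "cp",
    bundlePath,
    `${sandboxName}:/sandbox/.openclaw`, // assumed destination form
  ]);
  if (cp.exitCode !== 0) {
    return false;
  }

  // Step 2: best-effort ownership fix. The result is deliberately ignored,
  // so a failing chown cannot flip the restore result.
  await run("openshell", [
    "sandbox",
    "exec",
    sandboxName,
    "--",
    "chown",
    "-R",
    "sandbox:sandbox",
    "/sandbox/.openclaw-data",
  ]);
  return true;
}
```

Passing the runner in as a parameter is a choice made for this sketch; it keeps the example testable with a plain function instead of a module mock.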

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Ownership Correction & Control Flow `nemoclaw/src/blueprint/snapshot.ts` | On `openshell sandbox cp` non-zero exit, return `false` immediately. After successful copy, invoke `openshell sandbox exec <sandboxName> -- chown -R sandbox:sandbox /sandbox/.openclaw-data` as a best-effort step with `{ reject: false }` (ignore chown errors), then return `true`. |

Sequence Diagram(s)

sequenceDiagram
    participant CL as Caller
    participant OS as OpenShell CLI
    participant SB as Sandbox (container filesystem)

    CL->>OS: sandbox cp <bundle> <sandbox>:/sandbox/.openclaw
    alt cp fails (exitCode != 0)
        OS-->>CL: non-zero exitCode -> restoreIntoSandbox returns false
    else cp succeeds
        OS-->>SB: files copied as root:root
        CL->>OS: sandbox exec <sandbox> -- chown -R sandbox:sandbox /sandbox/.openclaw-data (reject:false)
        OS-->>CL: chown result (ignored)
        CL-->>CL: restoreIntoSandbox returns true
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped inside the sandbox lair,

I nudged the roots with gentle care,
A chown attempt, though small and meek,
Files now kinder to those who seek,
Hooray — migrations smile this week! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding a chown operation after the restore process to fix sandbox user file ownership. |
| Linked Issues check | ✅ Passed | The pull request fully addresses the requirements in issue `#1229` by implementing a chown step after openshell sandbox cp with non-fatal failure handling as suggested. |
| Out of Scope Changes check | ✅ Passed | All changes in the pull request are directly scoped to fixing issue `#1229`; no unrelated modifications were introduced. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nemoclaw/src/blueprint/snapshot.ts`:
- Around line 110-123: Add regression tests that cover the new best-effort chown
branch: in tests for the code that calls execa("openshell", ["sandbox","exec",
sandboxName, "--", "chown", "-R", "sandbox:sandbox",
"/sandbox/.openclaw-data"]), assert that the second execa invocation is executed
(mock/stub execa and verify it was called with the exact args including
sandboxName and "/sandbox/.openclaw-data") and assert that when that chown
invocation fails (simulate non-zero exit via reject:false behavior or a thrown
error) the surrounding function still returns true; add tests co-located with
snapshot.ts (and follow the project test pattern for
{nemoclaw/src/blueprint,bin/lib}/*) so the security-sensitive ownership
correction path is covered.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3dac8d77-f735-418a-848a-bb62e8e7e9e0

📥 Commits

Reviewing files that changed from the base of the PR and between e8b30a2 and 1a18f62.

📒 Files selected for processing (1)

nemoclaw/src/blueprint/snapshot.ts

coderabbitai · 2026-04-09T04:32:35Z

+  await execa(
+    "openshell",
+    [
+      "sandbox",
+      "exec",
+      sandboxName,
+      "--",
+      "chown",
+      "-R",
+      "sandbox:sandbox",
+      "/sandbox/.openclaw-data",
+    ],
+    { reject: false },
+  );


⚠️ Potential issue | 🟠 Major

Add regression tests for the new best-effort chown branch.

Line 110-Line 123 introduces a security-sensitive ownership correction, but there’s no co-located test asserting (1) the second execa call is made and (2) a failing chown still returns true.

Suggested test additions

diff --git a/nemoclaw/src/blueprint/snapshot.test.ts b/nemoclaw/src/blueprint/snapshot.test.ts
@@
   it("calls openshell sandbox cp and returns true on success", async () => {
     addDir(`${SNAP}/openclaw`);
     mockExeca.mockResolvedValue({ exitCode: 0 });
     expect(await restoreIntoSandbox(SNAP, "mybox")).toBe(true);
@@
   });
+  it("runs best-effort chown after successful copy", async () => {
+    addDir(`${SNAP}/openclaw`);
+    mockExeca
+      .mockResolvedValueOnce({ exitCode: 0 }) // cp
+      .mockResolvedValueOnce({ exitCode: 1 }); // chown (ignored)
+
+    expect(await restoreIntoSandbox(SNAP, "mybox")).toBe(true);
+    expect(mockExeca).toHaveBeenNthCalledWith(
+      2,
+      "openshell",
+      [
+        "sandbox",
+        "exec",
+        "mybox",
+        "--",
+        "chown",
+        "-R",
+        "sandbox:sandbox",
+        "/sandbox/.openclaw-data",
+      ],
+      { reject: false },
+    );
+  });

As per coding guidelines {nemoclaw/src/blueprint,bin/lib}/**/*.{js,ts}: Security-sensitive code paths in isolation/sandbox features must have extra test coverage to prevent credential leaks and sandbox escapes.

wscurran · 2026-04-09T14:10:07Z

✨ Thanks for submitting this PR, which proposes a way to fix the file ownership issue in the sandbox and may help resolve issues related to file access and permissions.

Possibly related open issues:

#1229 [bug] Named agent workspace dirs created as root:root via openshell sandbox cp (no chown after restore)

coderabbitai

🧹 Nitpick comments (1)

nemoclaw/src/blueprint/snapshot.ts (1)

110-123: Consider logging chown failures for observability.

The chown is intentionally best-effort, but discarding the result entirely makes it hard to diagnose when writes later fail due to ownership issues. A debug-level log on non-zero exit would aid troubleshooting without violating the "don't fail the restore" semantics.

💡 Optional: capture and log chown result

-  await execa(
+  const chownResult = await execa(
     "openshell",
     [
       "sandbox",
       "exec",
       sandboxName,
       "--",
       "chown",
       "-R",
       "sandbox:sandbox",
       "/sandbox/.openclaw-data",
     ],
     { reject: false },
   );
+  if (chownResult.exitCode !== 0) {
+    // Best-effort: log but don't fail
+    console.debug(
+      `chown in sandbox ${sandboxName} exited ${chownResult.exitCode}: ${chownResult.stderr}`,
+    );
+  }
   return true;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@nemoclaw/src/blueprint/snapshot.ts` around lines 110 - 123, Capture the
result of the execa call in snapshot.ts (e.g., const result = await execa(...))
instead of discarding it, then check result.exitCode (or result.failed) and, on
non-zero exit, emit a debug-level log including sandboxName and the command
output (result.stderr / result.stdout) so chown failures are observable while
preserving the best-effort semantics of the existing chown call.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 11a498b2-76ae-4c38-b3da-2ce5056bbe7b

📥 Commits

Reviewing files that changed from the base of the PR and between 1a18f62 and 58c8278.

📒 Files selected for processing (1)

nemoclaw/src/blueprint/snapshot.ts

wscurran · 2026-04-15T00:59:10Z

We appreciate this fix — ensuring copied files are chowned to the sandbox user after restore is the kind of detail that prevents subtle permission failures at runtime. The codebase has changed significantly since this was opened, including a TypeScript migration (#1673) and OpenShell updates beyond v0.0.16. Could you rebase onto the current main? And since you also have #1656, #1676, and #1677 open, a joint rebase across all four would be ideal.

`openshell sandbox cp` runs as root inside the pod, so files copied into /sandbox/.openclaw via restoreIntoSandbox land as root:root. The symlinks under /sandbox/.openclaw point at the writable /sandbox/.openclaw-data tree, so without an explicit chown the agent workspace and per-agent runtime dirs end up unwritable by the sandbox user. That broke writes to models.json, agent state, and workspace markdown files with `EACCES: permission denied`.

After a successful cp, run a best-effort recursive `chown -R sandbox:sandbox /sandbox/.openclaw-data` via `openshell sandbox exec`. The chown is intentionally best-effort (its failure does NOT flip the restore result) so a future runtime that already gets ownership right won't break the migration.

Tests: snapshot.test.ts has pre-existing vitest 4.x mocking breakage on main (8/16 tests fail before this change, same 8/16 fail after, so no new regressions). The mock-fs harness needs a separate fix; once that lands, regression tests can be added that assert the chown call shape and the best-effort semantics.

Refs: NVIDIA#1229
Signed-off-by: ColinM-sys <cmcdonough@50words.com>

ColinM-sys · 2026-04-15T02:23:35Z

Rebased onto current main along with #1656, #1676, and #1677. Thank you!

ericksoa

LGTM — correct fix for the root:root ownership bug after sandbox cp. Best-effort chown semantics are right. We'll follow up separately with debug logging and test coverage once the snapshot test harness is repaired.

…store Follow-up to #1667 — the chown-after-restore path had no test coverage and silently discarded failures. Add three tests (chown called, chown failure still returns true, chown skipped on cp failure) and a console.debug breadcrumb when chown exits non-zero. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

…store (#1899)

## Summary

Follow-up to #1667: the best-effort chown-after-restore path had no test coverage and silently discarded failures.

- Add three tests for the chown behavior in `restoreIntoSandbox`:
  - chown is called after successful cp with correct args
  - chown failure still returns true (best-effort semantics)
  - chown is not called when cp fails
- Add `console.debug` breadcrumb when chown exits non-zero, so ownership failures leave a trace instead of vanishing silently

## Test plan

- [x] `npx vitest run nemoclaw/src/blueprint/snapshot.test.ts`: 19/19 pass
- [x] `npm run build` (plugin): clean
- [x] `make check`: all hooks pass
- [x] Pre-existing `test/policies.test.ts` failures (15) confirmed on main; unrelated

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

## Summary by CodeRabbit

## Release Notes

* **Improvements**
  * Enhanced sandbox restoration process with improved logging and error handling for permission operations.
* **Tests**
  * Added comprehensive test coverage for sandbox restoration scenarios, including success, partial failure, and error cases.
* **Documentation**
  * Added documentation for AI-assisted label triage workflow and guidelines.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
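
The three behaviors named in the follow-up (chown called after a successful cp, chown failure still returning true, chown skipped when cp fails) and the debug breadcrumb might look something like the sketch below. It is not the merged code or the merged tests: `Exec`, `restoreWithBreadcrumb`, and `fakeExec` are hypothetical names, the `sandbox cp` arguments are assumed, and a hand-rolled recording fake is used in place of the project's vitest mocks.

```typescript
// Hypothetical types standing in for an execa call made with { reject: false }.
type ExecResult = { exitCode: number; stderr: string };
type Exec = (args: string[]) => Promise<ExecResult>;

// Sketch of the restore flow with a debug breadcrumb on chown failure.
async function restoreWithBreadcrumb(
  exec: Exec,
  snapshotDir: string,
  sandboxName: string,
  debug: (msg: string) => void = console.debug,
): Promise<boolean> {
  const cp = await exec([
    "sandbox",
    "cp",
    snapshotDir,
    `${sandboxName}:/sandbox/.openclaw`, // assumed destination form
  ]);
  if (cp.exitCode !== 0) {
    return false; // chown is never attempted when cp fails
  }

  const chown = await exec([
    "sandbox",
    "exec",
    sandboxName,
    "--",
    "chown",
    "-R",
    "sandbox:sandbox",
    "/sandbox/.openclaw-data",
  ]);
  if (chown.exitCode !== 0) {
    // Leave a trace, but keep the best-effort semantics.
    debug(`chown in sandbox ${sandboxName} exited ${chown.exitCode}: ${chown.stderr}`);
  }
  return true;
}

// Recording fake: answers calls with the queued exit codes, in order,
// and remembers the arguments of every call it received.
function fakeExec(exitCodes: number[]): { exec: Exec; calls: string[][] } {
  const calls: string[][] = [];
  const exec: Exec = async (args) => {
    calls.push(args);
    const exitCode = exitCodes[calls.length - 1] ?? 0;
    return { exitCode, stderr: exitCode === 0 ? "" : "simulated failure" };
  };
  return { exec, calls };
}
```

With this fake, `fakeExec([0, 1])` models a successful cp followed by a failing chown, and `fakeExec([1])` models a failing cp; inspecting `calls.length` afterwards shows whether the chown step ran.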

ColinM-sys · 2026-04-15T05:59:48Z

Thanks Aaron! Glad the best-effort approach is the right call.

coderabbitai Bot reviewed Apr 9, 2026

wscurran added the OpenShell (Support for OpenShell, a safe, private runtime for autonomous AI agents) and fix labels Apr 9, 2026

ColinM-sys force-pushed the fix/1229-chown-after-restore branch from 1a18f62 to 58c8278 Apr 10, 2026 01:18

coderabbitai Bot reviewed Apr 10, 2026

wscurran mentioned this pull request Apr 14, 2026: test(blueprint): normalize cross-platform test paths #1862 (Closed, 4 tasks)

cv added the v0.0.16 (Release target) label Apr 14, 2026

wscurran mentioned this pull request Apr 15, 2026: fix(cli): close stale-lock cleanup race in acquireOnboardLock (#1281) #1656 (Merged, 5 tasks)

wscurran added the status: rebase (PR needs to be rebased against main before review can continue) label Apr 15, 2026

ColinM-sys force-pushed the fix/1229-chown-after-restore branch from 58c8278 to 80f59f6 Apr 15, 2026 02:23

ericksoa approved these changes Apr 15, 2026

ericksoa merged commit e085477 into NVIDIA:main Apr 15, 2026 (1 check passed)

ericksoa mentioned this pull request Apr 15, 2026: test(blueprint): add chown coverage and debug logging for snapshot restore #1899 (Merged, 4 tasks)

wscurran mentioned this pull request Apr 15, 2026: docs(k8s): document evaluation-only patterns and production alternatives (#1442) #1676 (Closed, 3 tasks)
