fix(status): surface Telegram 409 conflicts by passing sandbox name via --name flag (Fixes #2018) by sanketsh4h · Pull Request #2034 · NVIDIA/NemoClaw

sanketsh4h · 2026-04-17T19:46:16Z

Summary

checkMessagingBridgeHealth() invoked openshell sandbox exec with the sandbox name as a positional argument. openshell sandbox exec requires --name/-n, so the positional was parsed as the first word of the command and the exec exited with status 127. The function's catch-all swallowed the failure, so the "degraded" warning for Telegram 409 conflicts never appeared in nemoclaw <sandbox> status.

Related Issue

Fixes #2018

Changes

Extract checkMessagingBridgeHealth from the IIFE-style src/nemoclaw.ts into a new src/lib/messaging-bridge-health.ts so the argv shape is unit-testable.
Use the correct openshell sandbox exec -n <name> -- sh -c <script> form.
Use resolveOpenshell() directly and return an empty list if the binary is missing (matches the pattern in sandbox-state.ts / sandbox-version.ts), instead of relying on the caller's cached helper that hard-exits.
Add src/lib/messaging-bridge-health.test.ts with 9 unit tests: happy path, zero-count, non-numeric stdout, spawnSync throws, non-Telegram channels, null/undefined channels, missing binary, and two regression tests that pin the argv shape (both fail on the pre-fix arguments).
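
The corrected argv shape, and the kind of check a regression test could pin it with, might look roughly like the sketch below. The helper names (`buildSandboxExecArgs`, `buildLegacyExecArgs`, `hasNamedExecShape`) are hypothetical and do not come from the PR; the script string is whatever the caller passes.

```typescript
// Hypothetical helper (not from the PR): builds the argv for
// `openshell sandbox exec -n <name> -- sh -c <script>`.
function buildSandboxExecArgs(sandboxName: string, script: string): string[] {
  return ["sandbox", "exec", "-n", sandboxName, "--", "sh", "-c", script];
}

// The pre-fix positional shape, kept only to show what a regression test rejects.
function buildLegacyExecArgs(sandboxName: string, script: string): string[] {
  return ["sandbox", "exec", sandboxName, "sh", "-c", script];
}

// True when the sandbox name directly follows `-n` and `--` precedes `sh`.
function hasNamedExecShape(args: string[], sandboxName: string): boolean {
  const nameFlag = args.indexOf("-n");
  const separator = args.indexOf("--");
  return (
    nameFlag !== -1 &&
    args[nameFlag + 1] === sandboxName &&
    separator > nameFlag + 1 &&
    args[separator + 1] === "sh"
  );
}
```

Keeping the argv construction in a pure function is one way to make the shape assertable without spawning a process; the PR itself achieves testability by extracting the whole check into its own module.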

Note for reviewer

The same sandbox exec <name> … positional pattern appears in a few other places:

src/lib/agent-onboard.ts:135,177 — agent readiness probes
src/lib/sandbox-state.ts:388 — post-restore chown during rebuild

Those are outside this issue's scope; flagging in case you want a follow-up.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

AI Disclosure

AI-assisted — tool:

Signed-off-by: sanketsh4h [email protected]

Summary by CodeRabbit

New Features
- Adds a messaging-bridge health check that reports Telegram conflict counts when present.
Refactor
- Moves bridge health logic into a dedicated module and improves reliability, error handling, and sandboxed invocation.
Tests
- Adds comprehensive tests covering success, failure, parsing, argument construction, and edge cases.

…ia --name flag (NVIDIA#2018)

`checkMessagingBridgeHealth()` invoked `openshell sandbox exec` with the sandbox name as a positional argument. `openshell sandbox exec` requires `--name` / `-n`, so the positional gets parsed as the first word of the command and fails with exit 127. The function's catch-all swallowed the error, so the "degraded" warning for Telegram 409 conflicts never appeared in `nemoclaw <sandbox> status`.

Extract `checkMessagingBridgeHealth` into `src/lib/messaging-bridge-health.ts` so the argv shape is unit-testable (the IIFE-style `nemoclaw.ts` exports nothing). The new module:

- Uses the correct `sandbox exec -n <name> -- sh -c <script>` form.
- Uses `resolveOpenshell()` directly and returns an empty list if the binary is missing, rather than relying on the caller's cached helper.

Unit tests pin the argv shape (the two regression cases fail on the pre-fix arguments) and cover the happy path, zero-count, non-numeric, throw, and non-Telegram channels.

Note for reviewer: the same `sandbox exec <name> …` positional pattern appears at `src/lib/agent-onboard.ts:135,177` and `src/lib/sandbox-state.ts:388`. Those callsites are outside this issue's scope; flagging for a possible follow-up.

Signed-off-by: sanketsh4h <[email protected]>

coderabbitai · 2026-04-17T19:46:32Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a8622fc6-4a61-40e8-93e4-43bf10b1bf0c

📥 Commits

Reviewing files that changed from the base of the PR and between 8b754e1 and 3aeb8e6.

📒 Files selected for processing (1)

src/lib/messaging-bridge-health.ts

📝 Walkthrough

Extracts Telegram bridge health-check into src/lib/messaging-bridge-health.ts, adds Vitest tests, and updates src/nemoclaw.ts to import the new module and adjust openshell sandbox exec argument ordering and error handling. The new function runs an openshell probe, parses stdout, and returns conflict counts for Telegram.

Changes

| Cohort / File(s) | Summary |
|---|---|
| New Health Module `src/lib/messaging-bridge-health.ts` | Adds `checkMessagingBridgeHealth(sandboxName, channels)` which resolves `openshell`, runs `spawnSync(['sandbox','exec','-n', name, '--', 'sh','-c', <tail/grep>], { timeout: 3000, encoding: 'utf8' })`, parses stdout as integer, and returns `[{ channel: "telegram", conflicts }]` or `[]` on failures/zero/non-applicable channels. |
| Module Test Suite `src/lib/messaging-bridge-health.test.ts` | New Vitest suite mocking `resolveOpenshell` and `child_process.spawnSync`. Covers channel filtering (`null`/`undefined`/missing `"telegram"`), missing binary, parse failures/zero, thrown errors/non-zero status/error-shaped results, success path, and regression asserting `-n` and `--` placement in argv. |
| Caller / Exec arg & error handling `src/nemoclaw.ts` | Replaces inline health-check with imported `checkMessagingBridgeHealth`, updates `readGatewayLog` to call `openshell sandbox exec -n <name> -- sh -c ...`, and returns early when `result.error` is present or exit status is non-zero. |
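
The flow summarized in the table could be sketched as below. This is an illustration under assumptions, not the module from the PR: the binary resolver and process runner are injected so the sketch runs without an `openshell` binary, the probe script is a caller-supplied placeholder (the real tail/grep script is not shown in this thread), and the names `checkBridgeHealthSketch`, `ProbeDeps`, `ProbeResult`, and `BridgeHealthEntry` are hypothetical.

```typescript
// Result shape assumed from the walkthrough: [{ channel: "telegram", conflicts }].
interface BridgeHealthEntry {
  channel: string;
  conflicts: number;
}

// Minimal subset of what Node's spawnSync returns; enough for this sketch.
interface ProbeResult {
  stdout: string | null;
  status: number | null;
  error?: Error;
}

// Injected dependencies (hypothetical) so the flow can run without openshell.
interface ProbeDeps {
  resolveBinary: () => string | null; // stands in for resolveOpenshell()
  run: (binary: string, args: string[]) => ProbeResult; // stands in for spawnSync
  script: string; // placeholder for the log-counting shell script
}

function checkBridgeHealthSketch(
  sandboxName: string,
  channels: string[] | null | undefined,
  deps: ProbeDeps,
): BridgeHealthEntry[] {
  // Only Telegram is probed; anything else yields an empty list.
  if (!channels || !channels.includes("telegram")) return [];
  const binary = deps.resolveBinary();
  if (!binary) return [];
  try {
    const args = ["sandbox", "exec", "-n", sandboxName, "--", "sh", "-c", deps.script];
    const result = deps.run(binary, args);
    // A failed spawn or non-zero exit must not be parsed as a count.
    if (result.error || result.status !== 0) return [];
    const conflicts = Number.parseInt((result.stdout ?? "").trim(), 10);
    return !Number.isNaN(conflicts) && conflicts > 0
      ? [{ channel: "telegram", conflicts }]
      : [];
  } catch {
    return [];
  }
}
```

With a fake runner that returns `{ stdout: "3\n", status: 0 }`, the sketch yields one Telegram entry with three conflicts; every failure path collapses to an empty list, which matches the behavior the table describes.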

Sequence Diagram(s)

sequenceDiagram
    rect rgba(200,200,255,0.5)
    participant Nemoclaw
    end
    rect rgba(200,255,200,0.5)
    participant Resolver as resolveOpenshell
    end
    rect rgba(255,200,200,0.5)
    participant Openshell as spawnSync / openshell
    end
    Nemoclaw->>Resolver: request openshell path
    Resolver-->>Nemoclaw: binaryPath or null
    alt binaryPath present and channels include "telegram"
        Nemoclaw->>Openshell: spawnSync(['sandbox','exec','-n', name,'--','sh','-c', <tail/grep>], timeout:3s)
        Openshell-->>Nemoclaw: { stdout, status, error }
        Nemoclaw->>Nemoclaw: parse stdout -> integer count
        alt count > 0
            Nemoclaw-->>Nemoclaw: return [{ channel: "telegram", conflicts: count }]
        else
            Nemoclaw-->>Nemoclaw: return []
        end
    else
        Nemoclaw-->>Nemoclaw: return []
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hopped through logs and bashy shells,
Counted conflicts in their tiny cells.
I fetched a path, then ran a probe,
Returned a number from the lobe.
🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the primary fix: correcting the openshell sandbox exec invocation by using the --name flag for sandbox name passing, which addresses the root cause of Telegram 409 conflicts not surfacing in status output. |

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/nemoclaw.ts (1)

1092-1095: ⚠️ Potential issue | 🟡 Minor

Use the fixed sandbox exec argv shape in readGatewayLog too.

Now that the new probe can surface degraded Telegram bridges again, this helper becomes the next step in that status flow. It still calls openshell sandbox exec as exec <name> sh -c ..., so affected OpenShell builds will exit 127 here and silently suppress the Hermes gateway-log tail.

Suggested fix

     const result = spawnSync(
       getOpenshellBinary(),
-      ["sandbox", "exec", sandboxName, "sh", "-c", "tail -n 10 /tmp/gateway.log 2>/dev/null"],
+      [
+        "sandbox",
+        "exec",
+        "-n",
+        sandboxName,
+        "--",
+        "sh",
+        "-c",
+        "tail -n 10 /tmp/gateway.log 2>/dev/null",
+      ],
       { encoding: "utf-8", timeout: 3000, stdio: ["ignore", "pipe", "pipe"] },
     );

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1092 - 1095, The spawnSync call in
readGatewayLog uses the old argv shape for openshell sandbox exec; pass the
sandbox name via the "-n" flag and place the "--" separator before the command
so it is executed correctly. Update the spawnSync invocation that constructs
the args array (currently ["sandbox","exec", sandboxName, "sh","-c", "tail -n 10
/tmp/gateway.log 2>/dev/null"]) to use ["sandbox","exec","-n", sandboxName,
"--", "sh","-c", "tail -n 10 /tmp/gateway.log 2>/dev/null"] so openshell runs
the shell command instead of returning 127.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/messaging-bridge-health.ts`:
- Around line 40-47: The code parses spawnSync's stdout without checking for
spawn/timeout failures; update the block around spawnSync(binary, args, ...) to
first inspect the returned result (the result variable) for errors or
non-successful termination (check result.error, result.status !== 0, or
result.signal/timeout conditions) and bail out (return []) or handle the error
before attempting Number.parseInt on result.stdout; include stderr or
result.error in any logs to aid debugging and only parse stdout when
result.status === 0 and result.error is undefined.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 8ac7347a-8af4-47b2-b6f8-ddc698960641

📥 Commits

Reviewing files that changed from the base of the PR and between 9222cae and f17bd33.

📒 Files selected for processing (3)

src/lib/messaging-bridge-health.test.ts
src/lib/messaging-bridge-health.ts
src/nemoclaw.ts

…guard spawnSync errors

- Apply the same `sandbox exec -n <name> -- sh -c …` fix to `readGatewayLog` in `src/nemoclaw.ts` (same positional-arg bug as `checkMessagingBridgeHealth`, spotted in review).
- Check `result.error` and `result.status !== 0` before parsing `spawnSync` stdout in `messaging-bridge-health.ts`; an exec failure (exit 127, timeout SIGTERM) produced a non-zero status that the previous code did not guard against.
- Add two new unit tests covering the non-zero exit and timeout/error cases (11 tests total, all pass).

Signed-off-by: sanketsh4h <[email protected]>
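
The guard described in this commit can be illustrated with a small pure function. This is a sketch, not the code from the PR: `parseConflictCount` and `SpawnLikeResult` are hypothetical names, and the interface is a hand-written subset of the object Node's `spawnSync` returns. The digits-only check is an extra strictness chosen for this sketch; the PR is only described as parsing stdout as an integer.

```typescript
// Subset of the object returned by Node's child_process.spawnSync.
interface SpawnLikeResult {
  stdout: string | null;
  status: number | null; // null when the child was killed by a signal
  signal: string | null; // e.g. "SIGTERM" after a timeout
  error?: Error; // set when spawning failed or the timeout fired
}

// Returns the parsed count, or null when the result must not be trusted.
function parseConflictCount(result: SpawnLikeResult): number | null {
  // spawnSync reports these failures in fields instead of throwing.
  if (result.error || result.signal || result.status !== 0) return null;
  const text = (result.stdout ?? "").trim();
  // Accept digits only, so output such as "3 errors" is not read as 3.
  if (!/^\d+$/.test(text)) return null;
  return Number.parseInt(text, 10);
}
```

A caller would treat `null` and `0` alike as "nothing to report" and surface the degraded warning only for a positive count.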

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/nemoclaw.ts (1)

1092-1099: ⚠️ Potential issue | 🟠 Major

Guard readGatewayLog on spawn failure/non-zero exit before returning stdout.

The current implementation returns stdout even when the spawnSync call fails. Node.js spawnSync does not throw on a non-zero exit code; it reports spawn failures and timeouts through result.error and the exit code through result.status. The try-catch block therefore does not catch exit failures, so misleading log output can be returned instead of null.

Suggested fix

   try {
     const result = spawnSync(
       getOpenshellBinary(),
       ["sandbox", "exec", "-n", sandboxName, "--", "sh", "-c", "tail -n 10 /tmp/gateway.log 2>/dev/null"],
       { encoding: "utf-8", timeout: 3000, stdio: ["ignore", "pipe", "pipe"] },
     );
+    if (result.error || result.status !== 0) return null;
     const output = (result.stdout || "").trim();
     return output || null;
   } catch {
     return null;
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1092 - 1099, The readGatewayLog function
currently returns result.stdout even when spawnSync failed or exited non-zero;
update the code after calling spawnSync (and before using result.stdout) to
check result.error (exception), result.status !== 0, or result.signal and treat
those as failures by logging/ignoring and returning null; reference the
spawnSync call and the local result variable (check result.error, result.status,
result.stderr) and only return (result.stdout || "").trim() when the call
succeeded (status === 0 and no error), otherwise return null.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: acba7a42-a3a0-4055-9e39-83d15ef2b27f

📥 Commits

Reviewing files that changed from the base of the PR and between f17bd33 and 202b1d1.

📒 Files selected for processing (3)

src/lib/messaging-bridge-health.test.ts
src/lib/messaging-bridge-health.ts
src/nemoclaw.ts

spawnSync does not throw on non-zero exit; without checking result.error and result.status, stdout from a failed exec could be returned as log content. Match the guard already applied to checkMessagingBridgeHealth.

Signed-off-by: sanketsh4h <[email protected]>

Introduce mockSpawn() to replace repeated full spawnSync result literals in the test suite and trim the module header comment. Behavior unchanged; all 11 tests still pass.

Signed-off-by: sanketsh4h <[email protected]>
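
A helper of the kind this commit describes might look like the sketch below. The actual signature of `mockSpawn()` is not shown in this thread, so the parameter shape here is a guess; `FakeSpawnResult` is a hypothetical local type modeled on the fields of Node's `SpawnSyncReturns<string>`.

```typescript
// Local stand-in for Node's SpawnSyncReturns<string>, declared here so the
// sketch is self-contained.
interface FakeSpawnResult {
  pid: number;
  output: (string | null)[];
  stdout: string;
  stderr: string;
  status: number | null;
  signal: string | null;
  error?: Error;
}

// Hypothetical helper: defaults describe a clean exit, and each test
// overrides only the fields it cares about.
function mockSpawn(overrides: Partial<FakeSpawnResult> = {}): FakeSpawnResult {
  const stdout = overrides.stdout ?? "";
  const stderr = overrides.stderr ?? "";
  return {
    pid: 1234,
    output: [null, stdout, stderr],
    stdout,
    stderr,
    status: 0,
    signal: null,
    ...overrides,
  };
}
```

A test would then write `mockSpawn({ stdout: "2\n" })` or `mockSpawn({ status: 127 })` instead of repeating the full result literal in every case.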

coderabbitai

🧹 Nitpick comments (1)

src/lib/messaging-bridge-health.ts (1)

7-10: Consider reusing a single bridge-health result type across modules.

BridgeConflict duplicates the shape of MessagingBridgeHealth in src/lib/inventory-commands.ts. Consolidating to one shared type would reduce drift risk and keep contracts consistent.
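
One way to consolidate without touching every consumer is a type alias, sketched below. The field types of `MessagingBridgeHealth` are assumed from the return value described in the walkthrough; in the repository the interface would be imported from `src/lib/inventory-commands.ts` rather than declared inline, and `describeConflict` is a hypothetical consumer added only to show the alias in use.

```typescript
// Assumed shape; declared inline only to keep this sketch self-contained.
interface MessagingBridgeHealth {
  channel: string;
  conflicts: number;
}

// Compatibility alias: existing references to BridgeConflict keep compiling
// while only one definition of the shape exists.
type BridgeConflict = MessagingBridgeHealth;

// Hypothetical consumer that accepts either name for the same type.
function describeConflict(entry: BridgeConflict): string {
  return `${entry.channel}: ${entry.conflicts} conflict(s)`;
}
```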

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/lib/messaging-bridge-health.ts` around lines 7 - 10, The BridgeConflict
interface duplicates the existing MessagingBridgeHealth shape; replace the local
BridgeConflict definition with a reuse/import of the single shared type
(MessagingBridgeHealth) and update any references to BridgeConflict to refer to
MessagingBridgeHealth (or re-export MessagingBridgeHealth under the
BridgeConflict name if needed for compatibility). Locate the declaration of
BridgeConflict in this file and remove it, add an import statement for
MessagingBridgeHealth, and adjust exported types/consumers accordingly so only
the shared MessagingBridgeHealth type is used across modules.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 6fac6bf3-55fe-4153-99b5-c32064e79560

📥 Commits

Reviewing files that changed from the base of the PR and between 9088b02 and 8b754e1.

📒 Files selected for processing (2)

src/lib/messaging-bridge-health.test.ts
src/lib/messaging-bridge-health.ts

🚧 Files skipped from review as they are similar to previous changes (1)

src/lib/messaging-bridge-health.test.ts

…dgeConflict

Drop the local BridgeConflict interface and import MessagingBridgeHealth from inventory-commands, which already defines the same shape and is the type the consumer expects. Eliminates drift risk between the two.

Signed-off-by: sanketsh4h <[email protected]>

wscurran · 2026-04-20T04:04:41Z

✨ Thanks for submitting this PR that proposes a fix to surface Telegram 409 conflicts by passing sandbox name via --name flag, which could help resolve issues with the Telegram integration.

Possibly related open issues:

#2018 [NemoClaw][Brev][Agent] checkMessagingBridgeHealth() passes sandbox name as positional arg to openshell exec, silently failing

sanketsh4h · 2026-04-20T20:06:02Z

Hi William 👋🏽. Thank you so much for looking at my PR! Please let me know if any improvements/edits are needed before it can be merged.

ericksoa · 2026-05-05T05:17:59Z

Thanks @sanketsh4h for the contribution and for identifying the broken Telegram bridge-health exec path. We landed the maintained fix for #2018 in #2037, and current main now uses the named sandbox exec form for the status bridge-health probe and gateway-log tail with regression coverage. Closing this PR as superseded by #2037.

coderabbitai Bot reviewed Apr 17, 2026

coderabbitai Bot reviewed Apr 17, 2026

sanketsh4h added 2 commits April 17, 2026 13:09

refactor(test): collapse spawnSync mocks into a helper

8b754e1

coderabbitai Bot reviewed Apr 18, 2026

wscurran added the NemoClaw CLI, Integration: Telegram, and fix labels on Apr 20, 2026

wscurran mentioned this pull request Apr 28, 2026

[Brev][Agent] checkMessagingBridgeHealth() passes sandbox name as positional arg to openshell exec, silently failing #2018

Closed

ericksoa added the status: superseded label on May 5, 2026

ericksoa closed this May 5, 2026

sanketsh4h wants to merge 5 commits into NVIDIA:main from sanketsh4h:fix/2018-messaging-bridge-health-exec-args

3 participants
