fix(status): surface Telegram 409 conflicts by passing sandbox name via --name flag (Fixes #2018)#2034
Conversation
…ia --name flag (NVIDIA#2018) `checkMessagingBridgeHealth()` invoked `openshell sandbox exec` with the sandbox name as a positional argument. `openshell sandbox exec` requires `--name` / `-n` — the positional gets parsed as the first word of the command and fails with exit 127. The function's catch-all swallowed the error, so the "degraded" warning for Telegram 409 conflicts never appeared in `nemoclaw <sandbox> status`. Extract `checkMessagingBridgeHealth` into `src/lib/messaging-bridge-health.ts` so the argv shape is unit-testable (the IIFE-style `nemoclaw.ts` exports nothing). The new module: - Uses the correct `sandbox exec -n <name> -- sh -c <script>` form. - Uses `resolveOpenshell()` directly and returns an empty list if the binary is missing, rather than relying on the caller's cached helper. Unit tests pin the argv shape (the two regression cases fail on the pre-fix arguments) and cover the happy path, zero-count, non-numeric, throw, and non-Telegram channels. Note for reviewer: the same `sandbox exec <name> …` positional pattern appears at `src/lib/agent-onboard.ts:135,177` and `src/lib/sandbox-state.ts:388`. Those callsites are outside this issue's scope; flagging for a possible follow-up. Signed-off-by: sanketsh4h <[email protected]>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughExtracts Telegram bridge health-check into Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(200,200,255,0.5)
participant Nemoclaw
end
rect rgba(200,255,200,0.5)
participant Resolver as resolveOpenshell
end
rect rgba(255,200,200,0.5)
participant Openshell as spawnSync / openshell
end
Nemoclaw->>Resolver: request openshell path
Resolver-->>Nemoclaw: binaryPath or null
alt binaryPath present and channels include "telegram"
Nemoclaw->>Openshell: spawnSync(['sandbox','exec','-n', name,'--','sh','-c', <tail/grep>], timeout:3s)
Openshell-->>Nemoclaw: { stdout, status, error }
Nemoclaw->>Nemoclaw: parse stdout -> integer count
alt count > 0
Nemoclaw-->>Nemoclaw: return [{ channel: "telegram", conflicts: count }]
else
Nemoclaw-->>Nemoclaw: return []
end
else
Nemoclaw-->>Nemoclaw: return []
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/nemoclaw.ts (1)
1092-1095:⚠️ Potential issue | 🟡 MinorUse the fixed
sandbox execargv shape inreadGatewayLogtoo.Now that the new probe can surface degraded Telegram bridges again, this helper becomes the next step in that status flow. It still calls
openshell sandbox execasexec <name> sh -c ..., so affected OpenShell builds will exit 127 here and silently suppress the Hermes gateway-log tail.Suggested fix
const result = spawnSync( getOpenshellBinary(), - ["sandbox", "exec", sandboxName, "sh", "-c", "tail -n 10 /tmp/gateway.log 2>/dev/null"], + [ + "sandbox", + "exec", + "-n", + sandboxName, + "--", + "sh", + "-c", + "tail -n 10 /tmp/gateway.log 2>/dev/null", + ], { encoding: "utf-8", timeout: 3000, stdio: ["ignore", "pipe", "pipe"] }, );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nemoclaw.ts` around lines 1092 - 1095, The spawnSync call in readGatewayLog uses the old argv order for openshell sandbox exec; change the argument vector to pass the sandbox name before the "--" separator so the command is executed correctly. Update the spawnSync invocation that constructs the args array (currently ["sandbox","exec", sandboxName, "sh","-c", "tail -n 10 /tmp/gateway.log 2>/dev/null"]) to use ["sandbox","exec", sandboxName, "--", "sh","-c", "tail -n 10 /tmp/gateway.log 2>/dev/null"] so openshell runs the shell command instead of returning 127.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/messaging-bridge-health.ts`:
- Around line 40-47: The code parses spawnSync's stdout without checking for
spawn/timeout failures; update the block around spawnSync(binary, args, ...) to
first inspect the returned result (the result variable) for errors or
non-successful termination (check result.error, result.status !== 0, or
result.signal/timeout conditions) and bail out (return []) or handle the error
before attempting Number.parseInt on result.stdout; include stderr or
result.error in any logs to aid debugging and only parse stdout when
result.status === 0 and result.error is undefined.
---
Outside diff comments:
In `@src/nemoclaw.ts`:
- Around line 1092-1095: The spawnSync call in readGatewayLog uses the old argv
order for openshell sandbox exec; change the argument vector to pass the sandbox
name before the "--" separator so the command is executed correctly. Update the
spawnSync invocation that constructs the args array (currently
["sandbox","exec", sandboxName, "sh","-c", "tail -n 10 /tmp/gateway.log
2>/dev/null"]) to use ["sandbox","exec", sandboxName, "--", "sh","-c", "tail -n
10 /tmp/gateway.log 2>/dev/null"] so openshell runs the shell command instead of
returning 127.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 8ac7347a-8af4-47b2-b6f8-ddc698960641
📒 Files selected for processing (3)
src/lib/messaging-bridge-health.test.tssrc/lib/messaging-bridge-health.tssrc/nemoclaw.ts
…guard spawnSync errors - Apply the same `sandbox exec -n <name> -- sh -c …` fix to `readGatewayLog` in `src/nemoclaw.ts` (same positional-arg bug as `checkMessagingBridgeHealth`, spotted in review). - Check `result.error` and `result.status !== 0` before parsing `spawnSync` stdout in `messaging-bridge-health.ts`; an exec failure (exit 127, timeout SIGTERM) produced a non-zero status that the previous code did not guard against. - Add two new unit tests covering the non-zero exit and timeout/error cases (11 tests total, all pass). Signed-off-by: sanketsh4h <[email protected]>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/nemoclaw.ts (1)
1092-1099:⚠️ Potential issue | 🟠 MajorGuard
readGatewayLogon spawn failure/non-zero exit before returning stdout.The current implementation returns stdout even when the
spawnSynccall fails. Node.jsspawnSyncdoes not throw on non-zero exit codes—it only throws on exceptions. The try-catch block will not catch exit failures, allowing misleading log output to be returned instead ofnull.Suggested fix
try { const result = spawnSync( getOpenshellBinary(), ["sandbox", "exec", "-n", sandboxName, "--", "sh", "-c", "tail -n 10 /tmp/gateway.log 2>/dev/null"], { encoding: "utf-8", timeout: 3000, stdio: ["ignore", "pipe", "pipe"] }, ); + if (result.error || result.status !== 0) return null; const output = (result.stdout || "").trim(); return output || null; } catch { return null; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nemoclaw.ts` around lines 1092 - 1099, The readGatewayLog function currently returns result.stdout even when spawnSync failed or exited non-zero; update the code after calling spawnSync (and before using result.stdout) to check result.error (exception), result.status !== 0, or result.signal and treat those as failures by logging/ignoring and returning null; reference the spawnSync call and the local result variable (check result.error, result.status, result.stderr) and only return (result.stdout || "").trim() when the call succeeded (status === 0 and no error), otherwise return null.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/nemoclaw.ts`:
- Around line 1092-1099: The readGatewayLog function currently returns
result.stdout even when spawnSync failed or exited non-zero; update the code
after calling spawnSync (and before using result.stdout) to check result.error
(exception), result.status !== 0, or result.signal and treat those as failures
by logging/ignoring and returning null; reference the spawnSync call and the
local result variable (check result.error, result.status, result.stderr) and
only return (result.stdout || "").trim() when the call succeeded (status === 0
and no error), otherwise return null.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: acba7a42-a3a0-4055-9e39-83d15ef2b27f
📒 Files selected for processing (3)
src/lib/messaging-bridge-health.test.tssrc/lib/messaging-bridge-health.tssrc/nemoclaw.ts
spawnSync does not throw on non-zero exit; without checking result.error and result.status, stdout from a failed exec could be returned as log content. Match the guard already applied to checkMessagingBridgeHealth. Signed-off-by: sanketsh4h <[email protected]>
Introduce mockSpawn() to replace repeated full spawnSync result literals in the test suite and trim the module header comment. Behavior unchanged; all 11 tests still pass. Signed-off-by: sanketsh4h <[email protected]>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/lib/messaging-bridge-health.ts (1)
7-10: Consider reusing a single bridge-health result type across modules.
BridgeConflictduplicates the shape ofMessagingBridgeHealthinsrc/lib/inventory-commands.ts. Consolidating to one shared type would reduce drift risk and keep contracts consistent.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/messaging-bridge-health.ts` around lines 7 - 10, The BridgeConflict interface duplicates the existing MessagingBridgeHealth shape; replace the local BridgeConflict definition with a reuse/import of the single shared type (MessagingBridgeHealth) and update any references to BridgeConflict to refer to MessagingBridgeHealth (or re-export MessagingBridgeHealth under the BridgeConflict name if needed for compatibility). Locate the declaration of BridgeConflict in this file and remove it, add an import statement for MessagingBridgeHealth, and adjust exported types/consumers accordingly so only the shared MessagingBridgeHealth type is used across modules.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/lib/messaging-bridge-health.ts`:
- Around line 7-10: The BridgeConflict interface duplicates the existing
MessagingBridgeHealth shape; replace the local BridgeConflict definition with a
reuse/import of the single shared type (MessagingBridgeHealth) and update any
references to BridgeConflict to refer to MessagingBridgeHealth (or re-export
MessagingBridgeHealth under the BridgeConflict name if needed for
compatibility). Locate the declaration of BridgeConflict in this file and remove
it, add an import statement for MessagingBridgeHealth, and adjust exported
types/consumers accordingly so only the shared MessagingBridgeHealth type is
used across modules.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 6fac6bf3-55fe-4153-99b5-c32064e79560
📒 Files selected for processing (2)
src/lib/messaging-bridge-health.test.tssrc/lib/messaging-bridge-health.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/lib/messaging-bridge-health.test.ts
…dgeConflict Drop the local BridgeConflict interface and import MessagingBridgeHealth from inventory-commands, which already defines the same shape and is the type the consumer expects. Eliminates drift risk between the two. Signed-off-by: sanketsh4h <[email protected]>
|
✨ Thanks for submitting this PR that proposes a fix to surface Telegram 409 conflicts by passing sandbox name via --name flag, which could help resolve issues with the Telegram integration. Possibly related open issues: |
|
Hi William 👋🏽. Thank you so much for looking at my PR! Please let me know if any improvements/edits are needed before it can be merged. |
|
Thanks @sanketsh4h for the contribution and for identifying the broken Telegram bridge-health exec path. We landed the maintained fix for #2018 in #2037, and current main now uses the named sandbox exec form for the status bridge-health probe and gateway-log tail with regression coverage. Closing this PR as superseded by #2037. |
Summary
checkMessagingBridgeHealth()invokedopenshell sandbox execwith the sandbox name as a positional argument.openshell sandbox execrequires--name/-n— the positional was parsed as the first word of the command and exited 127. The function's catch-all swallowed the error, so the "degraded" warning for Telegram 409 conflicts never appeared innemoclaw <sandbox> status.Related Issue
Fixes #2018
Changes
checkMessagingBridgeHealthfrom the IIFE-stylesrc/nemoclaw.tsinto a newsrc/lib/messaging-bridge-health.tsso the argv shape is unit-testable.openshell sandbox exec -n <name> -- sh -c <script>form.resolveOpenshell()directly and return an empty list if the binary is missing (matches the pattern insandbox-state.ts/sandbox-version.ts), instead of relying on the caller's cached helper that hard-exits.src/lib/messaging-bridge-health.test.tswith 9 unit tests: happy path, zero-count, non-numeric stdout,spawnSyncthrows, non-Telegram channels, null/undefined channels, missing binary, and two regression tests that pin the argv shape (both fail on the pre-fix arguments).Note for reviewer
The same
sandbox exec <name> …positional pattern appears in a few other places:src/lib/agent-onboard.ts:135,177— agent readiness probessrc/lib/sandbox-state.ts:388— post-restorechownduring rebuildThose are outside this issue's scope; flagging in case you want a follow-up.
Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)AI Disclosure
Signed-off-by: sanketsh4h [email protected]
Summary by CodeRabbit
New Features
Refactor
Tests