fix: add rootless DinD daemon.json path fallback in preflight#2045
fix: add rootless DinD daemon.json path fallback in preflight#2045
Conversation
Squashed PR #2044 as base for socket proxy work. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Workspace no longer has direct access to the Docker socket. A tecnativa/docker-socket-proxy sidecar filters API calls — only container, image, network, volume, info, and version endpoints are allowed. Exec and build are explicitly denied. The proxy runs as nobody (UID 65534) with GID 1000 (matching fsGroup) so it can connect to the rootless DinD socket without running as root. The socket volume is mounted at a dedicated /docker-socket path with readOnly to avoid conflicts with HAProxy runtime files. Also extends the Docker readiness timeout from 60s to 90s to account for rootless DinD + proxy startup, updates the README architecture diagram, and adds security test coverage for the proxy hardening and socket isolation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThe Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
The CI legacy-path guard blocks edits to test/*.test.js files. Move the new proxy assertions to a .ts file and restore the .js to its original content so the cumulative diff shows the .js as unchanged and the .ts as newly added. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (2)
test/security-configuration-hardening.test.js (1)
51-58: Cover the proxy UID/GID contract too.The proxy's ability to open the rootless socket depends on
runAsUser: 65534andrunAsGroup: 1000. If either regresses, this deployment can break while the hardening test still stays green.Suggested assertions
expect(proxySection).toMatch(/allowPrivilegeEscalation:\s*false/); expect(proxySection).toMatch(/capabilities:\s*[\r\n]+\s*drop:\s*[\r\n]+\s*-\s*ALL/); expect(proxySection).toMatch(/seccompProfile:\s*[\r\n]+\s*type:\s*RuntimeDefault/); expect(proxySection).toMatch(/runAsNonRoot:\s*true/); + expect(proxySection).toMatch(/runAsUser:\s*65534/); + expect(proxySection).toMatch(/runAsGroup:\s*1000/);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/security-configuration-hardening.test.js` around lines 51 - 58, Add assertions to the proxy-section hardening test to verify the UID/GID contract by checking that proxySection contains the expected numeric user and group; specifically, update the test that references proxySection in test/security-configuration-hardening.test.js to include two additional expectations that match runAsUser: 65534 and runAsGroup: 1000 (use regex patterns similar to the existing assertions to tolerate whitespace).k8s/nemoclaw-k8s.yaml (1)
213-223: Hardeninit-docker-configthe same way as the rest of the pod.This init container only writes one file into a mounted volume, so leaving it with the default writable root filesystem and default capabilities is looser than it needs to be.
Low-risk hardening
- name: init-docker-config image: busybox + securityContext: + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + capabilities: + drop: + - ALL command: - sh - -c🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@k8s/nemoclaw-k8s.yaml` around lines 213 - 223, The init container init-docker-config should be hardened: add a securityContext to the container (in the init container spec for name: init-docker-config) that enforces runAsNonRoot: true with a non-root runAsUser (e.g. 1000), set allowPrivilegeEscalation: false, drop all capabilities (capabilities: { drop: ["ALL"] }), enable a seccompProfile (type: RuntimeDefault) and set readOnlyRootFilesystem: true while keeping the /config/docker volumeMount writable so the single file write still works; this tightens privileges without changing the current behavior of writing daemon.json to the mounted volume.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@k8s/nemoclaw-k8s.yaml`:
- Around line 67-80: The docker-proxy sidecar (container name "docker-proxy",
image tecnativa/docker-socket-proxy:0.6.0) exposes port 2375 on the pod network,
making the filtered Docker API reachable from other pods; either remove the
ports: - containerPort: 2375 stanza to avoid advertising the TCP port or,
preferably, add a NetworkPolicy that only allows ingress to pods with this
sidecar from the specific "workspace" pod/label/namespace (deny all other pod
ingress), referencing the pod label used by the deployment; update the manifest
to include that NetworkPolicy (or tighten the container's listen address to
localhost via docker-socket-proxy config if available) so only the intended pod
can reach port 2375.
- Around line 213-220: readDockerDefaultCgroupnsMode() currently only reads
/etc/docker/daemon.json and misses rootless Docker installs; update that
function to attempt the rootless path
(/home/rootless/.config/docker/daemon.json) in addition to the system path,
falling back gracefully if one is missing, and return the parsed
default-cgroupns-mode when found (ensure you still parse JSON and handle
read/parse errors the same way as the existing /etc/docker handling). Use the
existing function name readDockerDefaultCgroupnsMode and keep the same return
shape/behavior so callers remain unchanged.
In `@k8s/README.md`:
- Around line 214-215: Update the startup timing sentence in k8s/README.md to
match the manifest's retry policy: replace "Usually resolves after 30-60
seconds" with a statement that the workspace may take up to 90 seconds to start
(45 retries × 2s), and keep the note that rootless DinD can take slightly longer
due to user namespace setup so users don't debug prematurely.
- Around line 1-3: Add the required SPDX license header as the very first line
of the file (e.g., an SPDX-License-Identifier line) and remove the emoji from
the experimental callout; locate the "NemoClaw on Kubernetes" title and the
following blockquote starting with "⚠️ Experimental" and replace that blockquote
text with the same content but without the emoji and with the SPDX header
inserted above the title.
- Around line 167-178: The architecture ASCII diagram contains "openshell CLI"
with incorrect casing; update that token to the documented product casing
"OpenShell CLI" wherever it appears in the diagram (look for the string
"openshell CLI" inside the k8s/README.md diagram block) to comply with the
naming guideline for OpenShell/OpenClaw/NemoClaw.
In `@test/security-configuration-hardening.test.js`:
- Around line 72-74: The test is too loose by only checking for any tcp:// host
in workspaceSection; update the assertion so it requires the exact local
endpoint tcp://localhost:2375 (reference the workspaceSection variable and the
expect(...).toMatch call) and keep the existing negative unix: check;
specifically replace the loose regex that matches tcp://... with one that
matches value:\s*tcp:\/\/localhost:2375 so the test fails if DOCKER_HOST is set
to any other TCP target.
---
Nitpick comments:
In `@k8s/nemoclaw-k8s.yaml`:
- Around line 213-223: The init container init-docker-config should be hardened:
add a securityContext to the container (in the init container spec for name:
init-docker-config) that enforces runAsNonRoot: true with a non-root runAsUser
(e.g. 1000), set allowPrivilegeEscalation: false, drop all capabilities
(capabilities: { drop: ["ALL"] }), enable a seccompProfile (type:
RuntimeDefault) and set readOnlyRootFilesystem: true while keeping the
/config/docker volumeMount writable so the single file write still works; this
tightens privileges without changing the current behavior of writing daemon.json
to the mounted volume.
In `@test/security-configuration-hardening.test.js`:
- Around line 51-58: Add assertions to the proxy-section hardening test to
verify the UID/GID contract by checking that proxySection contains the expected
numeric user and group; specifically, update the test that references
proxySection in test/security-configuration-hardening.test.js to include two
additional expectations that match runAsUser: 65534 and runAsGroup: 1000 (use
regex patterns similar to the existing assertions to tolerate whitespace).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: c6c05082-8946-4683-8d64-93feb32feb65
📒 Files selected for processing (4)
k8s/README.mdk8s/nemoclaw-k8s.yamlsrc/lib/onboard.tstest/security-configuration-hardening.test.js
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/security-configuration-hardening.test.ts (1)
14-16: Harden workspace section extraction regex to avoid nested-match truncation.Line 15 currently stops on any
- name:line, including nested entries (for example env vars), which can truncateworkspaceSectionunexpectedly as the manifest evolves.♻️ Proposed fix
- /- name: workspace[\s\S]*?(?=\n\s*-\s*name: |\n\s*initContainers:|\n\s*volumes:|$)/, + /- name: workspace[\s\S]*?(?=\n\s{4}-\s*name: |\n\s*initContainers:|\n\s*volumes:|$)/,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/security-configuration-hardening.test.ts` around lines 14 - 16, The workspace extraction regex can prematurely stop on nested "- name:" lines; update the pattern used for workspaceMatch to capture the leading indentation and use that indentation in the lookahead so only same-level siblings terminate the match (e.g. change the regex to capture an indentation group like /(^\s*)- name: workspace[\s\S]*?(?=\n\1-\s*name: |\n\1initContainers:|\n\1volumes:|$)/m), then rebuild workspaceSection from workspaceMatch as before (referencing the workspaceMatch and workspaceSection variables).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/security-configuration-hardening.test.ts`:
- Around line 14-16: The workspace extraction regex can prematurely stop on
nested "- name:" lines; update the pattern used for workspaceMatch to capture
the leading indentation and use that indentation in the lookahead so only
same-level siblings terminate the match (e.g. change the regex to capture an
indentation group like /(^\s*)- name: workspace[\s\S]*?(?=\n\1-\s*name:
|\n\1initContainers:|\n\1volumes:|$)/m), then rebuild workspaceSection from
workspaceMatch as before (referencing the workspaceMatch and workspaceSection
variables).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: d11e9a50-ff7f-4897-96f0-8adfcb32318a
📒 Files selected for processing (1)
test/security-configuration-hardening.test.ts
The TypeScript strict null checks flagged `workspaceMatch` and `proxyMatch` as possibly null despite the preceding `toBeNull()` guard. Add `!` post-fix assertions since Vitest's `expect().not.toBeNull()` narrows at runtime but not at the type level. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add readOnlyRootFilesystem to docker-proxy and init-docker-config containers - Remove containerPort declaration from docker-proxy to avoid advertising 2375 on the pod IP - Support rootless Docker daemon.json path in readDockerDefaultCgroupnsMode() - Add SPDX header and remove emoji from k8s README - Fix OpenShell CLI casing in architecture diagram - Update startup timing to match 90s retry policy (45 retries x 2s) - Tighten test assertion to require exact tcp://localhost:2375 endpoint Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
k8s/nemoclaw-k8s.yaml (1)
65-122:⚠️ Potential issue | 🟠 Major
docker-proxyis still reachable from other pods unless ingress is explicitly restricted.Removing
containerPorthides advertisement, but it does not block traffic to Pod IP:2375. Please add aNetworkPolicy(deny-all ingress, or allow only explicitly needed peers) forapp: nemoclaw.🔐 Suggested hardening (deny all ingress to this pod)
+--- +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: nemoclaw-deny-ingress + namespace: nemoclaw +spec: + podSelector: + matchLabels: + app: nemoclaw + policyTypes: + - Ingress + ingress: []🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@k8s/nemoclaw-k8s.yaml` around lines 65 - 122, The docker-proxy sidecar (container name "docker-proxy") remains reachable on PodIP:2375 from other pods; add a Kubernetes NetworkPolicy that selects pods with label "app: nemoclaw" and denies all ingress by default (or alternatively, explicitly allow only required peers/namespaces) to prevent pod-to-pod access to the proxy; implement this by creating a NetworkPolicy resource that either has an empty ingress list (deny all) or a minimal ingress rule permitting known peers, and ensure its podSelector matches app: nemoclaw so the docker-proxy pod is protected.
🧹 Nitpick comments (1)
test/security-configuration-hardening.test.ts (1)
73-75: TightenDOCKER_HOSTassertion to enforce exact endpoint value.Current regex matches substrings; anchoring to the full value line avoids accidental passes.
🎯 Suggested fix
- expect(workspaceSection).toMatch( - /DOCKER_HOST[\s\S]*?value:\s*tcp:\/\/localhost:2375/, - ); + expect(workspaceSection).toMatch( + /DOCKER_HOST[\s\S]*?value:\s*["']?tcp:\/\/localhost:2375["']?(?:\r?\n|$)/, + );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/security-configuration-hardening.test.ts` around lines 73 - 75, The current expectation in security-configuration-hardening.test.ts uses a loose regex for workspaceSection and DOCKER_HOST that can match substrings; tighten it so the value line is matched exactly by updating the regex in the expect(workspaceSection).toMatch(...) call to anchor the value line (use multiline mode) e.g. ensure the pattern includes ^\s*value:\s*tcp:\/\/localhost:2375\s*$ with the m flag and is applied within the DOCKER_HOST block (reference workspaceSection and the DOCKER_HOST value line) so only an exact endpoint value passes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/security-configuration-hardening.test.ts`:
- Around line 15-16: The workspace-section regex is too permissive: update the
lookahead in the regex literal currently matching "/- name:
workspace[\s\S]*?(?=\n\s*-\s*name: |\n\s*initContainers:|\n\s*volumes:|$)/" to
use the container-level indentation boundary (\s{4}) so nested "- name:" entries
don't prematurely stop the match; e.g., change the lookahead to
"(?=\n\s{4}-\s*name: |\n\s{4}initContainers:|\n\s{4}volumes:|$)". Ensure the
change is applied to the same regex used in the test (the /- name:
workspace[...].../ literal).
---
Duplicate comments:
In `@k8s/nemoclaw-k8s.yaml`:
- Around line 65-122: The docker-proxy sidecar (container name "docker-proxy")
remains reachable on PodIP:2375 from other pods; add a Kubernetes NetworkPolicy
that selects pods with label "app: nemoclaw" and denies all ingress by default
(or alternatively, explicitly allow only required peers/namespaces) to prevent
pod-to-pod access to the proxy; implement this by creating a NetworkPolicy
resource that either has an empty ingress list (deny all) or a minimal ingress
rule permitting known peers, and ensure its podSelector matches app: nemoclaw so
the docker-proxy pod is protected.
---
Nitpick comments:
In `@test/security-configuration-hardening.test.ts`:
- Around line 73-75: The current expectation in
security-configuration-hardening.test.ts uses a loose regex for workspaceSection
and DOCKER_HOST that can match substrings; tighten it so the value line is
matched exactly by updating the regex in the
expect(workspaceSection).toMatch(...) call to anchor the value line (use
multiline mode) e.g. ensure the pattern includes
^\s*value:\s*tcp:\/\/localhost:2375\s*$ with the m flag and is applied within
the DOCKER_HOST block (reference workspaceSection and the DOCKER_HOST value
line) so only an exact endpoint value passes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 3b315c59-2b85-4560-b240-6ddf00c5cb14
📒 Files selected for processing (4)
k8s/README.mdk8s/nemoclaw-k8s.yamlsrc/lib/preflight.tstest/security-configuration-hardening.test.ts
✅ Files skipped from review due to trivial changes (1)
- src/lib/preflight.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- k8s/README.md
Take main's String() wrapper over PR's .toString() — both fix the same stdout coercion and main's landed first.
# Conflicts: # k8s/nemoclaw-k8s.yaml
…tion
Use \s{4} instead of \s* in the workspace section lookahead to avoid
matching nested `- name:` entries, consistent with the proxy test regex.
Resolves CodeRabbit review feedback.
Set BIND_CONFIG=127.0.0.1:2375 so the proxy is unreachable from other pods in the cluster, closing off the last network path to the Docker API.
Keep k8s/README.md and k8s/nemoclaw-k8s.yaml from this branch (the docker socket proxy changes). Accept all other deletions from #2107 (installer-hash-check workflow, check-installer-hash.sh, AGENTS.md k8s row, .test.js file).
The test asserted against k8s/nemoclaw-k8s.yaml which was removed in #2107. The remaining change is the rootless DinD daemon.json path fallback in preflight.ts.
Summary
After removing the experimental k8s manifests in #2107, this PR retains the one non-k8s change from the original docker-socket-proxy work:
readDockerDefaultCgroupnsMode()inpreflight.tsnow checks both the standard/etc/docker/daemon.jsonand the rootless DinD path/home/rootless/.config/docker/daemon.jsonwhen reading the default cgroupns mode.Previously only
/etc/docker/daemon.jsonwas checked, so rootless DinD setups would always report"unknown"for cgroupns mode.Changes
src/lib/preflight.ts:readDockerDefaultCgroupnsMode()iterates over both daemon.json paths, returning the first valid mode found.Test plan
npm run typecheck:clipassesnpm testpassesSummary by CodeRabbit