Replace shell based linux sandbox setup with go code#107
Conversation
This provents errors, if the shell environment is very restricted. This mostly happens by accident if multicall binaries (coreutils, uutils/coreutils or busybox) lead to overblocking because some of the shared binaries are blocked and therefore block all of them.
|
yes, this fixes #90. |
ed949cc to
b796bd8
Compare
b796bd8 to
3bc7ea7
Compare
|
At least this needs the removal of the old code path to be merged. |
There was a problem hiding this comment.
5 issues found across 11 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="cmd/fence/main.go">
<violation number="1" location="cmd/fence/main.go:685">
P2: Reuse the already-resolved exec path instead of calling `exec.LookPath` a second time after Landlock is applied. The first call resolves the path before Landlock restrictions are active; the second call repeats the lookup under Landlock, where directory traversal is more constrained. For non-standard locations (e.g., `/tmp/fence/bin/shell` as mentioned in the comment), this could fail if `LookPath` needs to stat directories not covered by the read rules. Hoisting the resolution out of the Linux-only block avoids both the redundancy and the potential failure.</violation>
</file>
<file name="cmd/fence/linux_bootstrap.go">
<violation number="1" location="cmd/fence/linux_bootstrap.go:167">
P1: Socket paths are not passed to `ApplyLandlockFromConfigWithExec` — `nil` is used instead of the actual socket paths. After Landlock is applied, the bridge goroutines still need to `Dial` those Unix sockets for each new connection. If the sockets aren't under an already-allowed directory (like `/tmp`), Landlock will block those dials and break proxy connectivity.
The `socketPaths` from `startBridgesAndSetEnv` should be threaded through to `applyLandlock` and passed as the third argument.</violation>
<violation number="2" location="cmd/fence/linux_bootstrap.go:356">
P2: When one `io.Copy` direction finishes, the function returns and both connections are closed, potentially discarding in-flight data from the other direction. For a proxy bridge, this can truncate responses. Consider waiting for both directions (`<-done; <-done`) or using `TCPConn.CloseWrite()` to half-close so the other direction can drain.</violation>
</file>
<file name="cmd/fence/linux_bootstrap_test.go">
<violation number="1" location="cmd/fence/linux_bootstrap_test.go:60">
P3: Avoid hardcoded TCP ports in tests; they can be occupied on CI/dev machines and make the tests flaky. Allocate a free port dynamically (e.g., listen on "127.0.0.1:0" to get a port) and pass that port into the bridge and dial calls.</violation>
</file>
<file name="internal/sandbox/linux.go">
<violation number="1" location="internal/sandbox/linux.go:566">
P1: Both `shellPath` and `fenceExePath` are passed to `--ro-bind` without resolving symlinks via `resolvePathForMount()`. On usr-merged distros where `/bin` → `/usr/bin`, bwrap will fail because the destination path contains symlink components. The shell-based bootstrap avoids this by resolving paths and using fixed destinations under `/tmp/fence/bin/`.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.
jy-tan
left a comment
There was a problem hiding this comment.
thanks for this! i think it's directionally correct, just some comments
| } | ||
| } | ||
|
|
||
| func startBridgesAndSetEnv(ctx context.Context, opts bootstrapOptions) []string { |
There was a problem hiding this comment.
this function leaves out setting NO_PROXY / no_proxy, which is present in original buildLinuxBootstrapScript. this could affect loopback behavior.
There was a problem hiding this comment.
I took a closer look, and I believe NO_PROXY/no_proxy is actually being set in the Go code, but perhaps I am missing something?.
The Go bootstrap sets it here:
fence/cmd/fence/linux_bootstrap.go:155-162
if opts.httpSocket != "" || opts.socksSocket != "" {
if err := setEnvVars(envGroup{
keys: []string{"NO_PROXY", "no_proxy"},
value: "localhost,127.0.0.1",
}); err != nil {
return nil, fatalError(ExitWrapperSetupFailed, "failed to set no_proxy env vars: %v", err)
}
}This matches the original shell script's value:
fence/internal/sandbox/linux.go:695-700
export NO_PROXY=localhost,127.0.0.1
export no_proxy=localhost,127.0.0.1To me this looks like the Go code preserves the same NO_PROXY behavior as the original buildLinuxBootstrapScript.
When I set my agent on this, it also found GenerateProxyEnvVars in utils.go, which generates a more expansive value. As far as I can tell it is currently only used in the Mac Sandbox.
fence/internal/sandbox/utils.go:67-77
noProxy := strings.Join([]string{
"localhost",
"127.0.0.1",
"::1",
"*.local",
".local",
"169.254.0.0/16",
"10.0.0.0/8",
"172.16.0.0/12",
"192.168.0.0/16",
}, ",")This includes ::1 (IPv6 loopback), .local/*.local, and private network ranges. I'm unsure whether these additional entries are appropriate for the Linux bootstrap context though — inside the sandbox, the proxy bridges only listen on 127.0.0.1, so traffic to private networks would still need to traverse the proxy to reach the host. Exempting 10.0.0.0/8 etc. from the proxy could potentially cause connections to fail silently if a sandboxed process tries to reach a private address that isn't actually routable inside the sandbox. On the other hand, ::1 seems like it could be a reasonable addition if the sandbox has IPv6 loopback available.
|
also, tests like that use we should have another true e2e test similar, maybe something like |
… values in the sandbox
…ad of calling os.Exit() This should make it easier to test them.
…ariables share the same value
Now we always mount it in /tmp and execute it from there
This prevents errors, if the shell environment is very restricted. This
mostly happens by accident if multicall binaries (coreutils,
uutils/coreutils or busybox) lead to overblocking because some of the
shared binaries are blocked and therefore block all of them.
Closes #90 - or at least is supposed to. Didn't have time to check that yet.
Also I am not very confident this is the right way to code this yet, which is why I've left the old one in there for now to make it easy to compare and contrast.
Some things we should probably discuss:
I tried to minimize dependencies inside the sandbox, which is why
socatis no longer required in the sandbox. This means that its job is now done by goroutines, which I do not understand at all yet. :) Is this actually a good idea? Are there drawbacks to the way the goroutines handle the traffic forwarding?Code strucuture is... well. it is. :/
This should however suffice to get the conversation going.