composefs: Support transient /etc, transient root, and volatile /var #2201
cgwalters wants to merge 6 commits into
Code Review
This pull request introduces support for transient and volatile configurations for the root filesystem, /etc, and /var when using the composefs backend. Key changes include the addition of a setup-root-conf.toml configuration file, a new bootc-early-overlay-relabel.service to handle SELinux labels on transient overlays, and updated initramfs logic to support these mount types. The PR also adds comprehensive documentation and new test suites for these features. Feedback suggests explicitly importing libc or using rustix constants in the generator to ensure better portability and clarity.
9ec1a58 to 73ae51e
I didn't exclude centos-9 here, which then got us into #1812
…eged processes

The overlayfs merged view inherits its root permissions from the upperdir. When upper/ was created with 0700 (the same mode passed for work/), the merged / appeared as drwx------ to all non-root processes, causing dbus, systemd units that drop privileges, and anything using DAC to fail with EACCES immediately after switch-root.

Fix: create upper/ with 0755 so the merged root is world-traversable. work/ remains 0700: it is kernel-internal and never exposed in the merged view, so tighter permissions there are harmless.

This mirrors what systemd does in volatile-root.c and nspawn-mount.c, and fixes the issue reported in composefs-rs#287.

Assisted-by: OpenCode (claude-sonnet-4-6@default)
Signed-off-by: Colin Walters <walters@verbum.org>
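The mode split described in this commit can be sketched as follows. This is a minimal illustration, not the code from the PR: `create_overlay_dirs` and `make_dir` are hypothetical names, and the sketch assumes the caller has already prepared the tmpfs directory passed as `base`.

```rust
use std::fs::{self, DirBuilder, Permissions};
use std::io;
use std::os::unix::fs::{DirBuilderExt, PermissionsExt};
use std::path::{Path, PathBuf};

/// Hypothetical helper: create `upper/` and `work/` under `base`
/// with the modes described in the commit message.
pub fn create_overlay_dirs(base: &Path) -> io::Result<(PathBuf, PathBuf)> {
    let upper = base.join("upper");
    let work = base.join("work");
    // The merged root takes its mode from upper/, so it must be
    // traversable by unprivileged processes.
    make_dir(&upper, 0o755)?;
    // work/ is only used by the kernel and never shows up in the
    // merged view, so it can stay private.
    make_dir(&work, 0o700)?;
    Ok((upper, work))
}

/// Create a directory and force its final mode.
fn make_dir(path: &Path, mode: u32) -> io::Result<()> {
    DirBuilder::new().mode(mode).create(path)?;
    // mkdir(2) applies the process umask, so set the mode explicitly.
    fs::set_permissions(path, Permissions::from_mode(mode))
}
```

Setting the permissions a second time is a defensive choice in this sketch: it keeps the result independent of whatever umask the calling process happens to have.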
Image authors who ship /usr/lib/composefs/setup-root-conf.toml to configure composefs mount behaviour (e.g. transient /etc) previously had to add explicit --include flags to every dracut invocation in their Containerfile.

Teach module-setup.sh to install the file automatically when present, mirroring what the composefs-rs dracut modules do.

Use '[[ -e ]] && inst_simple' rather than inst_if_exists: the latter is not always available when dracut is invoked explicitly with --force in a Containerfile RUN layer (outside of kernel-install's dracut wrapper).

Assisted-by: OpenCode (claude-sonnet-4-6@default)
Signed-off-by: Colin Walters <walters@verbum.org>
overlay_transient() now returns a detached fsmount fd rather than immediately attaching it, letting the caller decide where to place the overlay. This is a correctness fix: on pre-6.15 kernels, the old code mounted the overlay then continued using the original composefs dirfd for subsequent submounts, which meant /etc and /var landed in the hidden lower layer rather than the visible merged view.

The overlay source name now embeds the composefs digest as "transient:composefs=<hash>" so that composefs_booted() can extract the digest from the mount source after switch-root, the same way it does for the normal "composefs:<hash>" source.

overlay_state() also loses its unused _mode parameter.

Assisted-by: OpenCode (claude-sonnet-4-6@default)
Signed-off-by: Colin Walters <walters@verbum.org>
When root.transient = true, bootc-root-setup wraps the composefs lower in an overlayfs whose source is "transient:composefs=<hash>" rather than "composefs:<hash>". Handle both prefixes uniformly so that composefs_booted() works correctly on transient root boots and soft-reboots are detected the same way in both cases.

Assisted-by: OpenCode (Claude Sonnet 4.6)
Signed-off-by: Colin Walters <walters@verbum.org>
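Handling both source prefixes uniformly could look roughly like the sketch below. The two prefix strings come from the commit messages; the function name `composefs_digest` and the decision to reject an empty digest are assumptions made for illustration, not the actual composefs_booted() implementation.

```rust
/// Prefix of the root mount source on a regular composefs boot.
const PLAIN_PREFIX: &str = "composefs:";
/// Prefix of the root mount source when the root is a transient overlay.
const TRANSIENT_PREFIX: &str = "transient:composefs=";

/// Hypothetical helper: return the digest embedded in a root mount
/// source, or None if the source is not one of the two known forms.
pub fn composefs_digest(source: &str) -> Option<&str> {
    let digest = source
        .strip_prefix(TRANSIENT_PREFIX)
        .or_else(|| source.strip_prefix(PLAIN_PREFIX))?;
    // Assumption: an empty digest is treated as "not a composefs boot".
    if digest.is_empty() {
        None
    } else {
        Some(digest)
    }
}
```

With this shape, callers do not need to know whether the root is transient: `composefs_digest("composefs:<hash>")` and `composefs_digest("transient:composefs=<hash>")` both yield the same `<hash>`.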
…nux fix

Transient overlays (/) inherit tmpfs_t from the upper dir's tmpfs via fs_use_trans at SELinux policy-load time. Add a generator-emitted oneshot unit, bootc-early-overlay-relabel.service, that runs 'bootc internals relabel-overlay-mountpoints' before sysinit.target to restore the correct label on each writable overlayfs mount point.

Two detection paths, both needed because the generator runs before local-fs.target:

- Root writability: inspect the mount source for the "transient:composefs=" prefix to detect a transient root overlay.
- Subdir mounts (/etc): bootc-root-setup.service mounts these after the generator, so we read setup-root-conf.toml directly from the booted image to know whether /etc will be a transient overlay.

The detection block runs before the OSTREE_BOOTED guard: native composefs boots do not write /run/ostree-booted, but still need the relabel unit.

relabel_overlay_mountpoints() checks both OVERLAYFS_SUPER_MAGIC and !RDONLY to distinguish writable transient overlays from the read-only composefs root (both are overlayfs, only the former needs relabelling).

Assisted-by: OpenCode (claude-sonnet-4-6@default)
Signed-off-by: Colin Walters <walters@verbum.org>
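The "overlayfs and not read-only" test can be expressed as a small pure function. The sketch below is an illustration only: `needs_relabel` is a hypothetical name, and it assumes the caller has already obtained the filesystem type and mount flags for the mount point (for example from statfs(2)). The two constants carry the values defined by Linux.

```rust
/// Filesystem magic that statfs(2) reports for overlayfs (linux/magic.h).
pub const OVERLAYFS_SUPER_MAGIC: i64 = 0x794c_7630;
/// Read-only bit in the mount flags reported by statfs(2)/statvfs(3).
pub const ST_RDONLY: u64 = 1;

/// Hypothetical predicate: true only for a writable overlayfs mount,
/// i.e. a transient overlay rather than the read-only composefs root.
pub fn needs_relabel(fs_magic: i64, mount_flags: u64) -> bool {
    fs_magic == OVERLAYFS_SUPER_MAGIC && (mount_flags & ST_RDONLY) == 0
}
```

Both conditions are needed because the composefs root is itself an overlayfs mount; the magic number alone would also match it.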
Add TOML configuration (setup-root-conf.toml) for composefs mount behaviour:

- [root] transient = true: wrap the composefs in a tmpfs overlay; all writes are discarded on reboot.
- [etc] mount = transient|overlay|bind|none: control how /etc is mounted from the deployment state directory.
- [var] mount = none|bind: control whether /var is bind-mounted from state. When mount = none, /var is left as an empty composefs directory.

bootc-root-setup also detects the systemd.volatile=state kernel argument at boot time and automatically skips the /var state bind-mount when it is set, leaving /var empty for systemd-fstab-generator to mount a fresh tmpfs there at local-fs.target. This is the recommended way to get an ephemeral /var: it uses a plain tmpfs rather than overlayfs, which is compatible with tools like podman that use overlayfs under /var/lib/containers.

Add inject-baseconfig CI helper, a test-baseconfigs CI job, and a 040-test-baseconfigs.nu integration test that boots each configuration in a VM and validates filesystem types, writability, SELinux labels, and podman graph driver compatibility.

Assisted-by: OpenCode (claude-sonnet-4-6@default)
Signed-off-by: Colin Walters <walters@verbum.org>
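Detecting the kernel argument might be sketched as below. This is a simplified illustration rather than the logic in bootc-root-setup: `volatile_state_requested` is a hypothetical name, the input is assumed to be the contents of /proc/cmdline, and the sketch deliberately ignores quoted arguments and any precedence rules for repeated arguments.

```rust
/// Hypothetical helper: report whether the kernel command line asks
/// for a volatile /var via systemd.volatile=state.
///
/// Simplification: arguments are split on ASCII whitespace only, so
/// quoted values containing spaces are not handled.
pub fn volatile_state_requested(cmdline: &str) -> bool {
    cmdline
        .split_ascii_whitespace()
        .any(|arg| arg == "systemd.volatile=state")
}
```

A caller would skip the /var state bind-mount when this returns true and leave the tmpfs mount to systemd-fstab-generator, as the commit message describes.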
73ae51e to 188b055
IFS=',' read -ra TOKENS <<< "${BASECONFIGS}"
for raw_token in "${TOKENS[@]}"; do
  # Trim leading/trailing spaces
  token="${raw_token#"${raw_token%%[![:space:]]*}"}"
I feel like we should probably use Python scripts for stuff like this.
Yeah, or Rust (could do some xtask-style stuff here).
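For comparison, the token splitting and trimming from the shell excerpt could be written in Rust roughly as follows. This is a sketch only: `parse_baseconfigs` is a hypothetical name, and dropping empty tokens is an assumption that the shell excerpt does not confirm.

```rust
/// Hypothetical helper: split a comma-separated BASECONFIGS value
/// into trimmed tokens.
pub fn parse_baseconfigs(raw: &str) -> Vec<&str> {
    raw.split(',')
        // Trim leading and trailing whitespace from each token.
        .map(str::trim)
        // Assumption: empty tokens (e.g. from "a,,b") are ignored.
        .filter(|token| !token.is_empty())
        .collect()
}
```

The whole trimming step collapses into `str::trim`, which is the main readability gain over the nested parameter expansions in the shell version.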
While working on the sealed images, I think having /etc transient is definitely something we want to encourage. I hit composefs/composefs-rs#287 right away, and this fixes that.

For other cases, we'll want to also support a transient /, and to do that is a lot more complicated because of SELinux; right now we need a custom early overlay service.

For /var I don't think we want to encourage use of overlayfs because it breaks things like podman/docker and any usage of overlayfs in general there. So we now document (and better support) systemd.volatile=state for this.