5 changes: 5 additions & 0 deletions lib/functions/main/rootfs-image.sh
@@ -13,6 +13,11 @@ function build_rootfs_and_image() {
# get a basic rootfs, either from cache or from scratch
get_or_create_rootfs_cache_chroot_sdcard # only occurrence of this; has its own logging sections

# Cache-hit path also benefits — the extracted rootfs has libc/ld-linux, so kernel
# binfmt_elf can run 32-bit ARM ELF natively. Idempotent on cache-miss path where
# create_new_rootfs_cache_via_debootstrap already activated this.
_native_armhf_setup_binfmt_elf || true

# deploy the qemu binary, no matter where the rootfs came from (built or cached)
LOG_SECTION="deploy_qemu_binary_to_chroot_image" do_with_logging deploy_qemu_binary_to_chroot "${SDCARD}" "image" # undeployed at end of this function

237 changes: 232 additions & 5 deletions lib/functions/rootfs/qemu-static.sh
@@ -17,6 +17,13 @@ function deploy_qemu_binary_to_chroot() {
return 0
fi

# Native armhf path is active: kernel binfmt_elf executes 32-bit ARM ELF via
# CONFIG_COMPAT, no qemu-arm-static needed inside the chroot.
if [[ "${ARMBIAN_NATIVE_ARMHF_VIA_BINFMT_ELF:-no}" == "yes" ]]; then
display_alert "Native armhf via binfmt_elf" "skipping qemu binary deployment during ${caller}" "info"
return 0

P2 Badge Preserve target qemu binaries on native armhf

When native armhf is active, this returns before the existing preservation logic can move a target-owned /usr/bin/${QEMU_BINARY} aside. Later undeploy_qemu_binary_from_chroot still treats any such file as the host copy and removes it, so an armhf image/rootfs that intentionally installs qemu-user-static (for example via package lists or customization) loses that package's binary even though this deploy path never copied it.

P2 Badge Preserve target qemu binaries on native armhf

When native armhf is active, this returns before the existing preservation logic can move a target-owned /usr/bin/${QEMU_BINARY} aside. Later undeploy_qemu_binary_from_chroot still treats any such file as the host copy and removes it, so an armhf image/rootfs that intentionally installs qemu-user-static during image customization loses that package's binary even though this deploy path never copied it.

P2 Badge Preserve existing target qemu binary on native path

When native armhf mode is active, this early return skips the preservation step for an existing ${chroot_target}/usr/bin/qemu-arm-static. The later undeploy_qemu_binary_from_chroot still removes any file at that path, so a cache-hit/image that legitimately contains qemu-user-static will have its package-owned binary deleted instead of restored. The native fast path still needs to record that no host binary was deployed, or make undeploy a no-op for files it did not copy.

P2 Badge Do not undeploy qemu when deployment was skipped

When native armhf is active on the cache-hit/image path, this early return skips the backup step that normally protects an already-installed /usr/bin/${QEMU_BINARY} in the rootfs. The matching undeploy still removes any existing file if it is present, so images whose cached rootfs legitimately contains qemu-user-static/qemu-user-binfmt lose that package-owned binary even though this build never deployed it.
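One way to address the comments above, sketched under assumptions: leave a marker whenever the deploy step really copies a host binary, and have undeploy remove only files that carry that marker. The `demo_*` function names, the `.armbian.deployed` suffix, and the argument layout are hypothetical and do not come from the PR. The sketch also leaves out the existing `.armbian.orig` backup of a target-owned binary.

```shell
#!/bin/sh
# Hypothetical sketch: the names and the ".armbian.deployed" suffix are illustrative only.

# Copy the host binary to the destination and leave a marker, unless deployment is skipped.
demo_deploy() {
	local src="$1" dst="$2" skip="${3:-no}"
	if [ "${skip}" = "yes" ]; then
		return 0 # skipped (e.g. native mode): no copy, no marker
	fi
	cp -- "${src}" "${dst}"
	: > "${dst}.armbian.deployed" # marker: this build placed the file at ${dst}
}

# Remove the binary only when the marker shows that this build deployed it.
demo_undeploy() {
	local dst="$1"
	if [ ! -f "${dst}.armbian.deployed" ]; then
		return 0 # nothing was deployed by us; leave any target-owned file alone
	fi
	rm -f -- "${dst}" "${dst}.armbian.deployed"
}
```

With this shape, a skipped deploy followed by undeploy is a no-op, so a package-owned binary already present in the rootfs is left in place.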

fi

# Source: try the historical name first (qemu-<arch>-static), fall back
# to the bare name shipped by Ubuntu resolute's qemu-user-binfmt package
# (e.g. /usr/bin/qemu-aarch64).
@@ -76,8 +83,19 @@ function undeploy_qemu_binary_from_chroot() {
declare dst_target_bkp="${dst_target}.armbian.orig"
declare dst_target_alt_bkp="${dst_target_alt}.armbian.orig"

# Check the binary we deployed is there. If not, panic, as we've lost control.
# Check the binary we deployed is there. Two reasons it might be missing:
# 1. ARMBIAN_NATIVE_ARMHF_VIA_BINFMT_ELF was active when the matching deploy
# ran, so nothing was copied — graceful no-op.
# 2. Genuine state loss — panic, we lost control.
# We must NOT skip the removal solely on the native-armhf flag, because deploy
# may have run before that flag was set (rootfs-create deploys at line 134,
# native-armhf flips at line 149); skipping the undeploy in that case leaks
# the host's qemu-arm-static into the rootfs cache tarball.
if [[ ! -f "${dst_target}" ]]; then
if [[ "${ARMBIAN_NATIVE_ARMHF_VIA_BINFMT_ELF:-no}" == "yes" ]]; then
display_alert "Native armhf via binfmt_elf" "no qemu binary to remove during ${caller}" "debug"
return 0
fi
exit_with_error "Missing qemu binary during undeploy_qemu_binary_from_chroot from ${caller}"

P2 Badge Preserve target-owned qemu binaries when deploy was skipped

On the cache-hit/image path _native_armhf_setup_binfmt_elf runs before deploy_qemu_binary_to_chroot, so native mode makes deploy skip without creating any .armbian.orig marker. If the rootfs legitimately contains /usr/bin/${QEMU_BINARY} (for example it installs qemu-user-static), this missing-file-only shortcut does not fire and the later undeploy falls through to remove the target's own binary. Track whether this invocation actually deployed a host copy, or no-op all native-mode undeploys that were preceded by a skipped deploy.

fi

@@ -132,6 +150,168 @@ function prepare_host_binfmt_qemu() {
return 0
}

# Native armhf on aarch64 host: runtime-disable qemu-arm in binfmt_misc so 32-bit
# ARM ELF falls through to kernel binfmt_elf and runs natively via CONFIG_COMPAT
# (~12× faster than qemu emulation). Killswitch: NATIVE_ARMHF_ON_ARM64=no.
#
# Multi-build coordination is purely kernel-level: each builder holds LOCK_SH on
# /proc/sys/fs/binfmt_misc/qemu-arm; first-arrival `echo 0`, last-out (LOCK_EX-NB
# succeeds → no other SH holders) `echo 1`. No userspace state, no per-builder
# files. Trade-off: an admin's pre-existing `disabled` state is not preserved
# across the build window.

# Read the qemu-arm 'enabled' flag without touching it. Echoes one of:
# 1 — registered and enabled
# 0 — registered and disabled
# missing — not registered
function _native_armhf_observe_qemu_arm_state() {
if [[ ! -e /proc/sys/fs/binfmt_misc/qemu-arm ]]; then
echo "missing"
return 0
fi
if head -1 /proc/sys/fs/binfmt_misc/qemu-arm 2> /dev/null | grep -q '^enabled'; then
echo "1"
else
echo "0"
fi
}

function _native_armhf_setup_binfmt_elf() {
declare killswitch=no
case "${NATIVE_ARMHF_ON_ARM64:-auto}" in
no | never | disabled) killswitch=yes ;;
esac

# Killswitch path: still take SH-lock on qemu-arm so concurrent
# native-armhf builders detect us via EX-NB probe and refuse to switch
# qemu-arm off. Without this anchor an N-builder arriving mid-K-chroot
# would echo 0 and silently break K's qemu-arm-static routing.
if [[ "${killswitch}" == "yes" ]]; then

P2 Badge Gate the killswitch path to armhf-on-aarch64 builds

Because the killswitch branch runs before the ARCH==armhf and host-architecture checks, any build with NATIVE_ARMHF_ON_ARM64=no calls this path even when it is not an armhf-on-aarch64 build. For example, an unrelated arm64 or amd64 target on the same host will either take the qemu-arm SH lock and block native armhf builders unnecessarily, or exit here if a concurrent native builder already has qemu-arm disabled, even though that build does not need qemu-arm at all. Return before the killswitch handling unless the current build is actually the armhf/aarch64 case.
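A minimal sketch of the suggested ordering, assuming the target and host architectures are passed in as plain strings: the armhf-on-aarch64 gate runs first, and the killswitch is only consulted for builds that pass it. `demo_setup` and its printed labels are hypothetical stand-ins, not the PR's function.

```shell
#!/bin/sh
# Hypothetical sketch of "gate first, killswitch second"; not the PR's implementation.
demo_setup() {
	local target_arch="$1" host_arch="$2" mode="${3:-auto}"

	# Gate: builds that are not armhf-on-aarch64 return before any qemu-arm handling.
	if [ "${target_arch}" != "armhf" ] || [ "${host_arch}" != "aarch64" ]; then
		echo "not-applicable"
		return 1
	fi

	# Only builds that pass the gate reach the killswitch branch.
	case "${mode}" in
		no | never | disabled)
			echo "killswitch"
			return 1
			;;
	esac

	echo "native-candidate"
}
```

For example, `demo_setup arm64 aarch64 no` prints `not-applicable` and never reaches the killswitch branch, which is the behavior the comment asks for.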

if [[ -e /proc/sys/fs/binfmt_misc/qemu-arm ]] &&
{ exec {_native_armhf_emul_lock_fd}< /proc/sys/fs/binfmt_misc/qemu-arm; } 2> /dev/null &&
flock -s -n "${_native_armhf_emul_lock_fd}"; then

P2 Badge Wait or abort when the killswitch lock is contended

With NATIVE_ARMHF_ON_ARM64=no, if a native builder is briefly holding the exclusive flock while switching qemu-arm, this nonblocking shared flock fails and the killswitch build falls through to emulation without holding the SH anchor. The peer can then downgrade with qemu-arm disabled while this build later enters its chroot expecting qemu-arm-static routing, causing execs to fail; this path should wait/recheck or fail instead of continuing unanchored.

add_cleanup_handler trap_handler_native_armhf_release_emul_lock
display_alert "Native armhf via binfmt_elf" "killswitch active; emulation-mode SH-lock acquired (blocks concurrent native-armhf switchover)" "info"
else
[[ -n "${_native_armhf_emul_lock_fd:-}" ]] && exec {_native_armhf_emul_lock_fd}>&-
unset _native_armhf_emul_lock_fd
fi
return 1
fi

# Idempotent: callers in rootfs-create.sh and rootfs-image.sh invoke this
# from both the cache-miss and cache-hit paths.
[[ "${ARMBIAN_NATIVE_ARMHF_VIA_BINFMT_ELF:-no}" == "yes" ]] && return 0
[[ "${ARCH}" == "armhf" ]] || return 1
[[ "$(arch)" == "aarch64" ]] || return 1

# Pre-flight is unreliable when qemu-arm is enabled (it interprets the
# arch-test stub); the authoritative check is post-disable below.
if ! arch-test armhf > /dev/null 2>&1; then
display_alert "Native armhf via binfmt_elf" "arch-test pre-flight failed; falling back to qemu-arm-static emulation" "info"
return 1
fi

# qemu-arm not registered → native already active, no anchor needed.
if [[ ! -e /proc/sys/fs/binfmt_misc/qemu-arm ]]; then
display_alert "Native armhf via binfmt_elf" "qemu-arm not registered; native armhf already in effect" "info"
declare -g ARMBIAN_NATIVE_ARMHF_VIA_BINFMT_ELF=yes
return 0
fi

# Group-scoped 2>/dev/null: a bare `exec {fd}< file 2>/dev/null` would
# persistently redirect THIS shell's stderr (since exec without a command
# applies redirections to the current shell), silencing every later
# display_alert that writes to stderr.
if ! { exec {_native_armhf_lock_fd}< /proc/sys/fs/binfmt_misc/qemu-arm; } 2> /dev/null; then
display_alert "Native armhf via binfmt_elf" "cannot open binfmt_misc/qemu-arm; falling back to qemu emulation" "wrn"
return 1
fi

# EX-NB probe BEFORE acquiring our own SH (otherwise our own SH would
# block the probe — flock counts per-OFD, our two fds on the same file
# would interfere). EX-NB success means zero other SH-holders. Failure
# with state="1" identifies a killswitch K-builder holding the
# emulation-mode anchor — switching qemu-arm off would corrupt their
# chroot exec routing. Failure with state="0" is a peer N-builder in
# joiner-territory.
if flock -x -n "${_native_armhf_lock_fd}"; then
flock -u "${_native_armhf_lock_fd}"

P2 Badge Close the flock race before disabling qemu-arm

After the exclusive probe succeeds, unlocking here before the later shared lock acquisition leaves a window where a NATIVE_ARMHF_ON_ARM64=no builder can acquire its shared emulation lock; because shared locks are compatible, this builder will then also take a shared lock and still echo 0, disabling qemu-arm while the killswitch build is relying on it for chroot execution. The probe and the state change need to be atomic with respect to new shared-lock entrants.

P2 Badge Close the flock race before disabling qemu-arm

After the exclusive probe succeeds, unlocking here before the later shared lock acquisition leaves a window where a NATIVE_ARMHF_ON_ARM64=no builder can acquire its shared emulation lock. flock --help defines -s as shared, -x as exclusive, and -u as removing the lock; since shared locks are compatible with each other, this builder will then also take a shared lock and still echo 0, disabling qemu-arm while the killswitch build is relying on it for chroot execution. The probe and the state change need to be atomic with respect to new shared-lock entrants.
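The window described above can be reproduced on an ordinary file. This is a sketch, assuming the util-linux `flock` command is installed; the function name, the fixed descriptors 8 and 9, and the "A"/"B" labels are hypothetical. The two descriptors stand in for two builders, because each separate open of the file is an independent lock owner.

```shell
#!/bin/sh
# Hypothetical demo of the probe-then-unlock gap; requires flock(1) from util-linux.
demo_probe_gap() {
	local lockfile="$1"
	exec 8< "${lockfile}" # builder A: first open of the file
	exec 9< "${lockfile}" # builder B: second, independent open of the same file

	flock -x -n 8 && echo "A: exclusive probe succeeded"
	flock -u 8 # A drops the probe lock; the gap starts here
	flock -s -n 9 && echo "B: shared lock acquired in the gap"
	flock -s -n 8 && echo "A: shared lock acquired as well"

	exec 8<&- 9<&- # close both descriptors, releasing both locks
}
```

Calling `demo_probe_gap "$(mktemp)"` shows both shared locks being granted after the probe. Nothing in this sequence tells A that B arrived between the probe and A's own shared lock.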

elif [[ "$(_native_armhf_observe_qemu_arm_state)" == "1" ]]; then
exec {_native_armhf_lock_fd}>&-
unset _native_armhf_lock_fd
display_alert "Native armhf via binfmt_elf" "concurrent build holds emulation-mode lock (NATIVE_ARMHF_ON_ARM64=no)" "err"
exit_with_error "cannot enable native armhf: concurrent build with NATIVE_ARMHF_ON_ARM64=no holds emulation lock. Wait for it to finish or run on a separate host."
fi

if ! flock -s -w 30 "${_native_armhf_lock_fd}"; then
display_alert "Native armhf via binfmt_elf" "could not acquire shared flock on binfmt_misc/qemu-arm within 30s; falling back to qemu emulation" "wrn"
exec {_native_armhf_lock_fd}>&-
unset _native_armhf_lock_fd
return 1
fi

if [[ "$(_native_armhf_observe_qemu_arm_state)" == "1" ]]; then
if ! echo 0 > /proc/sys/fs/binfmt_misc/qemu-arm 2> /dev/null; then
display_alert "Native armhf via binfmt_elf" "could not disable qemu-arm (no CAP_SYS_ADMIN?); falling back to qemu-arm-static emulation" "wrn"
exec {_native_armhf_lock_fd}>&-
unset _native_armhf_lock_fd
return 1
fi
fi

# Register cleanup BEFORE the authoritative arch-test, so a failure
# there still releases the lock via the trap handler.
add_cleanup_handler trap_handler_native_armhf_restore_qemu_arm

P2 Badge Fix cleanup handler ordering before restoring qemu-arm

For an interrupted arm64→armhf build after this handler is registered, it runs before trap_handler_cleanup_rootfs_and_image, not after it: add_cleanup_handler prepends callbacks and run_cleanup_handlers iterates that array in order (lib/functions/logging/traps.sh:118-140). That means the native-armhf lock can be released and qemu-arm re-enabled while the chroot/container teardown is still pending, which is exactly the inherited-fd/descendant case this code says must be avoided and can route remaining target execs through qemu unexpectedly.

P2 Badge Fix cleanup handler ordering before restoring qemu-arm

When activation happens from the rootfs/image path rather than the earlier host-prep path, this handler is added after trap_handler_cleanup_rootfs_and_image and therefore runs before it: add_cleanup_handler prepends callbacks and run_cleanup_handlers iterates that array in order (lib/functions/logging/traps.sh:118-140). In that case an interrupted arm64→armhf build can release the native-armhf lock and re-enable qemu-arm while chroot/container teardown is still pending, which is exactly the inherited-fd/descendant case this code says must be avoided.

P1 Badge Restore qemu-arm after rootfs cleanup has killed children

This cleanup is registered with add_cleanup_handler, which prepends handlers, while the earlier rootfs cleanup handler that unmounts/kills chroot users remains later in the list and mount_chroot does not add another handler. On SIGINT or a failing chroot command, trap_handler_native_armhf_restore_qemu_arm therefore runs before rootfs cleanup; any still-running child that inherited the flock fd keeps the SH lock alive, the last-out EX probe fails, and qemu-arm can be left globally disabled after the build exits.

P1 Badge Restore qemu-arm after rootfs cleanup runs

This cleanup is registered after prepare_rootfs_build_params_and_trap has already registered trap_handler_cleanup_rootfs_and_image, and add_cleanup_handler prepends handlers, so this restore handler runs before the rootfs cleanup that unmounts/kills chroot work. On an interrupt while a chroot/container child still has the inherited flock fd open, the last-out LOCK_EX -n probe sees that inherited holder and skips re-enabling qemu-arm, leaving the host with arm emulation disabled after the build exits.

P1 Badge Restore qemu-arm after rootfs cleanup handlers

When a native armhf build is interrupted or fails while chroot work is still running, this cleanup is prepended after trap_handler_cleanup_rootfs_and_image, so run_cleanup_handlers executes it first (see lib/functions/logging/traps.sh lines 117-140). Because BSD flock locks are inherited by forked chroot/subshell processes, releasing this fd before the rootfs cleanup kills/unmounts those descendants makes the last-out LOCK_EX -n probe see the inherited shared lock and skip echo 1, leaving /proc/sys/fs/binfmt_misc/qemu-arm disabled on the host.
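The ordering these comments describe can be reproduced with a small stand-in. This is a sketch under the assumption, taken from the review comments, that registration prepends and the run loop iterates front to back. The `demo_*` names and the string-based handler list are hypothetical and are not the code in lib/functions/logging/traps.sh.

```shell
#!/bin/sh
# Hypothetical stand-in for a prepend-style cleanup handler list.
DEMO_HANDLERS=""

# Prepend the new handler, so the most recently registered one ends up first.
demo_add_cleanup_handler() {
	DEMO_HANDLERS="$1${DEMO_HANDLERS:+ ${DEMO_HANDLERS}}"
}

# Run the handlers front to back.
demo_run_cleanup_handlers() {
	local handler
	for handler in ${DEMO_HANDLERS}; do
		"${handler}"
	done
}

demo_cleanup_rootfs() { echo "rootfs cleanup"; }
demo_restore_qemu_arm() { echo "restore qemu-arm"; }

demo_add_cleanup_handler demo_cleanup_rootfs   # registered first
demo_add_cleanup_handler demo_restore_qemu_arm # registered later, lands at the front
demo_run_cleanup_handlers
```

This prints `restore qemu-arm` before `rootfs cleanup`: under prepend semantics the later registration runs first, which is the opposite of what the restore handler's ordering invariant needs.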

# Post-disable check is authoritative: arch-test now faces what the
# chroot exec will face. False-positive if host kernel lacks COMPAT_VDSO
# (see extensions/arm64-compat-vdso, PR #9284).
if ! arch-test armhf > /dev/null 2>&1; then
display_alert "Native armhf via binfmt_elf" "post-disable verification failed (host kernel lacks COMPAT_VDSO — see extensions/arm64-compat-vdso); restoring and falling back to emulation" "wrn"
trap_handler_native_armhf_restore_qemu_arm
return 1
fi

display_alert "Native armhf via binfmt_elf" "kernel $(uname -r), aarch64 host with COMPAT_VDSO; qemu-arm disabled, kernel binfmt_elf takes over" "info"
declare -g ARMBIAN_NATIVE_ARMHF_VIA_BINFMT_ELF=yes
return 0
}

# Killswitch path cleanup: just release the SH-lock fd. No state mutation,
# no last-out detection — the killswitch builder never wrote to qemu-arm.
function trap_handler_native_armhf_release_emul_lock() {
[[ -n "${_native_armhf_emul_lock_fd:-}" ]] || return 0
exec {_native_armhf_emul_lock_fd}>&-
unset _native_armhf_emul_lock_fd
}

# Cleanup ordering invariant: this handler must run AFTER cleanups that kill
# the build's subshells (umount / SDCARD / MOUNT teardown). BSD flock is per-
# OFD, so a forked subshell inheriting our SH-fd shares the same lock entry —
# the LOCK_EX-NB probe below would falsely block on the inherited fd of a
# still-alive child. add_cleanup_handler runs in registration order; the
# umount handlers register first, so by the time we run, the docker container
# is dead and its child-tree with it. Verified empirically (SIGINT mid-chroot).
function trap_handler_native_armhf_restore_qemu_arm() {
[[ -n "${_native_armhf_lock_fd:-}" ]] || return 0
exec {_native_armhf_lock_fd}>&-
unset _native_armhf_lock_fd
Comment on lines +392 to +393

P1 Badge Serialize restore before releasing native lock

When the last native armhf builder exits while a NATIVE_ARMHF_ON_ARM64=no builder is starting, closing the shared fd here creates a gap before the last_fd exclusive probe below. Per flock --help, -s takes a shared lock, -x takes an exclusive lock, and -n fails on conflict. In that gap the killswitch builder can acquire SH while qemu-arm is still 0, and this handler's EX-NB probe can then fail. That leaves no remaining owner to write echo 1, so qemu-arm is never restored. The current diff still closes _native_armhf_lock_fd before opening and probing last_fd, so the last-out transition is not serialized.

[[ -e /proc/sys/fs/binfmt_misc/qemu-arm ]] || return 0

# Group-scoped 2>/dev/null on the exec — see _native_armhf_setup_binfmt_elf.
declare last_fd
if ! { exec {last_fd}< /proc/sys/fs/binfmt_misc/qemu-arm; } 2> /dev/null; then
return 0
fi
if flock -x -n "${last_fd}"; then
echo 1 > /proc/sys/fs/binfmt_misc/qemu-arm 2> /dev/null || true
display_alert "Native armhf via binfmt_elf" "last out; qemu-arm restored to enabled" "info"
fi
exec {last_fd}>&-
}

# The actual binfmt manipulations when cross-build is confirmed above.
function prepare_host_binfmt_qemu_cross() {
local failed_binfmt_modprobe=0
@@ -179,6 +359,36 @@ function prepare_host_binfmt_qemu_cross() {
continue
fi

# Skip wanted_arch=arm preparation entirely when this build doesn't
# target armhf. The Apple-Silicon helper below mutates global kernel
# binfmt_misc/qemu-arm state, which is irrelevant for cross builds
# targeting amd64/riscv64/etc and would needlessly race with any
# concurrent native-armhf owner on the host.
if [[ "${host_arch}" == "aarch64" && "${wanted_arch}" == "arm" && "${ARCH}" != "armhf" ]]; then
display_alert "binfmt qemu-arm" "skipped: target ARCH=${ARCH} doesn't need qemu-arm" "debug"
continue
fi

# Early native-armhf claim. On aarch64 host targeting armhf, try to
# become or join the native-armhf-via-binfmt_elf owner BEFORE the
# Apple-Silicon special branch below. The latter mutates global kernel
# binfmt_misc state via update-binfmts, which races against another
# concurrent build that holds qemu-arm in its disabled state. Joining
# (or becoming first) keeps qemu-arm disabled coherently and lets
# /usr/share/binfmts/qemu-arm absence in this container be a non-issue.
if [[ "${host_arch}" == "aarch64" && "${wanted_arch}" == "arm" && "${ARCH}" == "armhf" ]]; then
if _native_armhf_setup_binfmt_elf; then

P1 Badge Don’t switch to native mode before mmdebstrap

On an aarch64 armhf cache-miss build with the normal PRE_PREPARED_HOST!=yes flow, prepare_host runs before create_new_rootfs_cache_via_debootstrap, so this early call disables qemu-arm and sets ARMBIAN_NATIVE_ARMHF_VIA_BINFMT_ELF=yes before the rootfs exists. The subsequent rootfs creation then skips deploy_qemu_binary_to_chroot at line 134 while running mmdebstrap --arch=armhf; this contradicts the later activation point added after mmdebstrap and leaves the bootstrap phase without the qemu registration/binary it relies on. Keep host preparation in emulation mode and only call _native_armhf_setup_binfmt_elf after mmdebstrap has populated libc/ld-linux.

display_alert "binfmt qemu-arm" "skipped: native armhf via binfmt_elf is active" "cachehit"
continue
fi
# qemu-arm disabled means another builder native-owns it; route
# through the guard so we fail fast instead of clobbering.
if [[ "$(_native_armhf_observe_qemu_arm_state)" == "0" ]]; then
prepare_host_binfmt_qemu_cross_arm64_host_armhf_target
continue
fi
fi

if [[ ! -e "/proc/sys/fs/binfmt_misc/qemu-${wanted_arch}" || ! -e "/usr/share/binfmts/qemu-${wanted_arch}" ]]; then
display_alert "Updating binfmts" "update-binfmts --enable qemu-${wanted_arch}" "debug"

@@ -193,6 +403,22 @@ function prepare_host_binfmt_qemu_cross() {
}

function prepare_host_binfmt_qemu_cross_arm64_host_armhf_target() {
# Conservative guard: refuse to mutate global qemu-arm state if it is
# observably disabled. That state means another concurrent armbian build
# is using the native-armhf path and we'd clobber it by re-enabling
# qemu-arm here. (Reachable only via NATIVE_ARMHF_ON_ARM64=no/never/
# disabled opt-out — otherwise _native_armhf_setup_binfmt_elf would have
# already exit'd with the "concurrent native-armhf build" error before
# we got here.)
if [[ -e /proc/sys/fs/binfmt_misc/qemu-arm ]]; then
declare observed_qemu_arm
observed_qemu_arm="$(_native_armhf_observe_qemu_arm_state)"
if [[ "${observed_qemu_arm}" == "0" ]]; then
display_alert "binfmt qemu-arm" "registered but observably disabled — another concurrent build likely holds native-armhf; refusing to clobber" "err"
exit_with_error "qemu-arm globally disabled by another concurrent build; cannot safely re-enable. Wait for it to finish or run on a separate host."
fi
fi

display_alert "Trying to update binfmts - aarch64 mostly does 32-bit sans emulation, but Apple said no" "update-binfmts --enable qemu-${wanted_arch}" "debug"
run_host_command_logged update-binfmts --enable "qemu-${wanted_arch}" "&>" "/dev/null" "||" "true" # don't fail nor produce output, which can be misleading.

@@ -201,12 +427,13 @@ function prepare_host_binfmt_qemu_cross_arm64_host_armhf_target() {
run_host_command_logged arch-test "||" true
fi

# to check, we use arch-test; if will return 0 if _either_ the host can natively run armhf, or if qemu-arm is correctly working.
if arch-test arm; then
# to check, we use arch-test; will return 0 if _either_ the host can natively run armhf, or if qemu-arm is correctly working.
# Use armhf (Debian-arch) rather than arm to match the build target and the post-disable check in _native_armhf_setup_binfmt_elf.
if arch-test armhf; then
display_alert "Host can run armhf natively or emulation is correctly setup already" "no need to enable qemu-arm" "debug"
else
display_alert "arm64 host can't run armhf natively" "importing enabling qemu-arm" "debug"
cat <<-BINFMT_ARM_MAGIC >/usr/share/binfmts/qemu-arm
cat <<- BINFMT_ARM_MAGIC > /usr/share/binfmts/qemu-arm
package qemu-user-static
interpreter /usr/bin/qemu-arm-static
magic \x7f\x45\x4c\x46\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28\x00
@@ -221,7 +448,7 @@ function prepare_host_binfmt_qemu_cross_arm64_host_armhf_target() {

# Test again using arch-test.
display_alert "Checking if arm 32-bit emulation on arm64 works after enabling" "qemu-arm emulation" "info"
run_host_command_logged arch-test arm
run_host_command_logged arch-test armhf
display_alert "arm 32-bit emulation on arm64" "has been correctly setup" "cachehit"
fi
}
9 changes: 7 additions & 2 deletions lib/functions/rootfs/rootfs-create.sh
@@ -61,7 +61,7 @@ function create_new_rootfs_cache_via_debootstrap() {
local debootstrap_apt_mirror="http://localhost:3142/${APT_MIRROR}"
acng_check_status_or_restart
;;
no) ;& # do nothing, fallthrough
no) ;& # do nothing, fallthrough
"")
: # still do nothing
;; # stop falling
@@ -139,9 +139,14 @@ function create_new_rootfs_cache_via_debootstrap() {

skip_target_check="yes" local_apt_deb_cache_prepare "for mmdebstrap" # just for size reference in logs


[[ ! -f "${SDCARD}/bin/bash" ]] && exit_with_error "mmdebstrap did not produce /bin/bash"

# mmdebstrap done, libc/ld-linux are in ${SDCARD}. Disable qemu-arm in binfmt_misc
# so subsequent chroot apt-get/dpkg/customize calls fall through to kernel binfmt_elf
# and run 32-bit ARM ELF natively via CONFIG_COMPAT. mmdebstrap above used qemu-arm
# because its cross-arch path requires that registration to be present.
_native_armhf_setup_binfmt_elf || true

P1 Badge Keep qemu-arm enabled while peers bootstrap

A concurrent armhf build can still be inside the mmdebstrap command above when this build reaches the post-mmdebstrap switch and disables the global qemu-arm registration. Those bootstrapping peers have not called _native_armhf_setup_binfmt_elf yet, so they hold no SH lock or killswitch anchor, but their cross-arch mmdebstrap path still depends on qemu-arm; the first cache-miss build to finish bootstrap can therefore break another cache-miss build that is still running mmdebstrap.

P1 Badge Anchor qemu-arm before mmdebstrap can be interrupted

When two armhf builds run on the same aarch64 host, this disables the global qemu-arm registration as soon as one build finishes mmdebstrap, but another build may still be inside the mmdebstrap call above and still needs that registration for its maintainer-script execution. Because no emulation SH-lock is held before/during mmdebstrap, the first finisher can switch the host to native mode and break the peer's bootstrap. The native-owner transition needs to be serialized against builders that have not yet left mmdebstrap, not only after this line.

P1 Badge Protect peer mmdebstrap before disabling qemu-arm

In concurrent armhf cache-miss builds, one build can reach this post-mmdebstrap setup and globally disable qemu-arm while another build is still inside its own run_host_command_logged "${debootstrap_bin}" step above. That peer has already deployed the qemu binary but has not reached the point where libc/ld-linux are guaranteed usable natively, so its remaining mmdebstrap maintainer-script executions can suddenly stop going through qemu-arm and fail mid-bootstrap.

# Done with mmdebstrap. Clean-up its litterbox.
display_alert "Cleaning up after mmdebstrap" "mmdebstrap cleanup" "info"
run_host_command_logged rm -rf "${SDCARD}/var/cache/apt" "${SDCARD}/var/lib/apt/lists"