Skip to content

fix(rootfs/qemu-static): use direct armhf exec probe instead of broken arch-test gate#9820

Open
iav wants to merge 1 commit into
armbian:mainfrom
iav:fix/native-armhf-detection
Open

fix(rootfs/qemu-static): use direct armhf exec probe instead of broken arch-test gate#9820
iav wants to merge 1 commit into
armbian:mainfrom
iav:fix/native-armhf-detection

Conversation

@iav
Copy link
Copy Markdown
Contributor

@iav iav commented May 14, 2026

Problem

prepare_host_binfmt_qemu_cross_arm64_host_armhf_target (added in 6755e9190, 2024-12-28, for Apple Silicon) gates on arch-test arm. On success → skip qemu-arm import; on failure → write a hand-rolled /usr/share/binfmts/qemu-arm magic with interpreter /usr/bin/qemu-arm-static and enable it.

This is broken on most modern aarch64 hosts:

  • arch-test arm probes ARMv5 EABI. Current toolchains and the kernel's COMPAT layer target armhf (EABI v5+); the two surfaces diverge.
  • On Ubuntu Noble 6.8 / Ampere Altra (Hetzner CAX), arch-test arm returns failure even with CONFIG_COMPAT=y and a fully-working COMPAT path. The gate then unconditionally routes every aarch64 host through the Apple-Silicon import branch — qemu-arm gets registered, mmdebstrap and chroot apt-get / dpkg --configure / customize_image run through qemu-user-static emulation, ~10× slower than native.
  • The unconditional update-binfmts --enable qemu-arm at function entry could re-enable a pre-existing registration even on hosts the operator deliberately wanted to keep without qemu.

Codex (iav#117 P1 round 1) also flagged: the hand-rolled descriptor overwrites a packaged /usr/share/binfmts/qemu-arm (resolute's qemu-user-binfmt ships /usr/bin/qemu-arm without -static) → enable fails → armhf builds break on resolute Apple-Silicon-like hosts.

Empirically: helios4 BUILD_MINIMAL=yes RELEASE=trixie PREFER_DOCKER=yes finishes in 1:07 min on Hetzner CAX21 when qemu-arm stays unregistered (mmdebstrap native via binfmt_elf), vs ~10× longer when emulation routes mmdebstrap.

Fix

Replace the single arch-test arm gate with a 4-tier detection that picks the safest path on each host:

  1. /usr/share/binfmts/qemu-arm exists (packaged by qemu-user-binfmt on resolute, or by qemu-user-static install): update-binfmts --enable qemu-arm — loads the packaged descriptor into the kernel. Preserves the packaged interpreter path. Hard-error if enable fails (descriptor present but broken — host needs operator attention).
  2. /proc/sys/fs/binfmt_misc/qemu-arm exists without descriptor: trust if enabled, hard-error if disabled (no descriptor to re-enable from).
  3. No registration: probe with a direct exec /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 --help (ld-linux-armhf is shipped by gcc-arm-linux-gnueabihf, an armbian aarch64 host build dep). Authoritative test of CONFIG_COMPAT functionality. If succeeds → kernel runs armhf natively → no qemu setup needed.
  4. Native probe fails (Apple Silicon / no COMPAT): hand-roll /usr/share/binfmts/qemu-arm magic with /usr/bin/qemu-arm-static, import, enable — the original behaviour, now only reached when actually needed.

Also removed the unconditional update-binfmts --enable qemu-arm "|| true" at function entry — its job is subsumed by step (1).

Empirical validation

  • Native aarch64 (CAX21 / Ampere Altra, Ubuntu Noble 6.8): probe succeeds, no qemu-arm registered, helios4 trixie minimal builds 0:57 (PREFER_DOCKER=no) / 1:07 (PREFER_DOCKER=yes), mmdebstrap native.
  • No-CONFIG_COMPAT kernel (custom helios64 build with opts_n+=("COMPAT" "COMPAT_VDSO" "ARM64_32BIT_EL0")): ld-linux-armhf.so.3 execs with Exec format error; probe correctly fails → step (4) fires.
  • Pre-existing qemu-arm registration (manually added /usr/share/binfmts/qemu-arm with /usr/bin/qemu-arm-static): step (1) early-returns, descriptor md5 unchanged before/after, kernel entry left enabled.

Test plan

  • Hetzner CAX21 Noble 6.8 — native aarch64 with COMPAT, mmdebstrap native, helios4 image OK.
  • helios64 no-COMPAT kernel — ld-linux-armhf exec → "Exec format error", step (4) reachable.
  • Manual qemu-arm registration → step (1) preserves descriptor + enabled state, no clobber.
  • Resolute aarch64 with qemu-user-binfmt package — covered by inspection (step 1 enables packaged descriptor). No resolute Apple Silicon hardware available; happy to defer to community verification.

Assisted-by: Claude:claude-opus-4.7

Summary by CodeRabbit

  • Bug Fixes
    • Improved reliability of ARM cross-compilation setup with a more robust verification and initialization process.
    • Enhanced error messaging to provide clearer guidance when configuration issues occur.
    • Optimized detection of native vs. emulated execution capabilities to streamline setup procedures.

Review Change Stack

@iav iav requested a review from a team as a code owner May 14, 2026 00:19
@iav iav requested review from ColorfulRhino and schwar3kat and removed request for a team May 14, 2026 00:19
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 510f5594-8af7-4603-964d-c337739a202c

📥 Commits

Reviewing files that changed from the base of the PR and between d5b76d4 and 4be6288.

📒 Files selected for processing (1)
  • lib/functions/rootfs/qemu-static.sh

📝 Walkthrough

Walkthrough

The function probes for native armhf execution, prefers a packaged qemu-arm binfmt descriptor or an enabled kernel binfmt entry, and only writes/imports and enables /usr/share/binfmts/qemu-arm when needed; post-setup arch-test calls are debug-only.

Changes

QEMU ARMhf native capability probe and setup

Layer / File(s) Summary
Trust existing enabled binfmt_misc entry
lib/functions/rootfs/qemu-static.sh
Return early when /proc/sys/fs/binfmt_misc/qemu-arm exists and its state is enabled.
Probe armhf dynamic loader for native support
lib/functions/rootfs/qemu-static.sh
Execute /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 --help as an execution probe; if it succeeds, return without configuring qemu-arm.
Enable packaged qemu-arm descriptor
lib/functions/rootfs/qemu-static.sh
If /usr/share/binfmts/qemu-arm exists, run update-binfmts --enable qemu-arm, sync the kernel entry, and abort with reinstall/remove-and-retry instructions if it does not become enabled.
Abort when kernel entry disabled but no packaged descriptor
lib/functions/rootfs/qemu-static.sh
If kernel qemu-arm exists but is disabled and no packaged descriptor is present, echo explicit instructions and abort with a retry message.
Write, import and enable qemu-arm descriptor
lib/functions/rootfs/qemu-static.sh
On probe failure and no re-enable option, write /usr/share/binfmts/qemu-arm (interpreter /usr/bin/qemu-arm-static with ELF magic/mask/flags), then update-binfmts --import and --enable it.
Debug-only arch-test and final log
lib/functions/rootfs/qemu-static.sh
Keep a debug-only arch-test invocation that discards failures and update the final log to state arm emulation was set up via qemu-arm.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I poke ld-linux to see if it hops,
If kernels can run armhf without props,
If not, I scribble a binfmt line,
Import and enable, then all should align—
A rabbit hums while cross-builds top.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: replacing an arch-test gate with direct armhf exec probe for qemu-static setup.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added size/medium PR with more then 50 and less then 250 lines 05 Milestone: Second quarter release Needs review Seeking for review Framework Framework components labels May 14, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
lib/functions/rootfs/qemu-static.sh (1)

232-233: 💤 Low value

Hardcode qemu-arm instead of inheriting wanted_arch from caller scope.

This function's name pins the target arch (armhf_target), and the only call site (line 187) enters when wanted_arch == "arm". Relying on the caller's loop variable here couples the helper to a particular invocation context without any guard — a future caller or refactor could silently produce qemu- with no arch suffix. Using the literal string also matches the heredoc above (line 222) which is already arm-specific.

♻️ Suggested change
-	run_host_command_logged update-binfmts --import "qemu-${wanted_arch}"
-	run_host_command_logged update-binfmts --enable "qemu-${wanted_arch}"
+	run_host_command_logged update-binfmts --import "qemu-arm"
+	run_host_command_logged update-binfmts --enable "qemu-arm"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/functions/rootfs/qemu-static.sh` around lines 232 - 233, The helper
function armhf_target currently calls run_host_command_logged with
"qemu-${wanted_arch}", inheriting wanted_arch from caller scope; change these
two calls to use the literal "qemu-arm" instead of interpolating wanted_arch so
the helper is self-contained and matches the arm-specific heredoc and intent.
Update both run_host_command_logged update-binfmts --import and --enable
invocations to use "qemu-arm" and ensure no other references inside armhf_target
rely on wanted_arch.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@lib/functions/rootfs/qemu-static.sh`:
- Around line 232-233: The helper function armhf_target currently calls
run_host_command_logged with "qemu-${wanted_arch}", inheriting wanted_arch from
caller scope; change these two calls to use the literal "qemu-arm" instead of
interpolating wanted_arch so the helper is self-contained and matches the
arm-specific heredoc and intent. Update both run_host_command_logged
update-binfmts --import and --enable invocations to use "qemu-arm" and ensure no
other references inside armhf_target rely on wanted_arch.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0f4f347d-a3de-47de-91fe-fa6e149fa4d8

📥 Commits

Reviewing files that changed from the base of the PR and between 761d04e and 4a8209d.

📒 Files selected for processing (1)
  • lib/functions/rootfs/qemu-static.sh

@iav iav force-pushed the fix/native-armhf-detection branch 2 times, most recently from 934a2f0 to d5b76d4 Compare May 14, 2026 02:15
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
lib/functions/rootfs/qemu-static.sh (1)

218-226: 💤 Low value

Probe path is gated solely on the cross-toolchain loader, missing fallback for foreign architecture setups.

The probe at line 222 checks /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 (installed by gcc-arm-linux-gnueabihf). If this cross-toolchain package is absent—possible in pre-prepared hosts or manual foreign architecture setups—the probe fails silently and falls through to write a hand-rolled qemu-arm descriptor and force emulation, defeating this PR's optimization.

In normal builds, gcc-arm-linux-gnueabihf is installed during host preparation when the target includes armhf, so this is not a typical correctness issue. However, hosts with armhf added as a foreign dpkg architecture (without the cross-toolchain) could use the alternative loader path /lib/arm-linux-gnueabihf/ld-linux-armhf.so.3 to detect native capability without additional packages. Consider probing both paths to support this setup, or at minimum document the gcc-arm-linux-gnueabihf dependency at the call site so users debugging slow builds can identify whether they need to install it.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/functions/rootfs/qemu-static.sh` around lines 218 - 226, The armhf kernel
probe relies only on /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 via the
armhf_probe check and will miss hosts that have armhf as a foreign dpkg
architecture where the loader lives at
/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3; update the probe logic used around
the armhf_probe variable to check both paths
("/usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3" and
"/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3") and invoke the same executable
--help test for whichever exists, falling back to the existing qemu-arm
descriptor logic only if neither path yields an executable; alternatively add a
clear comment/log entry (via display_alert) documenting the
gcc-arm-linux-gnueabihf dependency if you prefer not to add the second-path
probe.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@lib/functions/rootfs/qemu-static.sh`:
- Around line 218-226: The armhf kernel probe relies only on
/usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 via the armhf_probe check and
will miss hosts that have armhf as a foreign dpkg architecture where the loader
lives at /lib/arm-linux-gnueabihf/ld-linux-armhf.so.3; update the probe logic
used around the armhf_probe variable to check both paths
("/usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3" and
"/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3") and invoke the same executable
--help test for whichever exists, falling back to the existing qemu-arm
descriptor logic only if neither path yields an executable; alternatively add a
clear comment/log entry (via display_alert) documenting the
gcc-arm-linux-gnueabihf dependency if you prefer not to add the second-path
probe.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 688ab1df-44ba-4174-9a39-f917a6a0ebf8

📥 Commits

Reviewing files that changed from the base of the PR and between 934a2f0 and d5b76d4.

📒 Files selected for processing (1)
  • lib/functions/rootfs/qemu-static.sh

@iav iav force-pushed the fix/native-armhf-detection branch from d5b76d4 to 4be6288 Compare May 14, 2026 02:49
@iav
Copy link
Copy Markdown
Contributor Author

iav commented May 14, 2026

Re: nitpick "Probe path is gated solely on the cross-toolchain loader, missing fallback for foreign architecture setups."

Verified: gcc-arm-linux-gnueabihf is added to host_dependencies in lib/functions/host/prepare-host.sh:258-260 for any build with wanted_arch ∈ {armhf, all}, and install_host_dependencies runs (line 77 of prepare_host_noninteractive) before prepare_host_binfmt_qemu (line 98). The probe binary /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 is therefore guaranteed to be in place when this helper fires in the normal flow.

The "absent" case the nitpick raises applies only when the operator sets PRE_PREPARED_HOST=yes (or equivalent), which is an explicit opt-out from armbian's host-dep management — the admin is on the hook for installing the toolchain that matches their target arch. If we wanted to be helpful in that path, a clearer message would be a separate concern (a check around assert_prepared_host rather than in the binfmt helper). Leaving this PR scoped to the original gate-replacement.

@iav
Copy link
Copy Markdown
Contributor Author

iav commented May 14, 2026

@rpardini

@igorpecovnik
Copy link
Copy Markdown
Member

Should we merge this after 05 release?

@igorpecovnik igorpecovnik added 08 Milestone: Third quarter release and removed 05 Milestone: Second quarter release labels May 16, 2026
@iav
Copy link
Copy Markdown
Contributor Author

iav commented May 16, 2026

I think it could be done earlier.
But of course I don't see the whole situation.

@igorpecovnik
Copy link
Copy Markdown
Member

Depends how well is this tested? I want as less surprises as possible when hitting the "build all" button.

@iav
Copy link
Copy Markdown
Contributor Author

iav commented May 16, 2026

Indeed — my testing surface doesn't cover CI or the big builder fleet. Agree on holding the merge until builder hiccups can't threaten a release cut.

@igorpecovnik
Copy link
Copy Markdown
Member

Agree on holding the merge until builder hiccups can't threaten a release cut.

Yep. After release we can afford to have broken CI for days while before its really not needed :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

08 Milestone: Third quarter release Framework Framework components Needs review Seeking for review size/medium PR with more then 50 and less then 250 lines

Development

Successfully merging this pull request may close these issues.

2 participants