Fix first run of elemental failing in Macbook podman #406
@@ -22,15 +22,18 @@ import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"regexp"
	"runtime"
	"strings"
	"text/template"

	"github.com/suse/elemental/v3/pkg/block/lsblk"
	"github.com/suse/elemental/v3/pkg/deployment"
	"github.com/suse/elemental/v3/pkg/sys"
	"github.com/suse/elemental/v3/pkg/sys/vfs"
	"golang.org/x/sys/unix"
)

const (
@@ -190,6 +193,8 @@ func runSystemdRepart(s *sys.System, target string, parts []Partition, flags ...string) error {
// the optional given flags. On success it parses systemd-repart output to get the generated partition UUIDs and update the
// given partitions list with them.
func runSystemdRepart(s *sys.System, target string, parts []Partition, flags ...string) error {
	setupLoopDeviceNodes()

	dir, err := vfs.TempDir(s.FS(), "", "elemental-repart.d")
	if err != nil {
		return fmt.Errorf("failed creating a temporary directory for systemd-repart configuration: %w", err)
@@ -289,3 +294,15 @@ func readOnlyPart(part *deployment.Partition) string {
	}
	return ""
}

// setupLoopDeviceNodes creates 4 loop device nodes for systemd-repart to use at runtime. This is only necessary on arm64
// podman containers as, on first run, the container does not have enough loop device nodes available for systemd-repart to work.
// This is not an issue in amd64 containers because enough loop device nodes are automatically available.
func setupLoopDeviceNodes() {
	if os.Getenv("container") != "" && runtime.GOARCH == "arm64" {
dirkmueller marked this conversation as resolved.
		for i := 0; i < 4; i++ {
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why 4? To my understanding this should be as big as the length of the partitions slice.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 4 is an arbitrary number that just allows it to work, based on the default settings (not sure how it works with further customization) I believe it only needs 1 device node |
			devPath := fmt.Sprintf("/dev/loop%d", i)
			_ = unix.Mknod(devPath, unix.S_IFBLK|0660, 7*256+i)
Reviewer: Out of curiosity, what happens if the device already exists? Does it error out or overwrite it? It would probably be less intrusive to create only the missing ones.

Author: I believe it silently fails without any issue.
		}
	}
}
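As a rough illustration of the reviewer's suggestion to create only the missing nodes, here is a standalone sketch. It is not part of the PR: it uses the stdlib syscall package instead of golang.org/x/sys/unix, and names such as ensureLoopNodes and loopDev are hypothetical. The minor-number encoding 7*256+i matches the PR's Mknod call (loop devices use major number 7, and for minors below 256 the device number is major*256+minor).

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// loopDev encodes the Linux device number for /dev/loopN:
// major 7 (the loop driver), minor n. For minors < 256 this
// is the same value as the PR's 7*256+i expression.
func loopDev(n int) int {
	return 7*256 + n
}

// ensureLoopNodes creates /dev/loop0 .. /dev/loop(count-1), skipping
// any node that already exists, so repeated calls stay non-intrusive.
// Creating block device nodes requires root (or CAP_MKNOD).
func ensureLoopNodes(count int) error {
	for i := 0; i < count; i++ {
		path := fmt.Sprintf("/dev/loop%d", i)
		if _, err := os.Stat(path); err == nil {
			continue // node already present, leave it alone
		}
		err := syscall.Mknod(path, syscall.S_IFBLK|0660, loopDev(i))
		if err != nil && !os.IsExist(err) {
			return fmt.Errorf("mknod %s: %w", path, err)
		}
	}
	return nil
}

func main() {
	// Show the encoded device numbers; actually creating the
	// nodes is only attempted when running privileged.
	fmt.Println(loopDev(0), loopDev(3))
	if os.Geteuid() == 0 {
		if err := ensureLoopNodes(4); err != nil {
			fmt.Fprintln(os.Stderr, err)
		}
	}
}
```

Checking for an existing node before calling Mknod also answers the reviewer's question directly: a bare Mknod on an existing path fails with EEXIST, which the PR's code discards by ignoring the returned error.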
Reviewer: I'd say we can simply pass the partitions number as an argument here.

Author: I don't think that's entirely necessary, as I don't believe they are tied or linked to each other. But I will hijack this comment to outline a comprehensive view of the flow and the issue.

On a Mac, if we do podman machine ssh and then run ls -la /dev/loop*, this is the output we see: [output not captured]

Now if we enter an elemental3 container that we built like so, so that we have shell access:

    podman run -it --privileged \
        --entrypoint /bin/sh \
        -v $PWD/examples/elemental/customize/linux-only/:/config \
        -v /run/podman/podman.sock:/var/run/docker.sock \
        local/elemental-image:v3.0.0-alpha.20251212-g93d3598

and then we run ls -la /dev/loop*, this is what we see:

    sh-5.3# ls -la /dev/loop*
    crw-rw---- 1 root 6 10, 237 Apr 8 15:01 /dev/loop-control

So by default, no loop device nodes are created inside of the podman virtual machine on Macs.

Now, if within that same container we run elemental3 --debug customize --type raw --local, we will fail with the known issue: [error output not captured]

After this failure, let's check the podman vm and container states: [output not captured]

What's interesting here is that /dev/loop0 is present within the podman vm, but not within the container. However, if we start a new container using the same command as before:

    sh-5.3# ls -la /dev/loop*
    crw-rw---- 1 root 6 10, 237 Apr 8 15:06 /dev/loop-control
    brw-rw---- 1 root 6 7, 0 Apr 8 15:06 /dev/loop0

we see that /dev/loop0 is present in the container, elemental will succeed, and /dev/loop0 will remain on the container:

    DEBU[0075] systemd-repart output to parse: [ { ... }, { ... } ]
    INFO[0097] Customize complete
    DEBU[0097] Cleaning up working directory
    sh-5.3# ls -la /dev/loop*
    crw-rw---- 1 root 6 10, 237 Apr 8 15:06 /dev/loop-control
    brw-rw---- 1 root 6 7, 0 Apr 8 15:09 /dev/loop0

But it is no longer present in the podman vm: [output not captured]

However, if you rerun elemental within that same container, it will continue to succeed indefinitely: even though the loop device node is cleaned up from the podman vm, it remains on the current container.

So I believe what's happening is: [steps not captured]

The proposed solution bypasses the need for the loop device to already pre-exist on the podman vm when the container is first run, and also the need for it to be repopulated each time for each new container.

This is why I don't think the loop device nodes are tied to the partitions or anything specific. I chose 4 as an arbitrary number, but it could be more or less; I think even 1 should work. The only condition I haven't checked is whether there are instances where systemd-repart might need more than one loop device node.

Reviewer: I see, I had the impression a loop device was used in each partition. I guess we can leave it as is then.
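For background on the behavior described above (where /dev/loop0 appears in the podman vm only after a run), the Linux loop driver allocates device nodes on demand through /dev/loop-control with the LOOP_CTL_GET_FREE ioctl. The following standalone probe is illustrative only and not part of the PR; the function name freeLoopIndex is hypothetical, while the ioctl constant comes from Linux's include/uapi/linux/loop.h.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// LOOP_CTL_GET_FREE asks the loop driver for the index of a free loop
// device, allocating a new one if none exists (linux/loop.h).
const LOOP_CTL_GET_FREE = 0x4C82

// freeLoopIndex returns the first free loop device index. On a host with
// devtmpfs, this ioctl is what causes a /dev/loopN node to materialize.
func freeLoopIndex() (int, error) {
	ctl, err := os.OpenFile("/dev/loop-control", os.O_RDWR, 0)
	if err != nil {
		return -1, err // loop-control missing or not permitted
	}
	defer ctl.Close()
	idx, _, errno := syscall.Syscall(syscall.SYS_IOCTL, ctl.Fd(), LOOP_CTL_GET_FREE, 0)
	if errno != 0 {
		return -1, errno
	}
	return int(idx), nil
}

func main() {
	if idx, err := freeLoopIndex(); err != nil {
		fmt.Fprintln(os.Stderr, "loop-control unavailable:", err)
	} else {
		fmt.Printf("next free loop device: /dev/loop%d\n", idx)
	}
}
```

This also suggests why the fix works: mknod'ing the nodes up front inside the container sidesteps the question of whether the allocation made in the podman vm is visible to a freshly started container.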