Re: [boot-time]

On 1/12/25 04:11, Marko Hoyer wrote:
On 12.01.25 at 02:03, Rob Landley wrote:
On 1/11/25 12:57, Bird, Tim wrote:
Hey Rob, This is a great review of /dev, /sys and the different
ways that /dev gets populated.

Feel free to link stuff from wikis or some such. The newest of those documents was written in 2007.

For a lot of embedded Linux devices, the only bus where
new items can show up dynamically is USB.

SDCARD readers connected via MMC are common in automotive head units as well ...

But do they give an insertion/removal notification that can generate an interrupt rather than needing to be polled? (Last couple of boards I poked at didn't, but it was cheap hardware...)

When a driver DOESN'T automatically bind to them it gets a bit complicated, and one of the things mdev can be configured to do is act as a firmware loader! Which is just... Ahem, there are YEARS of poor design decisions the kernel guys made, where they ignored a mechanism they already had and implemented something more complicated. The mechanism whereby the kernel opens a firmware file and reads it directly out of the filesystem instead of calling a hotplug helper was... I'm just going to gloss over that.
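For reference, a minimal sketch of what such a hotplug-helper firmware loader looks like, assuming a kernel that still offers the userspace fallback interface (the "loading" and "data" files under the requesting device in sysfs). The function name load_firmware and its two arguments are made up for illustration; DEVPATH and FIRMWARE are the environment variables the kernel hands to the helper on a firmware request.

```shell
#!/bin/sh
# Hypothetical helper: load_firmware SYSFS_ROOT FIRMWARE_DIR
# Assumes DEVPATH and FIRMWARE are set in the environment, as the
# kernel does when it invokes a hotplug helper for a firmware request.
load_firmware() {
	sysfs="$1"
	fwdir="$2"
	dev="$sysfs$DEVPATH"

	if [ -f "$fwdir/$FIRMWARE" ]; then
		echo 1 > "$dev/loading"              # tell the kernel a transfer starts
		cat "$fwdir/$FIRMWARE" > "$dev/data" # hand over the firmware image
		echo 0 > "$dev/loading"              # transfer complete
	else
		echo -1 > "$dev/loading"             # abort: nothing to load
		return 1
	fi
}
# Typical call from a helper script: load_firmware /sys /lib/firmware
```

Passing the sysfs root as an argument is only there so the function can be exercised against a scratch directory instead of a live /sys.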

WIFI & Bluetooth devices often use this firmware mechanism.

The wifi and bluetooth _hardware_ is always there though. Transceiver link toggle is more or less a media insertion/removal event, which is a slightly different hotplug mechanism.

Ogres, onions... layers.

And yes I agree, it looks a bit ugly seeing the kernel load a firmware file from /lib/firmware, searching for it in the root file system w/o knowing the state of it during boot ...

They already HAD the hotplug helper mechanism and initramfs! You could already CALL A LOADER and some of us had that working and DEPLOYED before they built a whole new mechanism for "the kernel reaches out and reads a file out of the userspace view of the filesystem from kernel space without a process context to do it in like the ELF loader has, don't ask me what this means for containers and namespaces..."

(Ok, they wanted to load firmware before PID 1 launched, but they were already breaking the drivers into separate probe/init sections so you could probe before interrupts were started and init after interrupts were started, and launching PID 1 is the first thing that happens after interrupts are enabled: we have a scheduler now, the idle task can fork off PID 1 and PID 0 can run pause() in a loop. Except between those two the kernel launches a zillion "kernel threads" including the tasklets and deferred device initialization and so on...)

It wasn't just awkward, it was unnecessary. (And it DOES NOT SOLVE the underlying licensing issue of "this firmware is not gpl, I am bundling it into a statically linked initramfs, is this 'mere aggregation'? Let's see what a judge has to say!")

Meanwhile Bradley is in court ACTIVELY ARGUING that there's no difference between GPLv2 and GPLv3 and that the complete lack of any copyright holders willing to sign on to his increasingly extreme enforcement views isn't a problem because GPLv2 is a contract despite the complete absence of things like "privity of contract"... No really:

https://blog.tidelift.com/will-the-new-judicial-ruling-in-the-vizio-lawsuit-strengthen-the-gpl

I got dragged into this recently to spend a day telling a camera "no, Bradley's full of it", and yes he flew in to sit at the other end of the table for some reason:

https://landley.net/notes-2024.html#24-06-2024

Sigh. There's a reason I do 0BSD these days:

https://landley.net/toybox/license.html

For WIFI and bluetooth I do not see a big issue here since I'd prevent putting such features on a critical chain by system design in any way since bringing them up and (re)connecting external devices is time consuming by nature. Nothing you shall need to wait for ...

Except that reconnection mostly happens in software. The _hardware_ you're talking to stays connected. It's a resource acquisition/allocation problem sure, but closer to partition re-scanning.

*shrug* The asynchronous notifications that something happened behind your back come in through similar mechanisms, but if that's ALL we were dealing with we wouldn't have needed most of this plumbing.

(Although that was ANOTHER fun failure of the old devfs: /dev/eth0 isn't common, thanks to Bill Joy somehow not really understanding unix in 1979. And of course renaming /dev/hda to /dev/sda is a big deal from a compatibility perspective, but the <strike>devfsd2</strike> systemd guys deciding that eth0: is now potato03x1: or some such? That's just fine, who cares about compatibility with that...)

Compiling in modules vs. loading them later from user space is a trade-off. The effect of putting stuff into modules is to keep the kernel small, which helps you in the "unpacking & loading kernel" phase before the kernel is actually started. A 1MB unpacked kernel makes a significant difference compared to a 5MB one.

If you can avoid ever loading the module, you may come out ahead. (Modulo why are you shipping it then, still needs storage.) Last I checked the actual module unloading was still a NOP half the time (the memory stays pinned) and marks your kernel "tainted" if you ever actually do it, which is not a vote of confidence in the codepath if you ask me.

But I had toybox insmod working years ago. Toybox _modprobe_ is still in pending because modprobe pulls fairly extensive shenanigans I am not personally familiar with and have to learn how to use before I can implement them, and they just seem like TERRIBLE IDEAS:

https://github.com/landley/toybox/issues/522

On the other hand, my experience is that there is a lot of overhead (CPU time and IO) loading modules from user space. So it really only makes sense if you have drivers to load at a point in time during startup where you have enough time and resources left.

The kernel boot process is already fairly heavily asynchronous, which is why your shell prompt gets buried with "link up" notifications spamming the console after it prints the $ and so on. That's why mkroot's init script does echo 3 > /proc/sys/kernel/printk before the exec handoff to whatever inherits PID 1 from the setup script:

https://github.com/landley/toybox/blob/0.8.11/mkroot/mkroot.sh#L133

Because if it's a shell, and we don't do that, you won't see the prompt under the noise.
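A minimal sketch of that console-quieting step, wrapped so it can be pointed at a scratch file. The function name quiet_console and its optional second argument are assumptions for illustration; the only real interface used is /proc/sys/kernel/printk, whose first field is the console loglevel.

```shell
#!/bin/sh
# Hypothetical wrapper: quiet_console LEVEL [FILE]
# Writing a single number to /proc/sys/kernel/printk sets the console
# loglevel: only messages with a priority numerically below LEVEL are
# printed to the console. The messages still land in the kernel log
# buffer, so dmesg keeps showing them.
quiet_console() {
	level="$1"
	file="${2:-/proc/sys/kernel/printk}"

	[ -w "$file" ] || return 1   # not root, or /proc not mounted
	echo "$level" > "$file"
}
# In an init script, before the exec handoff: quiet_console 3
```

With level 3 only emergency, alert and critical messages reach the console, which is usually enough to keep "link up" chatter off a shell prompt.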

I mean it more or less works, it's just... pointless manual maintenance of something the kernel does for you in a very small amount of code? (In devtmpfs, the /dev node being there means something. In a static /dev, it doesn't.)

I agree. There is a kind of dynamic device enumeration done by the kernel drivers anyway once loaded. Any data structures for devices are built up internally. Nothing you can save ...

I spent YEARS convincing the android guys to look at devtmpfs, initramfs, container plumbing... (Keep in mind Google bought Android Inc. in 2005 and shipped the first phone at the end of 2008, meaning their main development effort predated most of this plumbing and they had to retrofit it in much later.) No idea how much impact I had and how much they would have eventually done anyway, but the main guy I was having those conversations with WAS the android base OS maintainer, so... Most recent was probably:

http://lists.landley.net/pipermail/toybox-landley.net/2022-August/029139.html

You'd think the early boot stuff was fairly straightforward, but I keep winding up being the one to manually fix crap like:

https://lkml.iu.edu/hypermail/linux/kernel/1306.3/04204.html

And then YEARS LATER, it's me who has to:

https://lore.kernel.org/lkml/8244c75f-445e-b15b-9dbf-266e7ca666e2@xxxxxxxxxxx/

And then it had to be rewritten to remove my taint:

https://lkml.iu.edu/hypermail/linux/kernel/2311.1/01821.html
https://lkml.iu.edu/hypermail/linux/kernel/2311.2/02938.html

Let alone obvious polishing nonsense like:

https://lkml.iu.edu/hypermail/linux/kernel/1705.0/02640.html

(Which only went in because Andrew Morton picked it up despite Greg KH doing his usual stonewalling of literally anything from me. Oh well.)

Anyway, there's a reason I'm not really a kernel developer. When I try to engage with them myself, "crickets chirp" is pretty much the GOOD outcome...

https://lkml.iu.edu/hypermail/linux/kernel/1707.2/01797.html

Ahem. I'll stop now.

I'm not even sure how devtmpfs can be combined w/ the static devnodes you created in any kind of persistent partition.

You could mount your own /tmp and do mdev -s into it. That's what we used to do back around 2005:

https://lkml.iu.edu/hypermail/linux/kernel/0512.0/1326.html
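Roughly what an "mdev -s" style scan boils down to, as a hedged sketch rather than the actual mdev code: walk sysfs, find every "dev" file, read major:minor out of it, and create the matching node. The function name scan_sysfs is invented here, and it prints the mknod commands instead of running them so it can be tried as a normal user.

```shell
#!/bin/sh
# Hypothetical sketch: scan_sysfs [SYSFS_ROOT]
# Prints one "mknod" command per device that exports a "dev" file.
# A device listed under both class/block and block shows up twice; a
# real tool would just ignore the resulting "File exists" error.
scan_sysfs() {
	root="${1:-/sys}"

	for devfile in "$root"/class/*/*/dev "$root"/block/*/dev; do
		[ -f "$devfile" ] || continue   # skip unmatched glob patterns
		dir="${devfile%/dev}"
		name="${dir##*/}"               # node name is the directory name
		IFS=: read -r major minor < "$devfile"
		case "$devfile" in
			*/block/*) type=b ;;        # block device
			*) type=c ;;                # everything else is a char device
		esac
		echo "mknod /dev/$name $type $major $minor"
	done
}
# To actually populate a directory as root: scan_sysfs | sh
```

This ignores subdirectories, permissions and ownership, which is most of what a real device manager's config file is for.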

(Also, when devtmpfs first went in, if you modified a node (touch, chattr, etc) then it wouldn't delete it and your management tool would have to delete it via hotplug removal event handling. So you could PIN nodes, I was just never clear on why you'd want to. It probably still does that?)

And if you even can get the kernel accepting your partition to use as /dev,

Kernel doesn't care.

you need to have it writeable for the case of dynamics you might need (usb for instance) which does not really go well with a read only RFS ... You could ... overlay fs ... well no, I think this goes into a wrong direction -> too complicated ;)

If you just have a /tmp dir in initramfs with some starting nodes initialized via the cpio extractor, and then have something like mdev add things on top of that as they're hotplugged, initramfs is inherently writeable thus the /tmp dir would be.

There's a race condition where "I booted a device with USB already plugged into it before powerup, when is the hotplug event delivered and is it before the hotplug handler is registered", which I cared deeply about in 2005 and no longer remember the details of. I could try to dig them up out of my blog and the busybox/kernel mailing lists if you care?


To summarize from my point of view:

* It's worth talking a bit about the effect of udev and about alternatives

I am not a fan of udev, for reasons that are part technical and part "oh those assholes" rant path I'm trying to avoid going down.

* "mdev" is surely worth being named as a potential option besides "selective triggering" and "static setup and moving triggers back in time"

* I wouldn't regard mknod as a real alternative in today's systems

It still comes up from time to time, usually when initializing containers. (Because devtmpfs in containers does NOT give a proper container-local view of its namespace.)

Once upon a time, you could use the linux kernel's built in initramfs generation plumbing to create a cpio with arbitrary contents by providing simple text snippets to supplement their scanner, including a /dev/console entry created as a normal user (without running as root!).

But of COURSE the kernel developers removed the ability, and I patched it back in (attached), and then went "no, not fighting that fight"...
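For anyone who hasn't seen them, the text snippets in question are the list format that usr/gen_init_cpio consumes, one entry per line. A small sketch that emits a few device entries follows; the function name emit_dev_entries is made up, and the choice of nodes and modes is just an example. Because the node is only described in text and then written into the archive, nothing here needs root.

```shell
#!/bin/sh
# Hypothetical sketch: print gen_init_cpio list lines for a minimal /dev.
emit_dev_entries() {
	# dir <name> <mode> <uid> <gid>
	echo "dir /dev 0755 0 0"
	# nod <name> <mode> <uid> <gid> <dev_type> <maj> <min>
	echo "nod /dev/console 0600 0 0 c 5 1"
	echo "nod /dev/null 0666 0 0 c 1 3"
}
# Example use, assuming a built kernel tree:
#   emit_dev_entries >> list.txt
#   usr/gen_init_cpio list.txt > initramfs.cpio
```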

* In addition, I can imagine "module loading" vs. "compiling in drivers" is something worth mentioning

There's buckets of domain expertise there and I have like 1/3 of what I'd need to be confident there. (I know where to look it up, but have never considered it a good thing. Half the point of modules was to load/unload drivers for testing without reboots, and I just boot cycle a system under qemu or KVM when I can, and boot cycle a physical board when I can't because fiddling with modules really doesn't HELP my workflow. YMMV...)

The main other reason modules persist is out-of-tree drivers, usually not under GPL, which have been under systematic attack for well over a decade and the people still doing it have large teams writing shim code.

Most "let's use modules" decisions _since_ then boil down to either

1) "this is a generic PC hardware distro and I have no idea what hardware will be on there, and building every possible module into the kernel wastes a couple dozen megabytes of RAM on a system"

2) This mechanism exists, there must be a reason, therefore I should definitely use it because it's there.

(They built _mechanisms_ to prevent you from upgrading modules without upgrading the kernel they plug into. Note that the description of CONFIG_MODVERSIONS says that WITHOUT it you can't have even slight version skew. That's without MODULE_SIG and MODULE_SRCVERSION_ALL and so on.)

By the way, you can provide "module arguments" on the kernel command line, write to things like /sys/module/psmouse/parameters/rate after the driver's up...
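On the command line those arguments take the form module.param=value. A hedged sketch that picks them out of a command line string follows; module_args is a made-up name, and the parsing is deliberately simplistic (no handling of quoted values with spaces, no dash/underscore equivalence in module names).

```shell
#!/bin/sh
# Hypothetical sketch: module_args MODULE [CMDLINE]
# Prints each "param=value" given for MODULE, one per line. CMDLINE
# defaults to the running kernel's command line.
module_args() {
	mod="$1"
	cmdline="${2:-$(cat /proc/cmdline)}"

	for word in $cmdline; do    # unquoted on purpose: split on whitespace
		case "$word" in
			"$mod".*=*) echo "${word#"$mod".}" ;;
		esac
	done
}
# Example: module_args psmouse "root=/dev/sda1 psmouse.rate=60 quiet"
# The runtime equivalent for a writable parameter, as root, would be:
#   echo 60 > /sys/module/psmouse/parameters/rate
```

Whether a given parameter can be changed after the driver is up depends on the permissions the driver gave it, so the /sys file may be read-only or absent.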

* Once I've access to the wiki, I can try to put these ideas into an initial structure filled up w/ info we discussed in this thread

Marko

Good luck.

You know what we REALLY need a new version of? A rewrite of:

https://landley.net/kdocs/mirror/lki-single.html

With sections for each architecture. (And if you tried to write one, you'd hate Raspberry Pi as much as I do! Although https://forums.raspberrypi.com/viewtopic.php?t=357536 is extremely promising, and a far sight better than https://github.com/christinaa/rpi-open-firmware ever got to. Although I haven't really dug into the details of what's still proprietary black box spyware subtly bugging your board with "system management mode" hijacks, and what they actually managed to work around despite not having hardware documentation for broadcom chips...)

Rob
From: Rob Landley <rob@xxxxxxxxxxx>
Date: Fri, 06 Oct 2023 02:56:19 -0500
Subject: [PATCH] Add gen_initramfs.sh -O

Add a -O option to output the list instead of the archive. (You can
specify -o after -O to produce both.)

For 15 years gen_initramfs_list.sh produced a text output format that
other things consumed and modified and fed back to the kernel, then
the script changed to consume the list internally and produce the cpio
archive directly. (Why they didn't just change gen_init_cpio.c to traverse
directories itself if they were going to take away the ability to filter
the list is an open question. Maybe it could handle filenames with spaces
in them if they'd done that? And why "squash" in-band signalling instead of
the -1 I submitted, which doesn't conflict with existing users because
integers aren't valid usernames...)

Signed-off-by: Rob Landley <rob@xxxxxxxxxxx>
---

 usr/gen_initramfs.sh |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/usr/gen_initramfs.sh b/usr/gen_initramfs.sh
index 14b5782f961a..8f75988a5799 100755
--- a/usr/gen_initramfs.sh
+++ b/usr/gen_initramfs.sh
@@ -15,6 +15,7 @@ usage() {
 cat << EOF
 Usage:
 $0 [-o <file>] [-l <dep_list>] [-u <uid>] [-g <gid>] {-d | <cpio_source>} ...
+	-O <file>      Output annotated file list instead of archive
 	-o <file>      Create initramfs file named <file> by using gen_init_cpio
 	-l <dep_list>  Create dependency list named <dep_list>
 	-u <uid>       User ID to map to user ID 0 (root).
@@ -206,6 +207,15 @@ while [ $# -gt 0 ]; do
 			echo "deps_initramfs := \\" > $dep_list
 			shift
 			;;
+		"-O")	# Output annotated file list
+			unset output
+			trap - EXIT
+			[ "$1" = "-" ] &&
+				cpio_list="/dev/stdout" ||
+				cpio_list="$1"
+			shift
+			;;
+
 		"-o")	# generate cpio image named $1
 			output="$1"
 			shift
