On 1/12/25 04:11, Marko Hoyer wrote:
On 12.01.25 at 02:03, Rob Landley wrote:
On 1/11/25 12:57, Bird, Tim wrote:
Hey Rob, This is a great review of /dev, /sys and the different
ways that /dev gets populated.
Feel free to link stuff from wikis or some such. The newest of those
documents was written in 2007.
For a lot of embedded Linux devices, the only bus where
new items can show up dynamically is USB.
SDCARD readers connected via MMC are common in automotive head units as
well ...
But do they give an insertion/removal notification that can generate an
interrupt rather than needing to be polled? (Last couple of boards I
poked at didn't, but it was cheap hardware...)
When a driver DOESN'T automatically bind to them it gets a bit
complicated, and one of the things mdev can be configured to do is act
as a firmware loader! Which is just... Ahem, there are YEARS of poor
design decisions the kernel guys made, where they ignored a mechanism
they already had and implemented something more complicated. The
mechanism whereby the kernel opens a firmware file and reads it
directly out of the filesystem instead of calling a hotplug helper
was... I'm just going to gloss over that.
WIFI & Bluetooth devices often use this firmware mechanism.
The wifi and bluetooth _hardware_ is always there though. Transceiver
link toggle is more or less a media insertion/removal event, which is a
slightly different hotplug mechanism.
Ogres, onions... layers.
And yes I
agree, it looks a bit ** ugly** seeing the kernel loading a firmware
file from /lib/firmware searching it in the root file system w/o
knowing the state of it during boot ...
They already HAD the hotplug helper mechanism and initramfs! You could
already CALL A LOADER and some of us had that working and DEPLOYED
before they built a whole new mechanism for "the kernel reaches out and
reads a file out of the userspace view of the filesystem from kernel
space without a process context to do it in like the ELF loader has,
don't ask me what this means for containers and namespaces..."
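For anyone who hasn't seen the helper approach, here is roughly what a firmware hotplug helper boils down to. This is a minimal sketch of the sysfs "loading"/"data" handshake, not mdev's actual code. ACTION, SUBSYSTEM, DEVPATH and FIRMWARE are what the kernel puts in the helper's environment; the function name and the SYSFS/FWDIR overrides are made up for illustration so the logic can be tried against a scratch directory.

```shell
# Hypothetical firmware hotplug helper (sketch only).
# SYSFS and FWDIR are illustrative overrides, not kernel-provided variables.
load_firmware() {
    sysfs="${SYSFS:-/sys}"
    fwdir="${FWDIR:-/lib/firmware}"

    # Only react to firmware requests; ignore every other hotplug event.
    [ "$SUBSYSTEM" = firmware ] && [ "$ACTION" = add ] || return 0

    dev="$sysfs$DEVPATH"
    if [ -r "$fwdir/$FIRMWARE" ]; then
        echo 1 > "$dev/loading"                  # tell the kernel data is coming
        cat "$fwdir/$FIRMWARE" > "$dev/data"     # hand over the blob
        echo 0 > "$dev/loading"                  # done, driver may proceed
    else
        printf '%s\n' -1 > "$dev/loading"        # abort the request
        return 1
    fi
}
```

The point being that the whole thing is a dozen lines of userspace that runs in a normal process context, out of initramfs if need be.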
(Ok, they wanted to load firmware before PID 1 launched, but they were
already breaking the drivers into separate probe/init sections so you
could probe before interrupts were started and init after interrupts were started
and launching PID 1 is the first thing that happens after interrupts are
enabled (we have a scheduler now, the idle task can fork off PID 1 and
PID 0 can run pause() in a loop). Except between those two the kernel
launches a zillion "kernel threads" including the tasklets and deferred
device initialization and so on...)
It wasn't just awkward, it was unnecessary. (And it DOES NOT SOLVE the
underlying licensing issue of "this firmware is not gpl, I am bundling
it into a statically linked initramfs, is this 'mere aggregation', let's
see what a judge has to say!")
Meanwhile Bradley is in court ACTIVELY ARGUING that there's no
difference between GPLv2 and GPLv3 and that the complete lack of any
copyright holders willing to sign on to his increasingly extreme
enforcement views isn't a problem because GPLv2 is a contract despite
the complete absence of things like "privity of contract"... No really:
https://blog.tidelift.com/will-the-new-judicial-ruling-in-the-vizio-lawsuit-strengthen-the-gpl
I got dragged into this recently to spend a day telling a camera "no,
Bradley's full of it", and yes he flew in to sit at the other end of the
table for some reason:
https://landley.net/notes-2024.html#24-06-2024
Sigh. There's a reason I do 0BSD these days:
https://landley.net/toybox/license.html
For WIFI and bluetooth I do not
see a big issue here since I'd prevent putting such features on a
critical chain by system design in any way since bringing them up and
(re)connecting external devices is time consuming by nature. Nothing you
shall need to wait for ...
Except that reconnection mostly happens in software. The _hardware_
you're talking to stays connected. It's a resource
acquisition/allocation problem sure, but closer to partition re-scanning.
*shrug* The asynchronous notifications that something happened behind
your back come in through similar mechanisms, but if that's ALL we were
dealing with we wouldn't have needed most of this plumbing.
(Although that was ANOTHER fun failure of the old devfs: /dev/eth0 isn't
common, thanks to Bill Joy somehow not really understanding unix in
1979. And of course renaming /dev/hda to /dev/sda is a big deal from a
compatibility perspective, but the <strike>devfsd2</strike> systemd guys
deciding that eth0: is now potato03x1: or some such? That's just fine,
who cares about compatibility with that...)
Compiling drivers in vs. loading them later from user space is a
trade-off. The effect of putting stuff into modules is to keep the kernel
small, which helps you in the "unpacking & loading kernel" phase before
the kernel is actually started. A 1MB unpacked kernel makes a
significant difference compared to a 5MB one.
If you can avoid ever loading the module, you may come out ahead.
(Modulo why are you shipping it then, still needs storage.) Last I
checked the actual module unloading was still a NOP half the time (the
memory stays pinned) and marks your kernel "tainted" if you ever
actually do it, which is not a vote of confidence in the codepath if you
ask me.
But I had toybox insmod working years ago, the catch is that toybox
_modprobe_ is still in pending because modprobe pulls fairly extensive
shenanigans I am not personally familiar with and have to learn how to
use before I can implement them, and they just seem like TERRIBLE IDEAS:
https://github.com/landley/toybox/issues/522
On the other hand, my
experience is that there is a lot of overhead (CPU time and IO) loading
modules from user space. So it really only makes sense, if you have
drivers to load at a point in time during startup where you have enough
time and resources left.
The kernel boot process is already fairly heavily asynchronous, which is
why your shell prompt gets buried with "link up" notifications spamming
the console after it prints the $ and so on. That's why mkroot's init
script does echo 3 > /proc/sys/kernel/printk before the exec handoff to
whatever inherits PID 1 from the setup script:
https://github.com/landley/toybox/blob/0.8.11/mkroot/mkroot.sh#L133
Because if it's a shell, and we don't do that, you won't see the prompt
under the noise.
I mean it more or less works, it's just... pointless manual
maintenance of something the kernel does for you in a very small
amount of code? (In devtmpfs, the /dev node being there means
something. In a static /dev, it doesn't.)
I agree. There is a kind of dynamic device enumeration done by the kernel
drivers anyway once loaded. The data structures for devices are built up
internally. Nothing you can save ...
I spent YEARS convincing the android guys to look at devtmpfs,
initramfs, container plumbing... (Keep in mind Google bought Android
Inc. in 2005 and shipped the first phone at the end of 2008, meaning
their main development effort predated most of this plumbing and they
had to retrofit it in much later.) No idea how much impact I had and
how much they would have eventually done anyway, but the main guy I was
having those conversations with WAS the android base OS maintainer,
so... Most recent was probably:
http://lists.landley.net/pipermail/toybox-landley.net/2022-August/029139.html
You'd think the early boot stuff was fairly straightforward, but I keep
winding up being the one to manually fix crap like:
https://lkml.iu.edu/hypermail/linux/kernel/1306.3/04204.html
And then YEARS LATER, it's me who has to:
https://lore.kernel.org/lkml/8244c75f-445e-b15b-9dbf-266e7ca666e2@xxxxxxxxxxx/
And then it had to be rewritten to remove my taint:
https://lkml.iu.edu/hypermail/linux/kernel/2311.1/01821.html
https://lkml.iu.edu/hypermail/linux/kernel/2311.2/02938.html
Let alone obvious polishing nonsense like:
https://lkml.iu.edu/hypermail/linux/kernel/1705.0/02640.html
(Which only went in because Andrew Morton picked it up despite Greg KH
doing his usual stonewalling of literally anything from me. Oh well.)
Anyway, there's a reason I'm not really a kernel developer. When I try
to engage with them myself, "crickets chirp" is pretty much the GOOD
outcome...
https://lkml.iu.edu/hypermail/linux/kernel/1707.2/01797.html
Ahem. I'll stop now.
I'm not even sure how devtmpfs can be combined w/ your static devnodes
you created in any kind of persistent partition.
You could mount your own /tmp and do mdev -s into it. That's what we
used to do back around 2005:
https://lkml.iu.edu/hypermail/linux/kernel/0512.0/1326.html
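For reference, the "scan /sys and make nodes" half of that is small. Here is a rough sketch (not busybox's actual mdev -s code) that assumes a current kernel, where /sys/dev/block and /sys/dev/char hold one MAJOR:MINOR entry per device and each entry's uevent file carries a DEVNAME= line. It only prints the mknod commands rather than running them; the function name and its two optional arguments are made up for the example.

```shell
# Print one mknod command per device found in a sysfs tree (sketch only).
# $1 = sysfs root (default /sys), $2 = directory for the nodes (default /dev)
print_mknod_cmds() {
    sysfs="${1:-/sys}"
    target="${2:-/dev}"
    for kind in block char; do
        case "$kind" in block) type=b ;; char) type=c ;; esac
        for entry in "$sysfs/dev/$kind"/*; do
            # Skip unmatched globs and entries without a readable uevent file.
            [ -r "$entry/uevent" ] || continue
            name=$(sed -n 's/^DEVNAME=//p' "$entry/uevent")
            [ -n "$name" ] || continue
            # The entry name itself is MAJOR:MINOR.
            majmin="${entry##*/}"
            # Names containing a slash would need their parent dir created first.
            echo "mknod $target/$name $type ${majmin%%:*} ${majmin#*:}"
        done
    done
}
```

Run as print_mknod_cmds /sys /tmp it shows what would land in the /tmp directory discussed above, without needing root.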
(Also, when devtmpfs first went in, if you modified a node (touch,
chattr, etc) then it wouldn't delete it and your management tool would
have to delete it via hotplug removal event handling. So you could PIN
nodes, I was just never clear on why you'd want to. It probably still
does that?)
And if you even can get
the kernel accepting your partition to use as /dev,
Kernel doesn't care.
you need to have it
writeable for the case of dynamics you might need (usb for instance)
which does not really go well with a read only RFS ... You could ...
overlay fs ... well no, I think this goes into a wrong direction -> too
complicated ;)
If you just have a /tmp dir in initramfs with some starting nodes
initialized via the cpio extractor, and then have something like mdev
add things on top of that as they're hotplugged, initramfs is inherently
writeable thus the /tmp dir would be.
There's a race condition where "I booted a device with USB already
plugged into it before powerup, when is the hotplug event delivered and
is it before the hotplug handler is registered", which I cared deeply
about in 2005 and no longer remember the details of. I could try to dig
them up out of my blog and the busybox/kernel mailing lists if you care?
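One common way around the coldplug problem on current kernels is to stop caring when the original event fired: once the handler is registered, write "add" to each device's uevent file in sysfs and the kernel sends the event again. A rough sketch under that assumption follows; the function name and the optional sysfs-root argument are made up for illustration.

```shell
# Ask the kernel to re-send "add" events for every device (sketch only).
# $1 = sysfs root (default /sys); overridable so it can be tried on a scratch tree.
replay_add_events() {
    sysfs="${1:-/sys}"
    find "$sysfs/devices" -type f -name uevent 2>/dev/null |
    while read -r uevent; do
        # Writing "add" makes the kernel emit a fresh add event for that device.
        if [ -w "$uevent" ]; then
            echo add > "$uevent"
        fi
    done
}
```

Devices that were plugged in before powerup then look the same to the handler as ones hotplugged later.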
To summarize from my point of view:
* It's worth talking a bit about the effect of udev and about alternatives
I am not a fan of udev, for reasons that are part technical and part "oh
those assholes" rant path I'm trying to avoid going down.
* "mdev" is surely worth being named as a potential option besides
"selective triggering" and "static setup and moving triggers back in time"
* I wouldn't regard mknod as a real alternative in today's systems
It still comes up from time to time, usually when initializing
containers. (Because devtmpfs in containers does NOT give a proper
container-local view of its namespace.)
Once upon a time, you could use the linux kernel's built in initramfs
generation plumbing to create a cpio with arbitrary contents by
providing simple text snippets to supplement their scanner, including a
/dev/console entry created as a normal user (without running as root!).
But of COURSE the kernel developers removed the ability, and I patched
it back in (attached), and then went "no, not fighting that fight"...
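For anyone who hasn't seen it, the text format gen_init_cpio consumes looks like the output of the sketch below: one "dir", "nod", "file" or "slink" entry per line with mode, uid and gid spelled out, which is why no root access is needed to get a /dev/console node into the archive. The function name and the init path argument are made up for the example.

```shell
# Emit a minimal gen_init_cpio list (sketch only).
# $1 = path on the build host of the file to install as /init (hypothetical default)
emit_cpio_list() {
    init_src="${1:-./init}"
    # Fields: dir  <name> <mode> <uid> <gid>
    #         nod  <name> <mode> <uid> <gid> <type> <major> <minor>
    #         file <name> <source> <mode> <uid> <gid>
    cat << EOF
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
dir /proc 0755 0 0
file /init $init_src 0755 0 0
EOF
}
```

Something like emit_cpio_list ./myinit > list.txt, then either point CONFIG_INITRAMFS_SOURCE at list.txt or run usr/gen_init_cpio list.txt > initramfs.cpio by hand.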
* In addition, I can imagine "module loading" vs. "compiling in
drivers" is something worth mentioning
There's buckets of domain expertise there and I have like 1/3 of what
I'd need to be confident there. (I know where to look it up, but have
never considered it a good thing. Half the point of modules was to
load/unload drivers for testing without reboots, and I just boot cycle a
system under qemu or KVM when I can, and boot cycle a physical board
when I can't because fiddling with modules really doesn't HELP my
workflow. YMMV...)
The main other reason modules persist is out-of-tree drivers, usually
not under GPL, which have been under systematic attack for well over a
decade and the people still doing it have large teams writing shim code.
Most "let's use modules" decisions _since_ then boil down to either
1) "this is a generic PC hardware distro and I have no idea what
hardware will be on there, and building every possible module into the
kernel wastes a couple dozen megabytes of RAM on a system"
2) This mechanism exists, there must be a reason, therefore I should
definitely use it because it's there.
(They built _mechanisms_ to prevent you from upgrading modules without
upgrading the kernel they plug into. Note that the description of
CONFIG_MODVERSIONS says that WITHOUT it you can't have even slight
version skew. That's without MODULE_SIG and MODULE_SRCVERSION_ALL and so
on.)
By the way, you can provide "module arguments" on the kernel command
line, write to things like /sys/module/psmouse/parameters/rate after the
driver's up...
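A quick sketch of both halves, using the psmouse example above. On the kernel command line the form is module.parameter=value (so psmouse.rate=60), and at runtime each parameter is a file under /sys/module/NAME/parameters. The helper names and the optional sysfs-root argument below are made up for illustration, and writes only work for parameters the driver exports as writable.

```shell
# List a module's parameters as name=value lines (sketch only).
# $1 = module name, $2 = sysfs root (default /sys)
show_module_params() {
    module="$1"
    sysfs="${2:-/sys}"
    for param in "$sysfs/module/$module/parameters"/*; do
        [ -r "$param" ] || continue     # also skips an unmatched glob
        echo "${param##*/}=$(cat "$param")"
    done
}

# Change one parameter after the driver is up.
# $1 = module, $2 = parameter, $3 = new value, $4 = sysfs root (default /sys)
set_module_param() {
    module="$1"; name="$2"; value="$3"
    sysfs="${4:-/sys}"
    echo "$value" > "$sysfs/module/$module/parameters/$name"
}
```

So show_module_params psmouse tells you what is tunable, and set_module_param psmouse rate 60 is the same thing as the echo into /sys spelled out.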
* Once I've access to the wiki, I can try to put these ideas into an
initial structure filled up w/ info we discussed in this thread
Marko
Good luck.
You know what we REALLY need a new version of? A rewrite of:
https://landley.net/kdocs/mirror/lki-single.html
With sections for each architecture. (And if you tried to write one,
you'd hate Raspberry Pi as much as I do! Although
https://forums.raspberrypi.com/viewtopic.php?t=357536 is extremely
promising, and a far sight better than
https://github.com/christinaa/rpi-open-firmware ever got to. Although I
haven't really dug into the details of what's still proprietary black
box spyware subtly bugging your board with "system management mode"
hijacks, and what they actually managed to work around despite not
having hardware documentation for broadcom chips...)
Rob
From: Rob Landley <rob@xxxxxxxxxxx>
Date: Fri, 06 Oct 2023 02:56:19 -0500
Subject: [PATCH] Add gen_initramfs.sh -O
Add a -O option to output the list instead of the archive. (You can
specify -o after -O to produce both.)
For 15 years gen_initramfs_list.sh produced a text output format that
other things consumed and modified and fed back to the kernel, then
the script changed to consume the list internally and produce the cpio
archive directly. (Why they didn't just change gen_init_cpio.c to traverse
directories itself if they were going to take away the ability to filter
the list is an open question. Maybe it could handle filenames with spaces
in them if they'd done that? And why "squash" in-band signalling instead of
the -1 I submitted, which doesn't conflict with existing users because
integers aren't valid usernames...)
Signed-off-by: Rob Landley <rob@xxxxxxxxxxx>
---
usr/gen_initramfs.sh | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/usr/gen_initramfs.sh b/usr/gen_initramfs.sh
index 14b5782f961a..8f75988a5799 100755
--- a/usr/gen_initramfs.sh
+++ b/usr/gen_initramfs.sh
@@ -15,6 +15,7 @@ usage() {
cat << EOF
Usage:
$0 [-o <file>] [-l <dep_list>] [-u <uid>] [-g <gid>] {-d | <cpio_source>} ...
+ -O <file> Output annotated file list instead of archive
-o <file> Create initramfs file named <file> by using gen_init_cpio
-l <dep_list> Create dependency list named <dep_list>
-u <uid> User ID to map to user ID 0 (root).
@@ -206,6 +207,15 @@ while [ $# -gt 0 ]; do
echo "deps_initramfs := \\" > $dep_list
shift
;;
+ "-O") # Output annotated file list
+ unset output
+ trap - EXIT
+ [ "$1" = "-" ] &&
+ cpio_list="/dev/stdout" ||
+ cpio_list="$1"
+ shift
+ ;;
+
"-o") # generate cpio image named $1
output="$1"
shift