On 10/18/22 21:53, Prarit Bhargava wrote:
> Quoting from the original thread,
>
>>
>> Motivation for this patch is to fix an issue observed on larger machines with
>> many CPUs where it can take a significant amount of time during boot to run
>> systemd-udev-trigger.service. An x86-64 system can have already intel_pstate
>> active but as its CPUs can match also acpi_cpufreq and pcc_cpufreq, udev will
>> attempt to load these modules too. The operation will eventually fail in the
>> init function of a respective module where it gets recognized that another
>> cpufreq driver is already loaded and -EEXIST is returned. However, one uevent
>> is triggered for each CPU and so multiple loads of these modules will be
>> present. The current code then processes all such loads individually and
>> serializes them with the barrier in add_unformed_module().
>>
>
> The way to solve this is not in the module loading code, but in the udev
> code by adding a new event or in the userspace which handles the loading
> events.
>
> Option 1)
>
> Write/modify a udev rule to to use a flock userspace file lock to
> prevent repeated loading.  The problem with this is that it is still
> racy and still consumes CPU time repeated load the ELF header and,
> depending on the system (ie a large number of cpus) would still cause a
> boot delay.  This would be better than what we have and is worth looking
> at as a simple solution.  I'd like to see boot times with this change,
> and I'll try to come up with a measurement on a large CPU system.

It is not immediately clear to me how this can be done as a udev rule. You
mention that you'll try to test this on a large CPU system. Does it mean that
you have a prototype implemented already? If yes, could you please share it?

My reading is that one would need to update the "MODALIAS" rule in
80-drivers.rules [1] to do this locking. However, that just collects
'kmod load' (builtin) for udev to execute after all rules are processed.
Would it then be required to synchronize udev workers to prevent repeated
loading?

> Option 2)
>
> Create a new udev action, "add_once" to indicate to userspace that the
> module only needs to be loaded one time, and to ignore further load
> requests.  This is a bit tricky as both kernel space and userspace would
> have be modified.  The udev rule would end up looking very similar to
> what we now.
>
> The benefit of option 2 is that driver writers themselves can choose
> which drivers should issue "add_once" instead of add.  Drivers that are
> known to run on all devices at once would call "add_once" to only issue
> a single load.

On the device event side, I wonder more whether it would be possible to avoid
tying cpufreq and edac modules to individual CPU devices. Maybe their
loading could be attached to some platform device, even if it means introducing
an auxiliary device for this purpose? I need to look a bit more into this idea.

[1] https://github.com/systemd/systemd/blob/4856f63846fc794711e1b8ec970e4c56494cd320/rules.d/80-drivers.rules

Thanks,
Petr
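
For illustration only, here is a rough userspace sketch of the kind of
flock-based "load only once" serialization that Option 1 describes. It is not
a prototype of a udev rule and says nothing about how this would hook into the
kmod builtin, which is the open question above. The helper name load_once(),
the per-module lock file layout, and the loader callback are all assumptions
made up for the example.

```python
import fcntl
import os
import tempfile


def load_once(module, loader, lock_dir):
    """Run loader(module) at most once per lock_dir, serialized by flock.

    Hypothetical helper: returns True if this call ran the loader, False if
    an earlier call had already recorded an attempt for the same module.
    """
    path = os.path.join(lock_dir, module + ".lock")
    # O_CREAT without O_TRUNC, so an existing marker survives re-opening.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        if os.fstat(fd).st_size > 0:
            return False  # an earlier request already attempted this module
        try:
            loader(module)
        finally:
            # Record the attempt even if the load failed (for example with
            # -EEXIST from the module's init), so later requests do not retry.
            os.write(fd, b"attempted\n")
        return True
    finally:
        os.close(fd)  # closing the descriptor also releases the flock


if __name__ == "__main__":
    calls = []
    with tempfile.TemporaryDirectory() as lock_dir:
        for _ in range(4):  # stand-in for one uevent per CPU
            load_once("acpi_cpufreq", calls.append, lock_dir)
    print(len(calls))  # prints 1: only the first request reaches the loader
```

Here the lock file doubles as the "already attempted" marker, which keeps the
check and the update under the same lock. In a real setup the lock directory
would have to live somewhere that is cleared on every boot, and each request
would still pay for spawning the helper, so this only demonstrates the
serialization idea, not its boot-time cost.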