On Thu, May 25, 2023 at 11:08 AM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
>
> Certainly on the track where I wish we could go. Now this goes tested.
> On 255 cores:
>
> Before:
>
> vagrant@kmod ~ $ sudo systemd-analyze
> Startup finished in 41.653s (kernel) + 44.305s (userspace) = 1min 25.958s
> graphical.target reached after 44.178s in userspace.
>
> root@kmod ~ # grep "Virtual mem wasted bytes" /sys/kernel/debug/modules/stats
>  Virtual mem wasted bytes       1949006968
>
>
> ; 1949006968/1024/1024/1024
>         ~1.81515418738126754761
>
> So ~1.8 GiB... of vmalloc space wasted during boot.
>
> After:
>
> systemd-analyze
> Startup finished in 24.438s (kernel) + 41.278s (userspace) = 1min 5.717s
> graphical.target reached after 41.154s in userspace.
>
> root@kmod ~ # grep "Virtual mem wasted bytes" /sys/kernel/debug/modules/stats
>  Virtual mem wasted bytes       354413398
>
> So still 337.99 MiB of vmalloc space wasted during boot due to
> duplicates.

Ok. I think this will count as 'good enough for mitigation purposes'

> The reason is the exclusive_deny_write_access() must be
> kept during the life of the module otherwise as soon as it is done
> others can still race to load

Yes. The exclusion only applies while the file is actively being read.

> So with two other hunks added (2nd and 4th), this now matches parity with
> my patch, not suggesting this is right,

Yeah, we can't do that, because user space may quite validly want to
write the file afterwards.

Or, in fact, unload the module and re-load it.

So the "exclusion" really needs to be purely temporary.

That said, I considered moving the exclusion to module/main.c itself,
rather than the reading part. That would get rid of the hacky "id ==
READING_MODULE", and put the exclusion in the place that actually
wants it.

And that would allow us to at least extend that temporary exclusion a
bit - we could keep it until the module has actually been loaded and
inited.

So it would probably improve on those numbers a bit more, but you'd
still have the fundamental race where *serial* duplicates end up
always wasting CPU effort and temporary vmalloc space.

               Linus
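
For readers following the thread, here is a minimal userspace sketch of why a
purely temporary exclusion stops *concurrent* duplicate loads but not *serial*
ones. It is a toy model, not the kernel code or the patch under discussion:
every toy_* name is hypothetical, and the "0 -> -1 compare-and-exchange on a
write count" is an assumption modeled on the usual i_writecount convention
(positive means writers, negative means writes are denied).

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an inode's write count:
 * 0 = idle, > 0 = open writers, -1 = exclusively denied. */
struct toy_inode {
	atomic_int writecount;
};

static void toy_inode_init(struct toy_inode *inode)
{
	atomic_init(&inode->writecount, 0);
}

/* Assumed semantics: succeed only when nobody is writing AND nobody else
 * already holds the exclusion, i.e. the count goes from exactly 0 to -1. */
static int toy_exclusive_deny_write_access(struct toy_inode *inode)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&inode->writecount, &expected, -1))
		return 0;
	return -ETXTBSY;
}

/* Drop the exclusion again; only valid while it is held. */
static void toy_allow_write_access(struct toy_inode *inode)
{
	int old = atomic_exchange(&inode->writecount, 0);

	assert(old == -1);
	(void)old;
}

/* One "module load": the exclusion covers only the read itself.
 * *reads counts how often the file contents were actually read,
 * standing in for the CPU time and temporary vmalloc space spent. */
static int toy_load(struct toy_inode *inode, int *reads)
{
	int err = toy_exclusive_deny_write_access(inode);

	if (err)
		return err;	/* a concurrent loader holds the exclusion */
	(*reads)++;		/* read the file */
	toy_allow_write_access(inode);
	return 0;
}
```

In this model a second toy_load() issued while the first caller still holds the
exclusion fails with -ETXTBSY and does no work, whereas two loads issued one
after the other both succeed and both read the file. Holding the exclusion
until the module is loaded and inited would only widen the window, not close it.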