Re: [RFC 00/12] module: avoid userspace pressure on unwanted allocations

David Hildenbrand <david@xxxxxxxxxx> · Tue, 28 Mar 2023 23:02:49 +0200

On 28.03.23 08:16, Luis Chamberlain wrote:
On Tue, Mar 28, 2023 at 05:44:40AM +0200, David Hildenbrand wrote:
... do you have an updated patch/branch that includes the feedback from
Linus so I can give it a churn tomorrow?

Yeah sure:

https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20230327-module-alloc-opts

I gave that one a go and get for system bootup:

#1:

13.761s tuned.service
12.261s chrony-wait.service
 7.386s NetworkManager-wait-online.service
 5.227s systemd-udev-settle.service
 2.893s initrd-switch-root.service
 2.148s polkit.service
 2.137s smartd.service
 1.893s dracut-initqueue.service
 1.290s NetworkManager.service
 1.032s cups.service

#2

13.881s tuned.service
 9.255s chrony-wait.service
 7.404s NetworkManager-wait-online.service
 5.826s systemd-udev-settle.service
 2.859s initrd-switch-root.service
 2.847s smartd.service
 2.172s polkit.service
 1.884s dracut-initqueue.service
 1.371s NetworkManager.service
 1.119s ModemManager.service

So we're a bit faster (0.2 -- 0.7s) than the original version without 
the rcu patch (~6s).

The commit log needs updateing to reflect the results I just collected:

With the alloc patch ("module: avoid allocation if module is already
present and ready") I see 145 MiB in memory difference in comparison
to its last patch, "module: extract patient module check into helper".
So I think that's a clear keeper and should help large CPU count boots.

The patch "module: add concurrency limiter" which puts the concurency
delimiter on the kread only saves about 2 MiB with 100 stress-ng ops,
which seems to be what I needed to reproduce your 400 CPU count original
issue.

The program used to reproduce is stress-ng with the new module option:

echo 0 > /proc/sys/vm/oom_dump_tasks
./stress-ng --module 100 --module-name xfs

Above command fills for me with nfs (but also ext4) the kernel log with:

...
[  883.036035] nfs: Unknown symbol xdr_reserve_space (err -2)
[  883.042221] nfs: Unknown symbol rpc_init_wait_queue (err -2)
[  883.048549] nfs: Unknown symbol put_rpccred (err -2)
[  883.054104] nfs: Unknown symbol __fscache_invalidate (err -2)
[  883.060540] nfs: Unknown symbol __fscache_use_cookie (err -2)
[  883.066969] nfs: Unknown symbol rpc_clnt_xprt_switch_has_addr (err -2)
[  883.074264] nfs: Unknown symbol __fscache_begin_write_operation (err -2)
[  883.081743] nfs: Unknown symbol nlmclnt_init (err -2)
[  883.087396] nfs: Unknown symbol nlmclnt_done (err -2)
[  883.093074] nfs: Unknown symbol nfs_debug (err -2)
[  883.098429] nfs: Unknown symbol rpc_wait_for_completion_task (err -2)
[  883.105640] nfs: Unknown symbol __fscache_acquire_cookie (err -2)
[  883.163764] nfs: Unknown symbol rpc_put_task (err -2)
[  883.169461] nfs: Unknown symbol __fscache_acquire_volume (err -2)
[  883.176297] nfs: Unknown symbol rpc_proc_register (err -2)
[  883.182430] nfs: Unknown symbol rpc_shutdown_client (err -2)
[  883.188765] nfs: Unknown symbol rpc_clnt_show_stats (err -2)
[  883.195097] nfs: Unknown symbol __fscache_begin_read_operation (err -2)
...

I do *not* get these errors on manual morprobe/rmmod. BUG in concurrent 
handling or just side-effect of the concurrent loading?

To see how much max memory I use, I just use:

free -k -s 1 -c 40 | grep Mem | awk '{print $3}' > foo.log

Run the test in another window, CTRL-C the test when above
finishes after 40 seconds and then:

sort -n -r foo.log  | head -1

[root@lenovo-sr950-01 fs]# sort -n -r foo.log  | head -1
14254024
[root@lenovo-sr950-01 fs]# sort -n -r foo.log  | tail -1
12862528

So 1391496 (KiB I assume, so 1.3 GiB !?) difference compared to before 
the test (I first start capturing and then run stress-ng).

If you have xfs loaded already you probably wanna pick module just as big
that you don't have loaded. You must have dependencies loaded already as
it doesn't call modprobe, it just finit_module's the module.

My setup already has xfs in use. nfs and ext4 are a bit smaller, but 
still big.

--
Thanks,

David / dhildenb