On Fri, Mar 3, 2023 at 12:39 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
>
> as an example here is a one-liner to show crappers which do 0-sized ops:
> bpftrace -e 'kprobe:memset,kprobe:memcpy /arg2 == 0/ { @[probe,
> kstack(2)] = count(); }'

Looking not at 0-sized ops, but crazy small memset() ops, at least
some of these seem to have grown from the bitmap "optimizations".

In particular, 'cpumask_clear()' should just zero the cpumask, and on
the config I use, I have

    CONFIG_NR_CPUS=64

so it should literally just be a single "store zero to cpumask word".
And that's what it used to be.

But then we had commit aa47a7c215e7 ("lib/cpumask: deprecate
nr_cpumask_bits") and suddenly 'nr_cpumask_bits' isn't a simple
constant any more for the "small mask that fits on stack" case, and
instead you end up with code like

        movl    nr_cpu_ids(%rip), %edx
        addq    $63, %rdx
        shrq    $3, %rdx
        andl    $-8, %edx
        ..
        callq   memset@PLT

that does a 8-byte memset because I have 32 cores and 64 threads.

Now, at least some distro kernels seem to be built with CONFIG_MAXSMP,
so CONFIG_NR_CPUS is something insane (namely 8192), and then it is
indeed better to calculate some minimum size instead of doing a 1kB
memset().

But yes, it does look like we've had several regressions in various
areas, where we do insane "memset()" calls with variable size that
should never have been that. Both with kzalloc() and with cpumasks.

There are probably lots of other cases, I just happened to notice the
cpumask one when looking at "why is there a memset call in a function
that shouldn't have any at all".

I don't think this is a "rep stos" problem in general. Obviously it
might be in some cases, but I do think that the deeper problem is that
those small memset() calls generally should not have existed at all,
and were actual bugs.

That cpumask_clear() should never have been a function call in the
first place for the "clear one or a couple of words" case.
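
To make the size arithmetic above concrete, here is a rough userspace
sketch in C. It is not the kernel's code: every sketch_* name and
SKETCH_NR_CPUS is hypothetical, and the byte-count helper only mirrors
the add/shift/mask sequence in the quoted assembly, assuming 64-bit
mask words. It contrasts a clear whose size is a compile-time constant
with one whose size depends on a runtime value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_NR_CPUS=64. */
#define SKETCH_NR_CPUS 64

/* A mask small enough to live on the stack: one 64-bit word here. */
struct sketch_cpumask {
	uint64_t bits[(SKETCH_NR_CPUS + 63) / 64];
};

/*
 * Mirrors the quoted asm: add 63, shift right by 3, mask with -8.
 * That rounds a bit count up to whole 64-bit words, expressed in bytes.
 */
static size_t sketch_clear_bytes(unsigned int nbits)
{
	return (((size_t)nbits + 63) >> 3) & ~(size_t)7;
}

/*
 * Size is a compile-time constant, so an optimizing compiler can
 * typically replace the memset() with a plain store of zero.
 */
static inline void sketch_clear_fixed(struct sketch_cpumask *m)
{
	memset(m, 0, sizeof(*m));
}

/*
 * Size depends on a runtime value (standing in for nr_cpu_ids), so the
 * compiler generally has to emit a real memset() call, even when the
 * result is only 8 bytes.
 */
static inline void sketch_clear_variable(struct sketch_cpumask *m,
					 unsigned int nr_ids)
{
	assert(nr_ids <= SKETCH_NR_CPUS);	/* never write past the mask */
	memset(m, 0, sketch_clear_bytes(nr_ids));
}
```

With these assumptions, sketch_clear_bytes(64) comes out to 8 bytes and
sketch_clear_bytes(8192) to 1024 bytes, matching the 8-byte and 1kB
figures mentioned above.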

The kernel very seldom acts on byte data, so a memset() or memcpy()
that looks like individual bytes is some very odd stuff. We have lots
of often fairly small structures with fixed sizes, though. And maybe
there are other cases like that bogus "let's try to optimize the
bitmap range"...

Of course, then we *do* have distros that do actively detrimental
things. You have Ubuntu with that CONFIG_INIT_ON_ALLOC_DEFAULT_ON, and
if I didn't just rebuild my kernel with sane config options, I'd have
Fedora with CONFIG_MAXSMP that is meant to be a "check worst case"
option, not something you *use*.

Oh well.

                Linus