[...]
> In particular, 'cpumask_clear()' should just zero the cpumask, and on
> the config I use, I have
>
>     CONFIG_NR_CPUS=64
>
> so it should literally just be a single "store zero to cpumask word".
> And that's what it used to be.
>
> But then we had commit aa47a7c215e7 ("lib/cpumask: deprecate
> nr_cpumask_bits") and suddenly 'nr_cpumask_bits' isn't a simple
> constant any more for the "small mask that fits on stack" case, and
> instead you end up with code like
>
>         movl    nr_cpu_ids(%rip), %edx
>         addq    $63, %rdx
>         shrq    $3, %rdx
>         andl    $-8, %edx
>         ..
>         callq   memset@PLT
>
> that does a 8-byte memset because I have 32 cores and 64 threads.

Did you enable CONFIG_FORCE_NR_CPUS? If you pick it, the kernel will
bind nr_cpu_ids to NR_CPUS at compile time, and the memset() call
should disappear.

Depending on your compiler you might want to apply this patch as well:
https://lore.kernel.org/lkml/20221027043810.350460-2-yury.norov@xxxxxxxxx/

> Now, at least some distro kernels seem to be built with CONFIG_MAXSMP,
> so CONFIG_NR_CPUS is something insane (namely 8192), and then it is
> indeed better to calculate some minimum size instead of doing a 1kB
> memset().

Ubuntu too. That was one of the reasons for the patch.

Thanks,
Yury
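
For reference, the size arithmetic in the quoted instruction sequence (add 63, shift right by 3, mask with -8) can be sketched in plain userspace C. This is only an illustration, not kernel code: mask_bytes() and check_sizes() are hypothetical names, and the sketch assumes 64-bit longs as on x86-64. It rounds a bit count up to whole 8-byte words and returns the length in bytes, which reproduces the two sizes mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): mirrors the quoted
 * sequence "addq $63; shrq $3; andl $-8", i.e. round nbits up to a
 * multiple of 64 bits and convert the result to bytes.
 */
static size_t mask_bytes(unsigned int nbits)
{
	return (((size_t)nbits + 63) >> 3) & ~(size_t)7;
}

/* Self-check against the two sizes mentioned in the thread. */
static void check_sizes(void)
{
	assert(mask_bytes(64) == 8);      /* 64 CPU ids  -> 8-byte memset */
	assert(mask_bytes(8192) == 1024); /* NR_CPUS=8192 -> 1kB memset   */
}
```

Calling check_sizes() from a small main() is enough to exercise it. When the bit count is a compile-time constant of 64, the same expression folds to 8 and the compiler is free to emit a single store, which is the behaviour the quoted text expects.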