On 10/26/22 11:48, Catalin Marinas wrote:
>> > diff --git a/lib/kobject.c b/lib/kobject.c
>> > index a0b2dbfcfa23..2c4acb36925d 100644
>> > --- a/lib/kobject.c
>> > +++ b/lib/kobject.c
>> > @@ -144,7 +144,7 @@ char *kobject_get_path(struct kobject *kobj, gfp_t gfp_mask)
>> >  	len = get_kobj_path_length(kobj);
>> >  	if (len == 0)
>> >  		return NULL;
>> > -	path = kzalloc(len, gfp_mask);
>> > +	path = kzalloc(len, gfp_mask | __GFP_PACKED);
>>
>> This might not be small, and it's going to be very very short-lived
>> (within a single function call), why does it need to be allocated this
>> way?
>
> Regarding short-lived objects, you are right, they won't affect
> slabinfo. My ftrace-fu is not great, I only looked at the allocation
> hits and they keep adding up without counting how many are
> freed. So maybe we need tracing free() as well but not always easy to
> match against the allocation point and infer how many live objects there
> are.

BTW, since 6.1-rc1 we have a new way with slub_debug to determine how
much memory is wasted, thanks to commit 6edf2576a6cc ("mm/slub: enable
debugging memory wasting of kmalloc") by Feng Tang.

You need to boot the kernel with parameter such as:

slub_debug=U,kmalloc-64,kmalloc-128,kmalloc-192,kmalloc-256

(or just slub_debug=U,kmalloc-* for all sizes, but I guess you are
interested mainly in those that are affected by DMA alignment)

Note it does have some alloc/free CPU overhead and memory overhead, so
not intended for normal production.

Then you can check e.g.

cat /sys/kernel/debug/slab/kmalloc-128/alloc_traces | head -n 50

     77 set_kthread_struct+0x60/0x100 waste=1232/16 age=19492/31067/32465 pid=2 cpus=0-3
        __kmem_cache_alloc_node+0x102/0x340
        kmalloc_trace+0x26/0xa0
        set_kthread_struct+0x60/0x100
        copy_process+0x1903/0x2ee0
        kernel_clone+0xf4/0x4f0
        kernel_thread+0xae/0xe0
        kthreadd+0x491/0x500
        ret_from_fork+0x22/0x30

which tells you there are currently 77 live allocations with this exact
stack trace.
The new information in 6.1 is the "waste=1232/16" which means these allocations waste 16 bytes each due to rounding up to the kmalloc cache size, or 1232 bytes in total (16*77). This should help finding the prominent sources of waste.
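
If you want to rank the call sites instead of eyeballing the file, a
small userspace script along these lines might do. This is only a rough
sketch: it assumes the summary line of each entry looks like the one
quoted above (count, call site, then waste=total/per-object), and the
names parse_summary() and rank_by_waste() are made up for illustration,
they are not part of any kernel tool. Lines that do not carry a waste=
field, including the stack frames, are simply skipped.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Assumed layout of an alloc_traces summary line, taken from the sample above:
#   77 set_kthread_struct+0x60/0x100 waste=1232/16 age=... pid=2 cpus=0-3
SUMMARY_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+waste=(\d+)/(\d+)\b")


class TraceSummary(NamedTuple):
    count: int          # live allocations sharing this stack trace
    site: str           # allocation site as printed (symbol+offset/size)
    waste_total: int    # bytes wasted by all of those allocations together
    waste_per_obj: int  # bytes wasted by each allocation


def parse_summary(line: str) -> Optional[TraceSummary]:
    """Parse one summary line; return None for stack frames and other lines."""
    m = SUMMARY_RE.match(line)
    if m is None:
        return None
    return TraceSummary(int(m.group(1)), m.group(2),
                        int(m.group(3)), int(m.group(4)))


def rank_by_waste(lines: Iterable[str]) -> List[TraceSummary]:
    """Return the parsed entries, largest total waste first."""
    entries = [e for e in map(parse_summary, lines) if e is not None]
    return sorted(entries, key=lambda e: e.waste_total, reverse=True)


if __name__ == "__main__":
    # Sample input copied from the output quoted in this mail.
    sample = [
        "     77 set_kthread_struct+0x60/0x100 waste=1232/16 "
        "age=19492/31067/32465 pid=2 cpus=0-3",
        "        __kmem_cache_alloc_node+0x102/0x340",
        "        kmalloc_trace+0x26/0xa0",
    ]
    for e in rank_by_waste(sample):
        print(f"{e.waste_total:8d} bytes = {e.count} x {e.waste_per_obj}  {e.site}")
```

To run it on a live system you would feed it the lines of
/sys/kernel/debug/slab/kmalloc-128/alloc_traces (as root, with the
slub_debug boot parameter above) instead of the built-in sample.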