On Thu, Nov 03, 2022 at 05:15:51PM +0100, Vlastimil Babka wrote:
> On 10/26/22 11:48, Catalin Marinas wrote:
> >> > diff --git a/lib/kobject.c b/lib/kobject.c
> >> > index a0b2dbfcfa23..2c4acb36925d 100644
> >> > --- a/lib/kobject.c
> >> > +++ b/lib/kobject.c
> >> > @@ -144,7 +144,7 @@ char *kobject_get_path(struct kobject *kobj, gfp_t gfp_mask)
> >> >  	len = get_kobj_path_length(kobj);
> >> >  	if (len == 0)
> >> >  		return NULL;
> >> > -	path = kzalloc(len, gfp_mask);
> >> > +	path = kzalloc(len, gfp_mask | __GFP_PACKED);
> >>
> >> This might not be small, and it's going to be very very short-lived
> >> (within a single function call), why does it need to be allocated this
> >> way?
> >
> > Regarding short-lived objects, you are right, they won't affect
> > slabinfo. My ftrace-fu is not great, I only looked at the allocation
> > hits and they keep adding up without counting how many are
> > freed. So maybe we need tracing free() as well but not always easy to
> > match against the allocation point and infer how many live objects there
> > are.
>
> BTW, since 6.1-rc1 we have a new way with slub_debug to determine how much
> memory is wasted, thanks to commit 6edf2576a6cc ("mm/slub: enable debugging
> memory wasting of kmalloc") by Feng Tang.
>
> You need to boot the kernel with parameter such as:
> slub_debug=U,kmalloc-64,kmalloc-128,kmalloc-192,kmalloc-256
> (or just slub_debug=U,kmalloc-* for all sizes, but I guess you are
> interested mainly in those that are affected by DMA alignment)
> Note it does have some alloc/free CPU overhead and memory overhead, so not
> intended for normal production.
>
> Then you can check e.g.
> cat /sys/kernel/debug/slab/kmalloc-128/alloc_traces | head -n 50
>      77 set_kthread_struct+0x60/0x100 waste=1232/16 age=19492/31067/32465 pid=2 cpus=0-3
>         __kmem_cache_alloc_node+0x102/0x340
>         kmalloc_trace+0x26/0xa0
>         set_kthread_struct+0x60/0x100
>         copy_process+0x1903/0x2ee0
>         kernel_clone+0xf4/0x4f0
>         kernel_thread+0xae/0xe0
>         kthreadd+0x491/0x500
>         ret_from_fork+0x22/0x30
>
> which tells you there are currently 77 live allocations with this exact
> stack trace. The new information in 6.1 is the "waste=1232/16" which
> means these allocations waste 16 bytes each due to rounding up to the
> kmalloc cache size, or 1232 bytes in total (16*77). This should help
> finding the prominent sources of waste.

Thanks. That's a lot more useful than ftrace for this scenario.

At a quick test in a VM, the above reports about 1200 cases but there are
only around 100 unique allocation places (e.g. kstrdup called from
several places with different sizes). So not too bad if we are to go
with a GFP_ flag.

-- 
Catalin
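For ranking the call sites by total waste, something along these lines might do. It is only a rough sketch, not run against a real kernel: it assumes every record starts with a header line shaped like the one in the alloc_traces sample quoted above (count, symbol, then waste=total/per-object), and it skips any header without a waste= field. The names parse_alloc_traces, rank_by_waste and TraceEntry are made up for illustration.

```python
import re
from typing import List, NamedTuple

# Assumed header shape, copied from the quoted sample:
#   <count> <symbol+off/size> waste=<total>/<per-object> age=... pid=... cpus=...
# Stack-frame lines do not start with a number, so they never match.
HEADER_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+waste=(\d+)/(\d+)\b")


class TraceEntry(NamedTuple):
    count: int             # live allocations sharing this stack trace
    site: str              # allocation site as printed on the header line
    total_waste: int       # bytes wasted across all live objects
    per_object_waste: int  # bytes wasted per object


def parse_alloc_traces(text: str) -> List[TraceEntry]:
    """Collect the header lines that carry a waste= field."""
    entries = []
    for line in text.splitlines():
        m = HEADER_RE.match(line)
        if m:
            entries.append(TraceEntry(int(m.group(1)), m.group(2),
                                      int(m.group(3)), int(m.group(4))))
    return entries


def rank_by_waste(entries: List[TraceEntry]) -> List[TraceEntry]:
    """Largest total waste first."""
    return sorted(entries, key=lambda e: e.total_waste, reverse=True)


# The record from the quoted output, used as sample input.
SAMPLE = """\
     77 set_kthread_struct+0x60/0x100 waste=1232/16 age=19492/31067/32465 pid=2 cpus=0-3
        __kmem_cache_alloc_node+0x102/0x340
        kmalloc_trace+0x26/0xa0
        set_kthread_struct+0x60/0x100
"""

if __name__ == "__main__":
    for e in rank_by_waste(parse_alloc_traces(SAMPLE)):
        print(f"{e.total_waste:8d} bytes  {e.count:6d} x {e.per_object_waste:4d}  {e.site}")
```

To use it on a live system, feed parse_alloc_traces() the contents of /sys/kernel/debug/slab/kmalloc-128/alloc_traces (or any other kmalloc cache) instead of SAMPLE. Reading that file normally needs root and a mounted debugfs.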