On Tue, Jan 14, 2025 at 5:27 PM David Wang <00107082@xxxxxxx> wrote: > > > At 2025-01-15 02:48:13, "Suren Baghdasaryan" <surenb@xxxxxxxxxx> wrote: > >On Mon, Jan 13, 2025 at 7:36 PM David Wang <00107082@xxxxxxx> wrote: > >> > > >> >> I have my accumulative counter patch and filter out items with 0 accumulative counter, > >> >> I am almost sure the patch would not cause this accounting issue, but not 100%..... > >> > > >> >Have you tested this without your accumulative counter patch? > >> >IIUC, that patch filters out any allocation which has never been hit. > >> >So, if suspend/resume path contains allocations which were never hit > >> >before then those allocations would become suddenly visible, like in > >> >your case. That's why I'm against filtering allocinfo data in the > >> >kernel. Please try this without your patch and see if the data becomes > >> >more consistent. > >> > >> I remove all my patch and build a 6.13.0-rc7 kernel, > >> After boot up, > >> 64 1 kernel/sched/topology.c:2579 func:alloc_sched_domains > >> 896 14 kernel/sched/topology.c:2275 func:__sdt_alloc > >> 896 14 kernel/sched/topology.c:2266 func:__sdt_alloc > >> 96 6 kernel/sched/topology.c:2259 func:__sdt_alloc > >> 12288 24 kernel/sched/topology.c:2252 func:__sdt_alloc > >> 0 0 kernel/sched/topology.c:2242 func:__sdt_alloc > >> 0 0 kernel/sched/topology.c:2238 func:__sdt_alloc > >> 0 0 kernel/sched/topology.c:2234 func:__sdt_alloc > >> 0 0 kernel/sched/topology.c:2230 func:__sdt_alloc > >> 512 1 kernel/sched/topology.c:1961 func:sched_init_numa > >> > >> And after suspend/resume, no change detected: > >> 64 1 kernel/sched/topology.c:2579 func:alloc_sched_domains > >> 896 14 kernel/sched/topology.c:2275 func:__sdt_alloc > >> 896 14 kernel/sched/topology.c:2266 func:__sdt_alloc > >> 96 6 kernel/sched/topology.c:2259 func:__sdt_alloc > >> 12288 24 kernel/sched/topology.c:2252 func:__sdt_alloc > >> 0 0 kernel/sched/topology.c:2242 func:__sdt_alloc > >> 0 0 kernel/sched/topology.c:2238 func:__sdt_alloc > >> 0 0 kernel/sched/topology.c:2234 func:__sdt_alloc > >> 0 0 kernel/sched/topology.c:2230 func:__sdt_alloc > >> 512 1 kernel/sched/topology.c:1961 func:sched_init_numa > >> > >> I also build a image with accumulative counter, but no filter. > >> > >> After boot up: > >> 64 1 kernel/sched/topology.c:2579 func:alloc_sched_domains 2 > >> 896 14 kernel/sched/topology.c:2275 func:__sdt_alloc 80 > >> 896 14 kernel/sched/topology.c:2266 func:__sdt_alloc 80 > >> 96 6 kernel/sched/topology.c:2259 func:__sdt_alloc 80 > >> 12288 24 kernel/sched/topology.c:2252 func:__sdt_alloc 80 > >> 0 0 kernel/sched/topology.c:2242 func:__sdt_alloc 0 <---this *0* seems wrong > >> 0 0 kernel/sched/topology.c:2238 func:__sdt_alloc 0 > >> 0 0 kernel/sched/topology.c:2234 func:__sdt_alloc 0 > >> 0 0 kernel/sched/topology.c:2230 func:__sdt_alloc 0 > >> 512 1 kernel/sched/topology.c:1961 func:sched_init_numa 1 > >> > >> And then suspend/resume: > >> 64 1 kernel/sched/topology.c:2579 func:alloc_sched_domains 17 > >> 896 14 kernel/sched/topology.c:2275 func:__sdt_alloc 395 > >> 896 14 kernel/sched/topology.c:2266 func:__sdt_alloc 395 > >> 96 6 kernel/sched/topology.c:2259 func:__sdt_alloc 395 > >> 12288 24 kernel/sched/topology.c:2252 func:__sdt_alloc 395 > >> 0 0 kernel/sched/topology.c:2242 func:__sdt_alloc 70 > >> 0 0 kernel/sched/topology.c:2238 func:__sdt_alloc 70 > >> 0 0 kernel/sched/topology.c:2234 func:__sdt_alloc 70 > >> 0 0 kernel/sched/topology.c:2230 func:__sdt_alloc 70 > >> 512 1 kernel/sched/topology.c:1961 func:sched_init_numa 1> > >> Reading the code, those allocation behaviors should be tied together: > >> if kzalloc_node at line#2252 happened, then alloc_percpu at line#2230 should also happened. > > > >Hmm, ok. Looks like early calls to alloc_percpu() are not being > >registered somehow. Could you please share your cumulative counter > >patch with me? I'll try to reproduce this locally and see if I can > > >spot the issue. > > Sure, here is the patch base on 6.13.0-rc7. Thanks and sorry for the delay. It looks like the per-cpu allocations you pointed out happen early enough in the boot process that chunk->obj_exts was not allocated yet. Therefore the check inside pcpu_alloc_tag_alloc_hook() for chunk->obj_exts fails and accounting gets skipped. Allocating obj_exts earlier is not trivial because slab is not available yet. I'll need to look closer into per-cpu code to see how this can be fixed. > > > > > diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h > index 0bbbe537c5f9..6ca680604c6d 100644 > --- a/include/linux/alloc_tag.h > +++ b/include/linux/alloc_tag.h > @@ -18,6 +18,7 @@ > struct alloc_tag_counters { > u64 bytes; > u64 calls; > + u64 accu_calls; > }; > > /* > @@ -124,7 +125,7 @@ static inline bool mem_alloc_profiling_enabled(void) > > static inline struct alloc_tag_counters alloc_tag_read(struct alloc_tag *tag) > { > - struct alloc_tag_counters v = { 0, 0 }; > + struct alloc_tag_counters v = { 0, 0, 0 }; > struct alloc_tag_counters *counter; > int cpu; > > @@ -132,6 +133,7 @@ static inline struct alloc_tag_counters alloc_tag_read(struct alloc_tag *tag) > counter = per_cpu_ptr(tag->counters, cpu); > v.bytes += counter->bytes; > v.calls += counter->calls; > + v.accu_calls += counter->accu_calls; > } > > return v; > @@ -179,6 +181,7 @@ static inline bool alloc_tag_ref_set(union codetag_ref *ref, struct alloc_tag *t > * counter because when we free each part the counter will be decremented. > */ > this_cpu_inc(tag->counters->calls); > + this_cpu_inc(tag->counters->accu_calls); > return true; > } > > diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c > index 7dcebf118a3e..615833d4fbd7 100644 > --- a/lib/alloc_tag.c > +++ b/lib/alloc_tag.c > @@ -97,6 +97,7 @@ static void alloc_tag_to_text(struct seq_buf *out, struct codetag *ct) > > seq_buf_printf(out, "%12lli %8llu ", bytes, counter.calls); > codetag_to_text(out, ct); > + seq_buf_printf(out, " %llu", counter.accu_calls); > seq_buf_putc(out, ' '); > seq_buf_putc(out, '\n'); > } > > > > David > > > > >> > >> kernel/sched/topology.c > >> 2230 sdd->sd = alloc_percpu(struct sched_domain *); > >> 2231 if (!sdd->sd) > >> 2232 return -ENOMEM; > >> ... > >> 2246 for_each_cpu(j, cpu_map) { > >> ... > >> 2252 sd = kzalloc_node(sizeof(struct sched_domain) + cpumask_size(), > >> 2253 GFP_KERNEL, cpu_to_node(j)); > >> ... > >> 2257 *per_cpu_ptr(sdd->sd, j) = sd; > >> > >> > >> But somehow during bootup, those alloc_percpu in kernel/sched/topology.c:__sdt_alloc were missed in profiling. > >> (I am not meant to sell the idea of accumulative counter again here, but it dose help sometimes. :). > >> > >> >Thanks, > >> >Suren. > >> > > >> > > >> >> > >> > >> Thanks > >> David