On Mon, Aug 05, 2024 at 01:28:03PM -0700, Andrii Nakryiko wrote:
> trace_uprobe->nhit counter is not incremented atomically, so its value
> is bogus in practice. On the other hand, it's actually a pretty big
> uprobe scalability problem due to heavy cache line bouncing between CPUs
> triggering the same uprobe.

so you're seeing that in the benchmark, right? I'm curious how bad
the numbers are

>
> Drop it and emit obviously unrealistic value in its stead in
> uporbe_profiler seq file.
>
> The alternative would be allocating per-CPU counter, but I'm not sure
> it's justified.
>
> Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> ---
>  kernel/trace/trace_uprobe.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> index 52e76a73fa7c..5d38207db479 100644
> --- a/kernel/trace/trace_uprobe.c
> +++ b/kernel/trace/trace_uprobe.c
> @@ -62,7 +62,6 @@ struct trace_uprobe {
>  	struct uprobe		*uprobe;
>  	unsigned long		offset;
>  	unsigned long		ref_ctr_offset;
> -	unsigned long		nhit;
>  	struct trace_probe	tp;
>  };
>
> @@ -821,7 +820,7 @@ static int probes_profile_seq_show(struct seq_file *m, void *v)
>
>  	tu = to_trace_uprobe(ev);
>  	seq_printf(m, "  %s %-44s %15lu\n", tu->filename,
> -			trace_probe_name(&tu->tp), tu->nhit);
> +			trace_probe_name(&tu->tp), ULONG_MAX);

seems harsh.. would it be that bad to create per cpu counter for that?

jirka

>  	return 0;
>  }
>
> @@ -1507,7 +1506,6 @@ static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs)
>  	int ret = 0;
>
>  	tu = container_of(con, struct trace_uprobe, consumer);
> -	tu->nhit++;
>
>  	udd.tu = tu;
>  	udd.bp_addr = instruction_pointer(regs);
> --
> 2.43.5
>
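
FWIW, a per-CPU variant could look roughly like this (untested sketch
just to illustrate the idea; allocation error handling and the matching
free_percpu() on the teardown path are left out):

	struct trace_uprobe {
		...
		unsigned long __percpu	*nhit;
		...
	};

	/* alloc_trace_uprobe(): allocate one counter per CPU */
	tu->nhit = alloc_percpu(unsigned long);

	/* uprobe_dispatcher(), instead of the non-atomic tu->nhit++ */
	this_cpu_inc(*tu->nhit);

	/* probes_profile_seq_show(): sum the per-CPU counts */
	unsigned long nhit = 0;
	int cpu;

	for_each_possible_cpu(cpu)
		nhit += *per_cpu_ptr(tu->nhit, cpu);

	seq_printf(m, "  %s %-44s %15lu\n", tu->filename,
			trace_probe_name(&tu->tp), nhit);

percpu_counter would work too, but for a plain hit count a raw per-CPU
unsigned long summed at read time seems sufficient and keeps the hot
path to a single this_cpu_inc().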