On 22/03/23 11:30, Peter Zijlstra wrote:
> On Wed, Mar 22, 2023 at 10:39:55AM +0100, Peter Zijlstra wrote:
>> On Tue, Mar 07, 2023 at 02:35:52PM +0000, Valentin Schneider wrote:
>> > +TRACE_EVENT(ipi_send_cpumask,
>> > +
>> > +	TP_PROTO(const struct cpumask *cpumask, unsigned long callsite, void *callback),
>> > +
>> > +	TP_ARGS(cpumask, callsite, callback),
>> > +
>> > +	TP_STRUCT__entry(
>> > +		__cpumask(cpumask)
>> > +		__field(void *, callsite)
>> > +		__field(void *, callback)
>> > +	),
>> > +
>> > +	TP_fast_assign(
>> > +		__assign_cpumask(cpumask, cpumask_bits(cpumask));
>> > +		__entry->callsite = (void *)callsite;
>> > +		__entry->callback = callback;
>> > +	),
>> > +
>> > +	TP_printk("cpumask=%s callsite=%pS callback=%pS",
>> > +		  __get_cpumask(cpumask), __entry->callsite, __entry->callback)
>> > +);
>>
>> Would it make sense to add a variant like: ipi_send_cpu() that records a
>> single cpu instead of a cpumask. A lot of sites seems to do:
>> cpumask_of(cpu) for that first argument, and it seems to me it is quite
>> daft to have to memcpy a full multi-word cpumask in those cases.
>>
>> Remember, nr_possible_cpus > 64 is quite common these days.
>
> Something we litte bit like so...
>

I was wondering whether we could stick with a single trace event, but let
ftrace be aware of weight=1 vs weight>1 cpumasks.

For weight>1, it would memcpy() as usual; for weight=1, it could write a
pointer to a cpu_bit_bitmap[] equivalent embedded in the trace itself.

Unfortunately, ftrace bitmasks are represented as a u32 made of two 16-bit
values: [offset in event record, size], so there isn't a straightforward
way to point to a "reusable" cpumask. AFAICT the only alternative would be
to do that via a different trace event, but then we should just go with a
plain old uint - i.e. do what you're doing here, so:

Tested-and-reviewed-by: Valentin Schneider <vschneid@xxxxxxxxxx>

(with the tiny typo fix below)

> @@ -35,6 +35,28 @@ TRACE_EVENT(ipi_raise,
>  	TP_printk("target_mask=%s (%s)", __get_bitmask(target_cpus), __entry->reason)
>  );
>
> +TRACE_EVENT(ipi_send_cpu,
> +
> +	TP_PROTO(const unsigned int cpu, unsigned long callsite, void *callback),
> +
> +	TP_ARGS(cpu, callsite, callback),
> +
> +	TP_STRUCT__entry(
> +		__field(unsigned int, cpu)
> +		__field(void *, callsite)
> +		__field(void *, callback)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->cpu = cpu;
> +		__entry->callsite = (void *)callsite;
> +		__entry->callback = callback;
> +	),
> +
> +	TP_printk("cpu=%s callsite=%pS callback=%pS",
                        ^
                     s/s/u/
> +		  __entry->cpu, __entry->callsite, __entry->callback)
> +);
> +
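
For anyone following along, the [offset in event record, size] encoding
mentioned above can be sketched in plain userspace C. This is only an
illustration, not kernel code: the data_loc_*() helper names are made up,
and the "offset in the low 16 bits, size in the high 16 bits" split is my
reading of how the dynamic-array location word is laid out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers (names invented for this sketch).
 * Assumed layout: low 16 bits = offset from the start of the event
 * record, high 16 bits = size in bytes of the dynamic data.
 */
static inline uint32_t data_loc_pack(unsigned int offset, unsigned int size)
{
	/* Both halves must fit in 16 bits, otherwise they cannot be encoded. */
	assert(offset <= 0xffff);
	assert(size <= 0xffff);

	return (uint32_t)offset | ((uint32_t)size << 16);
}

/* Recover the record-relative offset from the packed word. */
static inline unsigned int data_loc_offset(uint32_t loc)
{
	return loc & 0xffff;
}

/* Recover the size in bytes from the packed word. */
static inline unsigned int data_loc_size(uint32_t loc)
{
	return loc >> 16;
}
```

Since the offset is relative to the event record and only 16 bits wide,
such a word can only describe data stored inside the record itself; it has
no way to refer to a shared table like cpu_bit_bitmap[] living elsewhere,
which is why a separate event with a plain unsigned int ends up simpler.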