On Thu, 19 May 2022 14:35:46 +0300
Vasily Averin <vvs@xxxxxxxxxx> wrote:

> >> @@ -33,42 +35,46 @@ DECLARE_EVENT_CLASS(kmem_alloc,
> >> 		__entry->bytes_req	= bytes_req;
> >> 		__entry->bytes_alloc	= bytes_alloc;
> >> 		__entry->gfp_flags	= (__force unsigned long)gfp_flags;
> >> +		__entry->accounted	= (gfp_flags & __GFP_ACCOUNT) ||
> >> +					  (s && s->flags & SLAB_ACCOUNT);
> >
> > Now you could make this even faster in the fast path and save just the
> > s->flags.
> >
> > 	__entry->sflags = s ? s->flags : 0;
> >
> >> 	),
> >>
> >> -	TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s",
> >> +	TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s accounted=%s",
> >> 		(void *)__entry->call_site,
> >> 		__entry->ptr,
> >> 		__entry->bytes_req,
> >> 		__entry->bytes_alloc,
> >> -		show_gfp_flags(__entry->gfp_flags))
> >> +		show_gfp_flags(__entry->gfp_flags),
> >> +		__entry->accounted ? "true" : "false")
> >
> > And then have: "accounted=%s":
> >
> > 	(__entry->gfp_flags & __GFP_ACCOUNT) ||
> > 	(__entry->sflags & SLAB_ACCOUNT) ? "true" : "false"
>
> Unfortunately this brings back the sparse warnings about bitwise gfp_t
> and slab_flags_t casts.
> Could you please explain why your variant is faster?

A micro-optimization, I grant you, but it is faster because it moves
some of the logic into the slow path (the read side), and takes it out
of the fast path (the write side). The idea of tracing is to squeeze
out every cycle we can to keep the tracing overhead down.

But it's really up to you if you need that. I'm not going to let this
be a blocker. This is more of an FYI than anything else.

-- Steve