Re: [PATCH 2/3] alloc_tag: uninline code gated by mem_alloc_profiling_key in slab allocator

On Mon, 27 Jan 2025 11:38:32 -0800
Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:

> On Sun, Jan 26, 2025 at 8:47 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
> >
> > On 1/26/25 08:02, Suren Baghdasaryan wrote:  
> > > When a sizable code section is protected by a disabled static key, that
> > > code still gets pulled into the instruction cache even though it is never
> > > executed, wasting cache capacity and increasing cache misses. This can be
> > > remedied by moving such code into a separate uninlined function. The improvement
> 
> Sorry, I missed adding Steven Rostedt to the CC list; his advice was
> instrumental in finding the way to optimize the static key performance
> in this patch. Added now.
> 
> >
> > Weird, I thought static_branch_likely/unlikely/maybe already handled
> > this: the unlikely case is a jump to a block away from the fast-path
> > stream of instructions, making it less likely to get cached.
> > AFAIU even plain likely()/unlikely() should do this, along with branch
> > prediction hints.
> 
> This overhead was indeed unexpected when I measured it on Android.
> Cache pollution was my understanding of the cause after Steven
> suggested I try uninlining the protected code; he has done something
> similar in the tracing subsystem. But maybe I misunderstood the real
> reason. Steven, could you please verify whether my understanding of
> the cause of the high overhead is correct here? Or is there something
> else at play that I missed?
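
For reference, the pattern under discussion looks roughly like the sketch
below. This is only an illustration, not the actual patch: the hook names
are made up, and mem_alloc_profiling_key is really defined elsewhere (in
lib/alloc_tag.c). The idea is that the static branch stays in the inline
fast path, while the sizable gated body moves into a noinline function,
so its instructions are emitted once, out of line, instead of being
duplicated into the text of every inlined call site.

#include <linux/jump_label.h>

DEFINE_STATIC_KEY_FALSE(mem_alloc_profiling_key);

/*
 * The gated body lives out of line: emitted once, in its own
 * function, instead of enlarging every caller that inlines the hook.
 */
static noinline void __profiling_hook_slowpath(void *object, size_t size)
{
	/* stand-in for the real accounting/profiling work */
}

static __always_inline void profiling_hook(void *object, size_t size)
{
	/*
	 * With the key disabled this compiles to a NOP in the fast
	 * path; when enabled, the NOP is patched into a jump that
	 * reaches the out-of-line slow path via a single call.
	 */
	if (static_branch_unlikely(&mem_alloc_profiling_key))
		__profiling_hook_slowpath(object, size);
}

If the cache-pollution explanation is right, the difference from a plain
static_branch_unlikely() with an inlined body is that the out-of-line
block of a static branch is still duplicated at each inline expansion,
whereas the noinline function body exists exactly once.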

