Re: [PATCH] mm, slub: prefetch freelist in ___slab_alloc()

On Mon, Aug 19, 2024 at 4:02 PM Yongqiang Liu <liuyongqiang13@xxxxxxxxxx> wrote:
>
> commit 0ad9500e16fe ("slub: prefetch next freelist pointer in
> slab_alloc()") introduced prefetch_freepointer() for fastpath
> allocation. Using it at the first freelist load can also give a
> small improvement in some workloads. Here are hackbench results
> on an arm64 machine (about 3.8%):
>
> Before:
>   average time cost of 'hackbench -g 100 -l 1000': 17.068
>
> After:
>   average time cost of 'hackbench -g 100 -l 1000': 16.416
>
> There is also about a 5% improvement on an x86_64 machine
> for hackbench.

I think adding more prefetches might not be a good idea unless we have
more real-world data supporting it: prefetching helps when the slab is
used frequently, but it ends up unnecessarily pulling in extra cache
lines when it is not.

Also, I don't understand how adding a prefetch in the slowpath affects
performance, because most allocations and frees should be served by the
fastpath. Could you please explain?

> Signed-off-by: Yongqiang Liu <liuyongqiang13@xxxxxxxxxx>
> ---
>  mm/slub.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index c9d8a2497fd6..f9daaff10c6a 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3630,6 +3630,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>         VM_BUG_ON(!c->slab->frozen);
>         c->freelist = get_freepointer(s, freelist);
>         c->tid = next_tid(c->tid);
> +       prefetch_freepointer(s, c->freelist);
>         local_unlock_irqrestore(&s->cpu_slab->lock, flags);
>         return freelist;
>
> --
> 2.25.1
>




