Re: [PATCH v3 0/8] make slab shrink lockless

Roman Gushchin <roman.gushchin@xxxxxxxxx> · Mon, 27 Feb 2023 11:02:23 -0800

On Mon, Feb 27, 2023 at 09:31:51PM +0800, Qi Zheng wrote:
> 
> 
> On 2023/2/27 03:51, Andrew Morton wrote:
> > On Sun, 26 Feb 2023 22:46:47 +0800 Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> wrote:
> > 
> Save the above script, then run test and touch commands.
> 
> Then we can use the following perf command to view hotspots:
> 
> perf top -U -F 999
> 
> 1) Before applying this patchset:
> 
>   32.31%  [kernel]           [k] down_read_trylock
>   19.40%  [kernel]           [k] pv_native_safe_halt
>   16.24%  [kernel]           [k] up_read
>   15.70%  [kernel]           [k] shrink_slab
>    4.69%  [kernel]           [k] _find_next_bit
>    2.62%  [kernel]           [k] shrink_node
>    1.78%  [kernel]           [k] shrink_lruvec
>    0.76%  [kernel]           [k] do_shrink_slab
> 
> 2) After applying this patchset:
> 
>   27.83%  [kernel]           [k] _find_next_bit
>   16.97%  [kernel]           [k] shrink_slab
>   15.82%  [kernel]           [k] pv_native_safe_halt
>    9.58%  [kernel]           [k] shrink_node
>    8.31%  [kernel]           [k] shrink_lruvec
>    5.64%  [kernel]           [k] do_shrink_slab
>    3.88%  [kernel]           [k] mem_cgroup_iter

I'm not opposing the intent of the patchset in any way (I actually think
it's a good idea to make the shrinker list lockless), but looking at
both outputs above, I think the main problem is not the contention on
the semaphore itself but the reason for that contention.

It seems that there is often a long list of shrinkers which can barely
reclaim any memory, yet we call them again and again.
To achieve real wins with real-life workloads, I guess
that's what we should optimize.
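
To make the concern concrete, here is a minimal userspace sketch of one
possible mitigation: skip a shrinker for a growing number of reclaim
passes while it keeps reporting nothing freeable. This is purely
illustrative. It is not what the patchset does, and every name in it
(struct model_shrinker, model_shrink_pass, model_count_from_ctx,
MODEL_MAX_SHIFT) is hypothetical rather than a kernel API.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_shrinker {
	unsigned long (*count)(void *ctx); /* objects freeable right now */
	void *ctx;
	unsigned int empty_streak;         /* consecutive calls that found nothing */
	unsigned int skip_left;            /* reclaim passes still to be skipped */
};

/* Cap the backoff at 1 << 6 = 64 skipped passes (arbitrary choice). */
#define MODEL_MAX_SHIFT 6u

/* Example count callback: reads the freeable count from *ctx. */
unsigned long model_count_from_ctx(void *ctx)
{
	return *(unsigned long *)ctx;
}

/*
 * Walk all shrinkers once. A shrinker that reported zero objects is
 * skipped for 1, 2, 4, ... passes (exponential backoff) before it is
 * asked again. Stores the sum of reported freeable objects in
 * *freeable and returns how many shrinkers were actually called.
 */
size_t model_shrink_pass(struct model_shrinker *s, size_t n,
			 unsigned long *freeable)
{
	size_t called = 0;

	*freeable = 0;
	for (size_t i = 0; i < n; i++) {
		unsigned long c;

		if (s[i].skip_left > 0) {
			s[i].skip_left--;
			continue;
		}

		c = s[i].count(s[i].ctx);
		called++;

		if (c == 0) {
			unsigned int shift = s[i].empty_streak < MODEL_MAX_SHIFT ?
					     s[i].empty_streak : MODEL_MAX_SHIFT;

			s[i].skip_left = 1u << shift;
			s[i].empty_streak++;
		} else {
			s[i].empty_streak = 0;
			*freeable += c;
		}
	}
	return called;
}
```

The obvious trade-off is that a skipped shrinker's newly freeable
objects go unnoticed until its backoff expires, so a real design would
need some way to reset the backoff when objects are added.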

Thanks!