On 2023/2/28 03:02, Roman Gushchin wrote:
On Mon, Feb 27, 2023 at 09:31:51PM +0800, Qi Zheng wrote:
On 2023/2/27 03:51, Andrew Morton wrote:
On Sun, 26 Feb 2023 22:46:47 +0800 Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> wrote:
Save the above script, then run the test and touch commands.
Then we can use the following perf command to view hotspots:
perf top -U -F 999
1) Before applying this patchset:
32.31% [kernel] [k] down_read_trylock
19.40% [kernel] [k] pv_native_safe_halt
16.24% [kernel] [k] up_read
15.70% [kernel] [k] shrink_slab
4.69% [kernel] [k] _find_next_bit
2.62% [kernel] [k] shrink_node
1.78% [kernel] [k] shrink_lruvec
0.76% [kernel] [k] do_shrink_slab
2) After applying this patchset:
27.83% [kernel] [k] _find_next_bit
16.97% [kernel] [k] shrink_slab
15.82% [kernel] [k] pv_native_safe_halt
9.58% [kernel] [k] shrink_node
8.31% [kernel] [k] shrink_lruvec
5.64% [kernel] [k] do_shrink_slab
3.88% [kernel] [k] mem_cgroup_iter
Not opposing the intention of the patchset in any way (I actually think
it's a good idea to make the shrinker list lockless), but looking at
both outputs above, I think the main problem is not the contention on
the semaphore itself, but the reason for this contention.
Yes, in the above scenario, there is indeed no lock contention problem.
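As a reference for where the down_read_trylock()/up_read() cycles in the
first profile come from, below is a condensed sketch of the pre-patch
shrink_slab() path (not verbatim kernel source; the memcg branch, the
contention bail-out and error handling are trimmed). Every reclaim pass
on every CPU does a read lock/unlock pair on the single global
shrinker_rwsem, so the atomic updates to that one rwsem bounce between
CPUs even though readers never actually block each other:

static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
				 struct mem_cgroup *memcg, int priority)
{
	unsigned long ret, freed = 0;
	struct shrinker *shrinker;

	/* The hot atomic in the "before" profile: one global rwsem. */
	if (!down_read_trylock(&shrinker_rwsem))
		goto out;

	/* Walk every registered shrinker under the read lock. */
	list_for_each_entry(shrinker, &shrinker_list, list) {
		struct shrink_control sc = {
			.gfp_mask = gfp_mask,
			.nid = nid,
			.memcg = memcg,
		};

		ret = do_shrink_slab(&sc, shrinker, priority);
		if (ret == SHRINK_EMPTY)
			ret = 0;
		freed += ret;
	}

	/* The other hotspot: the matching release on every pass. */
	up_read(&shrinker_rwsem);
out:
	return freed;
}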
It seems like there is often a long list of shrinkers which can barely
reclaim any memory, yet we keep calling them again and again. To achieve
real wins with real-life workloads, I guess that is what we should
optimize.
Thanks!
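To make that last point concrete, the per-memcg side of the walk (which
is what shows up as _find_next_bit, shrink_slab and do_shrink_slab in
the second profile) looks roughly like the condensed sketch below.
Helper and field names (shrinker_info_protected(), shrinker_nr_max,
shrinker_idr) only approximate mainline around that time, and the
SHRINK_EMPTY bit-clearing and memory-ordering details are omitted:

static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
				       struct mem_cgroup *memcg, int priority)
{
	struct shrinker_info *info;
	unsigned long ret, freed = 0;
	int i;

	if (!down_read_trylock(&shrinker_rwsem))
		return 0;

	info = shrinker_info_protected(memcg, nid);
	if (unlikely(!info))
		goto unlock;

	/* One bit per memcg-aware shrinker registered for this memcg. */
	for_each_set_bit(i, info->map, shrinker_nr_max) {
		struct shrink_control sc = {
			.gfp_mask = gfp_mask,
			.nid = nid,
			.memcg = memcg,
		};
		struct shrinker *shrinker = idr_find(&shrinker_idr, i);

		if (unlikely(!shrinker))
			continue;

		/*
		 * Each set bit costs a find_next_bit() step plus indirect
		 * ->count_objects()/->scan_objects() calls via
		 * do_shrink_slab(), even when the shrinker ends up freeing
		 * (almost) nothing.
		 */
		ret = do_shrink_slab(&sc, shrinker, priority);
		if (ret == SHRINK_EMPTY)
			ret = 0;
		freed += ret;
	}
unlock:
	up_read(&shrinker_rwsem);
	return freed;
}

When most of those shrinkers report little or nothing freeable, this
per-bit work is still paid on every reclaim pass, which is the repeated
cost being pointed at above.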
--
Thanks,
Qi