Re: [PATCH RFC v2 00/10] SLUB percpu sheaves

Vlastimil Babka <vbabka@xxxxxxx> · Tue, 4 Mar 2025 11:54:58 +0100

On 2/25/25 21:26, Suren Baghdasaryan wrote:
> On Mon, Feb 24, 2025 at 1:12 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>>
>> >
>> > > The values represent the total time it took to perform mmap syscalls, less is
>> > > better.
>> > >
>> > > (1)            baseline    control
>> > > Little core    7.58327     6.614939 (-12.77%)
>> > > Medium core    2.125315    1.428702 (-32.78%)
>> > > Big core       0.514673    0.422948 (-17.82%)
>> > >
>> > > (2)            baseline    control
>> > > Little core    7.58327     5.141478 (-32.20%)
>> > > Medium core    2.125315    0.427692 (-79.88%)
>> > > Big core       0.514673    0.046642 (-90.94%)
>> > >
>> > > (3)            baseline    control
>> > > Little core    7.58327     4.779624 (-36.97%)
>> > > Medium core    2.125315    0.450368 (-78.81%)
>> > > Big core       0.514673    0.037776 (-92.66%)
> 
> (4)            baseline    control
> Little core    7.58327     4.642977 (-38.77%)
> Medium core    2.125315    0.373692 (-82.42%)
> Big core       0.514673    0.043613 (-91.53%)
> 
> I think the difference between (3) and (4) is noise.
> Thanks,
> Suren.

Hi, as we discussed yesterday, it would be useful to move the baseline so
that it includes everything before sheaves, as that's already on the way to
6.15, so we can see more clearly what sheaves do relative to that. At this
point that means the vma lock conversion including TYPESAFE_BY_RCU (that's
not undone, thus like in scenario (4)). Could you then benchmark the
following:

- baseline - vma locking conversion with TYPESAFE_BY_RCU
- baseline + maple tree node reduction from mm-unstable (Liam might point
  out which patches?)
- the above + this series + sheaves enabled for vm_area_struct cache
- the above + full maple node sheaves conversion [1]
- the above + the top-most patches from [1] that are optimizations with a
  tradeoff (not a clear win-win), so it would be good to know if they are
  useful
[1] currently the 4 commits here:
https://web.git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-percpu-sheaves-v2-maple
from "maple_tree: Sheaf conversion" to "maple_tree: Clean up sheaf".
As Liam noted, they won't cherry-pick without conflicts once the maple tree
node reduction is backported, but he's working on a rebase.
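
To keep the new runs comparable with the tables quoted above, the deltas can
be computed the same way as there, i.e. as a percent change of the measured
time relative to the baseline time. A minimal sketch, not part of the series
or of the benchmark itself: the helper name relative_change() is made up for
illustration, and the numbers are simply the scenario (4) results from above.

```python
def relative_change(baseline: float, measured: float) -> float:
    """Percent change of measured vs. baseline; negative means less time."""
    return (measured - baseline) / baseline * 100.0


# Scenario (4) totals as quoted in this thread: (baseline, measured).
# relative_change() is a hypothetical helper, not from any benchmark tool.
scenario_4 = {
    "Little core": (7.58327, 4.642977),
    "Medium core": (2.125315, 0.373692),
    "Big core": (0.514673, 0.043613),
}

for core, (base, meas) in scenario_4.items():
    delta = relative_change(base, meas)
    print(f"{core:<12} {base:>9.6f} {meas:>9.6f} ({delta:+.2f}%)")
```

With the re-based baseline, only the first number in each pair would change;
the formula stays the same.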

Thanks in advance!

>> > >
>> > > Results in (3) vs (2) indicate that using sheaves for vm_area_struct
>> > > yields slightly better averages and I noticed that this was mostly due
>> > > to sheaves results missing occasional spikes that worsened
>> > > TYPESAFE_BY_RCU averages (the results seemed more stable with
>> > > sheaves).
>> >
>> > Thanks a lot, that looks promising!
>>
>> Indeed, that looks better than I expected :)
>> Cheers!
>>
>> >
>> > > [1] https://lore.kernel.org/all/20250213224655.1680278-1-surenb@xxxxxxxxxx/
>> > >
>> >