Re: [PATCH RFC v2 00/10] SLUB percpu sheaves

Suren Baghdasaryan <surenb@xxxxxxxxxx> · Tue, 4 Mar 2025 10:35:12 -0800

On Tue, Mar 4, 2025 at 2:55 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 2/25/25 21:26, Suren Baghdasaryan wrote:
> > On Mon, Feb 24, 2025 at 1:12 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> >>
> >> >
> >> > > The values represent the total time it took to perform mmap syscalls;
> >> > > lower is better.
> >> > >
> >> > > (1)          baseline   control
> >> > > Little core  7.58327    6.614939 (-12.77%)
> >> > > Medium core  2.125315   1.428702 (-32.78%)
> >> > > Big core     0.514673   0.422948 (-17.82%)
> >> > >
> >> > > (2)          baseline   control
> >> > > Little core  7.58327    5.141478 (-32.20%)
> >> > > Medium core  2.125315   0.427692 (-79.88%)
> >> > > Big core     0.514673   0.046642 (-90.94%)
> >> > >
> >> > > (3)          baseline   control
> >> > > Little core  7.58327    4.779624 (-36.97%)
> >> > > Medium core  2.125315   0.450368 (-78.81%)
> >> > > Big core     0.514673   0.037776 (-92.66%)
> >
> > (4)          baseline   control
> > Little core  7.58327    4.642977 (-38.77%)
> > Medium core  2.125315   0.373692 (-82.42%)
> > Big core     0.514673   0.043613 (-91.53%)
> >
> > I think the difference between (3) and (4) is noise.
> > Thanks,
> > Suren.
>
> Hi, as we discussed yesterday, it would be useful to set the baseline to
> include everything before sheaves, as that's already on the way to 6.15, so
> we can see more clearly what sheaves do relative to that. At this point
> that means the vma lock conversion including TYPESAFE_BY_RCU (which is not
> undone, as in scenario (4)). Could you benchmark the following:
>
> - baseline - vma locking conversion with TYPESAFE_BY_RCU
> - baseline+maple tree node reduction from mm-unstable (Liam might point out
> which patches?)
> - the above + this series + sheaves enabled for vm_area_struct cache
> - the above + full maple node sheaves conversion [1]
> - the above + the top-most patches from [1] that are optimizations with a
> tradeoff (not clear win-win) so it would be good to know if they are useful
>
> [1] currently the 4 commits here:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slub-percpu-sheaves-v2-maple
> from "maple_tree: Sheaf conversion" to "maple_tree: Clean up sheaf"
> As Liam noted, they won't cherry-pick without conflicts once the maple tree
> node reduction is backported, but he's working on a rebase.
>
> Thanks in advance!

Sure, I'll run the tests and post results sometime later this week.
Thanks!

>
> >> > >
> >> > > Results in (3) vs (2) indicate that using sheaves for vm_area_struct
> >> > > yields slightly better averages and I noticed that this was mostly due
> >> > > to sheaves results missing occasional spikes that worsened
> >> > > TYPESAFE_BY_RCU averages (the results seemed more stable with
> >> > > sheaves).
> >> >
> >> > Thanks a lot, that looks promising!
> >>
> >> Indeed, that looks better than I expected :)
> >> Cheers!
> >>
> >> >
> >> > > [1] https://lore.kernel.org/all/20250213224655.1680278-1-surenb@xxxxxxxxxx/
> >> > >
> >> >
>
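
For anyone re-deriving the percentages in the quoted tables: they appear to be
the relative change of the control time against the baseline time, i.e.
(control - baseline) / baseline. Below is a minimal sketch of that arithmetic
in Python. The timings are copied from the tables above; the names BASELINE,
SCENARIOS, percent_change and format_rows are made up for illustration and are
not part of any benchmark tooling used in this thread.

```python
# Timings copied from the quoted tables (total mmap syscall time, lower is better).
# All names here are hypothetical helpers, not part of the actual benchmark setup.
BASELINE = {"Little core": 7.58327, "Medium core": 2.125315, "Big core": 0.514673}

SCENARIOS = {
    "(1)": {"Little core": 6.614939, "Medium core": 1.428702, "Big core": 0.422948},
    "(2)": {"Little core": 5.141478, "Medium core": 0.427692, "Big core": 0.046642},
    "(3)": {"Little core": 4.779624, "Medium core": 0.450368, "Big core": 0.037776},
    "(4)": {"Little core": 4.642977, "Medium core": 0.373692, "Big core": 0.043613},
}


def percent_change(baseline: float, control: float) -> float:
    """Relative change of control vs. baseline, in percent (negative = faster)."""
    return (control - baseline) / baseline * 100.0


def format_rows(scenario: str) -> list[str]:
    """Render one scenario as aligned text rows, one per core type."""
    rows = []
    for core, base in BASELINE.items():
        control = SCENARIOS[scenario][core]
        delta = percent_change(base, control)
        rows.append(f"{core:<12} {base:<10} {control:<9} ({delta:.2f}%)")
    return rows


if __name__ == "__main__":
    for name in SCENARIOS:
        print(name)
        print("\n".join(format_rows(name)))
        print()
```

Swapping BASELINE for a different reference run (for example the new baseline
proposed above) only requires replacing that one dictionary.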