Re: Memory allocation on speculative fastpaths

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Wed, 4 May 2022 01:22:02 +0100

On Wed, May 04, 2022 at 01:45:11AM +0200, Michal Hocko wrote:
> On Tue 03-05-22 16:15:46, Suren Baghdasaryan wrote:
> > On Tue, May 3, 2022 at 11:28 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> [...]
> > > rcu_read_lock();
> > > vma = vma_lookup();
> > > if (down_read_trylock(&vma->sem)) {
> > >         rcu_read_unlock();
> > > } else {
> > >         rcu_read_unlock();
> > >         mmap_read_lock(mm);
> > >         vma = vma_lookup();
> > >         down_read(&vma->sem);
> > > }
> > >
> > > ... and we then execute the page table allocation under the protection of
> > > the vma->sem.
> > >
> > > At least, that's what I think we agreed to yesterday.
> > 
> > Honestly, I don't remember discussing vma->sem at all.
> 
> This is the rangelocking approach that is effectively per-VMA. So that
> should help with the most simplistic case where the mmap contention is
> not on the same VMAs which should be the most common case (e.g. faulting
> from several threads while there is mmap happening in the background).
> 
> There are cases where this could be too coarse of course and RCU would
> be a long term plan. The above seems easy enough and still probably good
> enough for most cases so a good first step.

It also fixes the low-priority monitoring daemon problem, as page faults
will not be blocked by a writer (unless the down_read_trylock() on
vma->sem fails).

I see three potential outcomes here from the per-VMA rwsem approach:

 - No particular improvement on any workloads.
   Result: we try something else.
 - Minor gains (5-10%).  We benchmark it and discover there's still
   significant contention on the vma->sem.
   Result: we take those wins and keep going towards a full RCU solution.
 - Major gains (20-50%).
   Result: We're done, break out the champagne.