Re: [RFC PATCH 24/37] mm: implement speculative handling in __do_fault()

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Wed, 7 Apr 2021 22:27:12 +0100

On Wed, Apr 07, 2021 at 02:20:27PM -0700, Michel Lespinasse wrote:
> On Wed, Apr 07, 2021 at 04:40:34PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 06, 2021 at 06:44:49PM -0700, Michel Lespinasse wrote:
> > > In the speculative case, call the vm_ops->fault() method from within
> > > an rcu read locked section, and verify the mmap sequence lock at the
> > > start of the section. A match guarantees that the original vma is still
> > > valid at that time, and that the associated vma->vm_file stays valid
> > > while the vm_ops->fault() method is running.
> > > 
> > > Note that this implies that speculative faults cannot sleep within
> > > the vm_ops->fault method. We will only attempt to fetch existing pages
> > > from the page cache during speculative faults; any miss (or prefetch)
> > > will be handled by falling back to non-speculative fault handling.
> > > 
> > > The speculative handling case also does not preallocate page tables,
> > > as it is always called with a pre-existing page table.
> > 
> > So what's wrong with SRCU ? Laurent mumbled something about frequent
> > SRCU kthread activity being a problem; is that still so and is that
> > fundamentally unfixable?
> > 
> > Because to me it seems a much more natural solution to the whole thing.
> 
> The short answer is that I did not try SRCU. My thought process was,
> page cache already uses an RCU read lock, I just need to expand its
> scope a little.
> 
> Using SRCU might allow us to hit disk during speculative faults; OTOH
> we may need to switch to a more robust validation mechanism than the
> global counter to reap any benefits.

Why would you want to do I/O under SRCU?!  The benefit of SRCU is that
its read-side sections may sleep, so you can allocate page tables under
SRCU.

Doing I/O without any lock held already works; it just uses the file
refcount.  It would be better to use a vma refcount, as I already said.
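
To make the file-refcount pattern concrete, here is a rough userspace
model of it: take the lock, pin the file, drop the lock, do the slow
work with only the reference held, then unpin and retry the fault.
This is a sketch under stated assumptions, not kernel code.  Every
name in it (model_file, model_mm, model_get_file, model_put_file,
model_fault_with_io) is hypothetical, and the lock is reduced to a
boolean because the model is single-threaded.

```c
/* Userspace model only; all names here are hypothetical, not kernel API. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_file {
	atomic_int refcount;	/* stands in for the struct file refcount */
	bool released;		/* set once the last reference is dropped */
};

struct model_mm {
	bool mmap_lock_held;		/* stands in for mmap_lock (single-threaded model) */
	struct model_file *vm_file;	/* stands in for vma->vm_file */
	bool lock_held_during_io;	/* recorded so a caller can inspect it */
	int refcount_during_io;		/* recorded so a caller can inspect it */
};

static void model_get_file(struct model_file *f)
{
	atomic_fetch_add(&f->refcount, 1);
}

static void model_put_file(struct model_file *f)
{
	int old = atomic_fetch_sub(&f->refcount, 1);

	assert(old > 0);	/* dropping a reference we never held is a bug */
	if (old == 1)
		f->released = true;
}

/* Pin the file, drop the lock, do the slow work, unpin.  Caller retries. */
static void model_fault_with_io(struct model_mm *mm)
{
	struct model_file *f;

	mm->mmap_lock_held = true;	/* "take" the lock to look at the vma */
	f = mm->vm_file;
	model_get_file(f);		/* pin before unlocking */
	mm->mmap_lock_held = false;	/* nothing is locked from here on */

	/* Blocking I/O would go here; only the file reference is held. */
	mm->lock_held_during_io = mm->mmap_lock_held;
	mm->refcount_during_io = atomic_load(&f->refcount);

	model_put_file(f);		/* unpin; the fault is then retried */
}
```

The point of the model is only that the file outlives the I/O because
of the extra reference, not because of any lock.  A vma refcount would
follow the same shape, with the pin taken on the vma instead of the
file.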