On Wed, Apr 07, 2021 at 02:20:27PM -0700, Michel Lespinasse wrote:
> On Wed, Apr 07, 2021 at 04:40:34PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 06, 2021 at 06:44:49PM -0700, Michel Lespinasse wrote:
> > > In the speculative case, call the vm_ops->fault() method from within
> > > an rcu read locked section, and verify the mmap sequence lock at the
> > > start of the section. A match guarantees that the original vma is still
> > > valid at that time, and that the associated vma->vm_file stays valid
> > > while the vm_ops->fault() method is running.
> > >
> > > Note that this implies that speculative faults can not sleep within
> > > the vm_ops->fault method. We will only attempt to fetch existing pages
> > > from the page cache during speculative faults; any miss (or prefetch)
> > > will be handled by falling back to non-speculative fault handling.
> > >
> > > The speculative handling case also does not preallocate page tables,
> > > as it is always called with a pre-existing page table.
> >
> > So what's wrong with SRCU ? Laurent mumbled something about frequent
> > SRCU kthread activity being a problem; is that still so and is that
> > fundamentally unfixable?
> >
> > Because to me it seems a much more natural solution to the whole thing.
>
> The short answer is that I did not try SRCU. My thought process was,
> page cache already uses an RCU read lock, I just need to expand its
> scope a little.
>
> Using SRCU might allow us to hit disk during speculative faults; OTOH
> we may need to switch to a more robust validation mechanism than the
> global counter to reap any benefits.

Why would you want to do I/O under SRCU?!  The benefit of SRCU is that
you can allocate page tables under SRCU.

Doing I/O without any lock held already works; it just uses the file
refcount.  It would be better to use a vma refcount, as I already said.
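
For reference, the validation scheme described in the quoted patch can be
modeled in user space roughly as follows. This is only a sketch: mmap_seq,
page_cache, and every function name below are hypothetical stand-ins, not
the kernel's API, and C11 atomics take the place of the kernel's
sequence-count and RCU primitives. It shows the three behaviours under
discussion: sample the counter, revalidate it before touching the page
cache without sleeping, and fall back to the non-speculative path on a
mismatch or a cache miss.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mmap sequence count: odd while a writer runs. */
static atomic_ulong mmap_seq;

/* Hypothetical page cache: a non-zero slot means the page is resident. */
#define NPAGES 4
static int page_cache[NPAGES];

static void mmap_write_begin(void)
{
    atomic_fetch_add(&mmap_seq, 1);            /* count becomes odd */
}

static void mmap_write_end(void)
{
    unsigned long old = atomic_fetch_add(&mmap_seq, 1);

    assert(old & 1);                           /* must pair with a begin */
    (void)old;
}

/* Sample the count before the vma lookup; false if a writer is active. */
static bool spec_begin(unsigned long *seq)
{
    *seq = atomic_load(&mmap_seq);
    return (*seq & 1) == 0;
}

/* Speculative path: never sleeps, only returns pages that are already cached. */
static bool fault_speculative(unsigned long seq, size_t idx, int *page)
{
    /* In the kernel this region would sit inside an RCU read-side section. */
    if (atomic_load(&mmap_seq) != seq)
        return false;                          /* address space changed */
    if (idx >= NPAGES || page_cache[idx] == 0)
        return false;                          /* miss: would need to sleep */
    *page = page_cache[idx];
    return true;
}

/* Non-speculative fallback: allowed to "sleep", modeled as filling the cache. */
static int fault_slow(size_t idx)
{
    if (idx >= NPAGES)
        return -1;                             /* out of range in this model */
    if (page_cache[idx] == 0)
        page_cache[idx] = (int)idx + 100;      /* pretend we read from disk */
    return page_cache[idx];
}

/* Try the speculative path first, then fall back to the slow path. */
static int handle_fault(size_t idx, bool *was_speculative)
{
    unsigned long seq;
    int page = 0;

    *was_speculative = spec_begin(&seq) && fault_speculative(seq, idx, &page);
    return *was_speculative ? page : fault_slow(idx);
}
```

In this model the first fault on a page always takes the slow path, and
any write section between spec_begin() and fault_speculative() forces a
fallback, which is the "global counter" coarseness mentioned above.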