On Mon, Mar 27, 2023 at 06:45:15PM +0100, Matthew Wilcox (Oracle) wrote:
> Prevent filesystems from doing things which sleep in their map_pages
> method.  This is in preparation for a pagefault path protected only
> by RCU.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> ---
>  Documentation/filesystems/locking.rst |  4 ++--
>  mm/memory.c                           | 11 ++++++++---
>  2 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
> index 922886fefb7f..8a80390446ba 100644
> --- a/Documentation/filesystems/locking.rst
> +++ b/Documentation/filesystems/locking.rst
> @@ -645,7 +645,7 @@ ops             mmap_lock       PageLocked(page)
>  open:           yes
>  close:          yes
>  fault:          yes             can return with page locked
> -map_pages:      yes
> +map_pages:      read
>  page_mkwrite:   yes             can return with page locked
>  pfn_mkwrite:    yes
>  access:         yes
> @@ -661,7 +661,7 @@ locked. The VM will unlock the page.
>
>  ->map_pages() is called when VM asks to map easy accessible pages.
>  Filesystem should find and map pages associated with offsets from "start_pgoff"
> -till "end_pgoff". ->map_pages() is called with page table locked and must
> +till "end_pgoff". ->map_pages() is called with the RCU lock held and must
>  not block.  If it's not possible to reach a page without blocking,
>  filesystem should skip it. Filesystem should use set_pte_range() to setup
>  page table entry. Pointer to entry associated with the page is passed in
> diff --git a/mm/memory.c b/mm/memory.c
> index 8071bb17abf2..a7edf6d714db 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4461,6 +4461,7 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
>          /* The page offset of vmf->address within the VMA. */
>          pgoff_t vma_off = vmf->pgoff - vmf->vma->vm_pgoff;
>          pgoff_t from_pte, to_pte;
> +        vm_fault_t ret;
>
>          /* The PTE offset of the start address, clamped to the VMA. */
>          from_pte = max(ALIGN_DOWN(pte_off, nr_pages),
> @@ -4476,9 +4477,13 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
>                          return VM_FAULT_OOM;
>          }
>
> -        return vmf->vma->vm_ops->map_pages(vmf,
> -                vmf->pgoff + from_pte - pte_off,
> -                vmf->pgoff + to_pte - pte_off);
> +        rcu_read_lock();
> +        ret = vmf->vma->vm_ops->map_pages(vmf,
> +                        vmf->pgoff + from_pte - pte_off,
> +                        vmf->pgoff + to_pte - pte_off);
> +        rcu_read_unlock();
> +
> +        return ret;

Doesn't this mean that the rcu_read_lock/unlock can be removed from
filemap_map_pages()? i.e. all callers are now already under
rcu_read_lock(). Maybe WARN_ON_ONCE(!rcu_read_lock_held()) could be
put in filemap_map_pages() if you are worried about callers not
holding it...

Otherwise it looks fine.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
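
For illustration, the suggestion above could look roughly like the
following. This is a userspace sketch only, not the actual mm/filemap.c
change: the RCU helpers and the warn-once helper are mocked stand-ins,
and the *_sketch function names, the int return type and the counters
are assumptions made up for this example.

```c
/* Userspace sketch: every kernel primitive below is a mocked stand-in. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int rcu_depth;   /* mock: nesting depth of RCU read-side sections */
static int warn_count;  /* mock: number of warnings actually printed */
static bool warned;     /* mock: "once" state for the single call site */

static void rcu_read_lock(void) { rcu_depth++; }

static void rcu_read_unlock(void)
{
    assert(rcu_depth > 0);  /* catch unbalanced unlocks in the mock */
    rcu_depth--;
}

static bool rcu_read_lock_held(void) { return rcu_depth > 0; }

/* Hypothetical stand-in for WARN_ON_ONCE(): warn the first time only. */
static bool mock_warn_on_once(bool cond, const char *what)
{
    if (cond && !warned) {
        warned = true;
        warn_count++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/*
 * Stand-in for filemap_map_pages(): no rcu_read_lock()/rcu_read_unlock()
 * of its own; it only checks that the caller already holds the lock.
 */
static int filemap_map_pages_sketch(void)
{
    mock_warn_on_once(!rcu_read_lock_held(),
                      "map_pages called without rcu_read_lock()");
    /* ... the page cache walk and PTE setup would go here ... */
    return 0;
}

/* Stand-in for do_fault_around() after the quoted patch. */
static int do_fault_around_sketch(void)
{
    int ret;

    rcu_read_lock();
    ret = filemap_map_pages_sketch();
    rcu_read_unlock();

    return ret;
}
```

Calling do_fault_around_sketch() stays silent, while calling
filemap_map_pages_sketch() directly warns once and then stays quiet on
later calls. Note that in the real kernel rcu_read_lock_held() is a
lockdep-based check, so it only detects a missing lock on kernels built
with lock debugging enabled.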