On Fri, Oct 21, 2022 at 05:00:24PM -0700, Ira Weiny wrote:
> On Fri, Oct 21, 2022 at 03:48:57PM -0700, Ira wrote:
> > On Fri, Oct 21, 2022 at 01:30:41PM -0700, Andrew Morton wrote:
> > > On Fri, 21 Oct 2022 14:09:16 +0100 Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Fri, Oct 21, 2022 at 12:10:17PM +0800, kernel test robot wrote:
> > > > > FYI, we noticed WARNING:possible_recursive_locking_detected due to commit (built with gcc-11):
> > > > >
> > > > > commit: 7a7256d5f512b6c17957df7f59cf5e281b3ddba3 ("shmem: convert shmem_mfill_atomic_pte() to use a folio")
> > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > >
> > > > Ummm.  Looks to me like this now occurs because of this part of the
> > > > change:
> > > >
> > > > 		if (!zeropage) {	/* COPY */
> > > > -			page_kaddr = kmap_atomic(page);
> > > > +			page_kaddr = kmap_local_folio(folio, 0);
> > > > 			ret = copy_from_user(page_kaddr,
> > > > 					     (const void __user *)src_addr,
> > > > 					     PAGE_SIZE);
> > > > -			kunmap_atomic(page_kaddr);
> > > > +			kunmap_local(page_kaddr);
> > > >
> > > > Should I be using __copy_from_user_inatomic() here?
> > I would say not.  I'm curious why copy_from_user() was safe (at least did not
> > fail the checkers).  :-/
> >
> > > >
> > > Caller __mcopy_atomic() is holding mmap_read_lock(dst_mm) and this
> > > copy_from_user() calls
> > > might_fault()->might_lock_read(current->mm->mmap_lock).
> > >
> > > And I guess might_lock_read() gets upset because we're holding another
> > > mm's mmap_lock.  Which sounds OK to me, unless a concurrent
> > > mmap_write_lock() could jam things up.
> > >
> > > But I cannot see why your patch would suddenly trigger this warning -
> > > kmap_local_folio() and kmap_atomic() are basically the same thing.
> >
> > It is related to your patch but I think what you did made sense on the surface.
> >
> > On the surface copy_from_user() should not require pagefaults to be disabled.
> > But that side effect of kmap_atomic() was being used here because it looks like
> > the code is designed to fall back if the fault was not allowed:[1]
> >
> > mm/shmem.c
> > ...
> > 	page_kaddr = kmap_local_folio(folio, 0);
> > 	ret = copy_from_user(page_kaddr,
> > 			     (const void __user *)src_addr,
> > 			     PAGE_SIZE);
> > 	kunmap_local(page_kaddr);
> >
> > 	/* fallback to copy_from_user outside mmap_lock */
> > 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 	if (unlikely(ret)) {
> > 		*pagep = &folio->page;
> > 		ret = -ENOENT;
> > 		/* don't free the page */
> > 		goto out_unacct_blocks;
> > 	}
> > ...
> >
> > So this is one of those rare places where the kmap_atomic() side effects were
> > being depended on...  :-(
> >
> > [1] might_fault() does not actually mean the code completes the fault.
> >
> > mm/memory.c
> > ...
> > void __might_fault(const char *file, int line)
> > {
> > 	if (pagefault_disabled())
> > 		return;
> > ...
> >
> > >
> > > I see that __mcopy_atomic() is using plain old kmap(), perhaps to work
> > > around this?  But that's 2015 code and I'm not sure we had such
> > > detailed lock checking in those days.
> >
> > No, kmap() can't work around this.  That works because the lock is released just
> > above that.
> >
> > mm/userfaultfd.c
> > ...
> > 	mmap_read_unlock(dst_mm);
> > 	BUG_ON(!page);
> >
> > 	page_kaddr = kmap(page);
> > 	err = copy_from_user(page_kaddr,
> > 			     (const void __user *) src_addr,
> > 			     PAGE_SIZE);
> > 	kunmap(page);
> > ...
> >
> > So I think the correct solution is below because we want to prevent the page
> > fault.
>
> I was about to get this patch ready to send when I found this:
>
> commit b6ebaedb4cb1a18220ae626c3a9e184ee39dd248
> Author: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Date:   Fri Sep 4 15:47:08 2015 -0700
>
>     userfaultfd: avoid mmap_sem read recursion in mcopy_atomic
>
>     If the rwsem starves writers it wasn't strictly a bug but lockdep
>     doesn't like it and this avoids depending on lowlevel implementation
>     details of the lock.
>
>     [akpm@xxxxxxxxxxxxxxxxxxxx: delete weird BUILD_BUG_ON()]
>     Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
>     Acked-by: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
> ...
>
> So I wonder if the true fix is something to lockdep?

I think lockdep used to complain because we can end up taking the same
mmap_sem twice in this case (the second time during the page fault on the
user address).

So to answer the other question: yes, current->mm and dst_mm can definitely
be the same mm in this context.

>
> Regardless I'll send the below patch because it will restore things to a
> working order.
>
> But I'm CC'ing Andrea for comments.

Open-coding the pagefault disabling sounds okay to me.  pagefault_disable()
used to cover the kmap procedure too, as done in kmap_atomic(), but frankly
I don't know whether that makes a real difference.

Yeah, let's see whether we can get a confirmation from Andrea.

Thanks,

-- 
Peter Xu