On Wed, Aug 9, 2023 at 11:31 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Wed, Aug 9, 2023 at 11:08 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> >
> > On Wed, Aug 9, 2023 at 11:04 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> > >
> > > >>>> Which ends up being
> > > >>>>
> > > >>>> VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
> > > >>>>
> > > >>>> I did not check if this is also the case on mainline, and if this series is responsible.
> > > >>>
> > > >>> Thanks for reporting! I'm checking it now.
> > > >>
> > > >> Hmm. From the code it's not obvious how lock_mm_and_find_vma() ends up
> > > >> calling find_vma() without mmap_lock after successfully completing
> > > >> get_mmap_lock_carefully(). lock_mm_and_find_vma+0x3f/0x270 points to
> > > >> the first invocation of find_vma(), so this is not even the lock
> > > >> upgrade path... I'll try to reproduce this issue and dig up more but
> > > >> from the information I have so far this issue does not seem to be
> > > >> related to this series.
> > > >
> > > I just checked on mainline and it does not fail there.
>
> Thanks. Just to eliminate the possibility, I'll try reverting my
> patchset in mm-unstable and will try the test again. Will do that in
> the evening once I'm home.
>
> > >
> > > >
> > > > This is really weird. I added mmap_assert_locked(mm) calls into
> > > > get_mmap_lock_carefully() right after we acquire mmap_lock read lock
> > > > and one of them triggers right after successful
> > > > mmap_read_lock_killable(). Here is my modified version of
> > > > get_mmap_lock_carefully():
> > > >
> > > > static inline bool get_mmap_lock_carefully(struct mm_struct *mm,
> > > > struct pt_regs *regs) {
> > > >         /* Even if this succeeds, make it clear we might have slept */
> > > >         if (likely(mmap_read_trylock(mm))) {
> > > >                 might_sleep();
> > > >                 mmap_assert_locked(mm);
> > > >                 return true;
> > > >         }
> > > >         if (regs && !user_mode(regs)) {
> > > >                 unsigned long ip = instruction_pointer(regs);
> > > >                 if (!search_exception_tables(ip))
> > > >                         return false;
> > > >         }
> > > >         if (!mmap_read_lock_killable(mm)) {
> > > >                 mmap_assert_locked(mm); <---- generates a BUG
> > > >                 return true;
> > > >         }
> > > >         return false;
> > > > }
> > >
> > > Ehm, that's indeed weird.
> > >
> > > >
> > > > AFAIKT conditions for mmap_read_trylock() and
> > > > mmap_read_lock_killable() are checked correctly. Am I missing
> > > > something?
> > >
> > > Weirdly enough, it only triggers during that specific uffd test, right?
> >
> > Yes, uffd-unit-tests. I even ran it separately to ensure it's not some
> > fallback from a previous test and I'm able to reproduce this
> > consistently.

Yeah, it is somehow related to per-vma locking. Unfortunately I can't
reproduce the issue on my VM, so I have to use my host and bisection
is slow. I think I'll get to the bottom of this tomorrow.

> >
> > >
> > > --
> > > Cheers,
> > >
> > > David / dhildenb
> > >