On Thu, Jun 27, 2019 at 12:59 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Thu, Jun 27, 2019 at 12:09:29PM -0700, Dan Williams wrote:
> > > This bug feels like we failed to unlock, or unlocked the wrong entry
> > > and this hunk in the bisected commit looks suspect to me. Why do we
> > > still need to drop the lock now that the radix_tree_preload() calls
> > > are gone?
> >
> > Nevermind, unmap_mapping_pages() takes a sleeping lock, but then I
> > wonder why we don't restart the lookup like the old implementation.
>
> We have the entry locked:
>
>         /*
>          * Make sure 'entry' remains valid while we drop
>          * the i_pages lock.
>          */
>         dax_lock_entry(xas, entry);
>
>         /*
>          * Besides huge zero pages the only other thing that gets
>          * downgraded are empty entries which don't need to be
>          * unmapped.
>          */
>         if (dax_is_zero_entry(entry)) {
>                 xas_unlock_irq(xas);
>                 unmap_mapping_pages(mapping,
>                                 xas->xa_index & ~PG_PMD_COLOUR,
>                                 PG_PMD_NR, false);
>                 xas_reset(xas);
>                 xas_lock_irq(xas);
>         }
>
> If something can remove a locked entry, then that would seem like the
> real bug. Might be worth inserting a lookup there to make sure that it
> hasn't happened, I suppose?

Nope. I added a check, and we do in fact get the same locked entry back
after dropping the lock.

The deadlock revolves around the mmap_sem. One thread holds it for
read and then gets stuck indefinitely in get_unlocked_entry(). Once
that happens, another rocksdb thread tries to mmap and gets stuck
trying to take the mmap_sem for write. All new readers, including ps
and top when they try to access a remote vma, then get queued behind
that writer.

It could also be the case that we're missing a wakeup.
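
To make the mmap_sem pile-up above concrete, here is a rough userspace
model of the queueing. It is only a sketch and not the kernel's rwsem
implementation: struct toy_rwsem and the toy_*() helpers are
hypothetical names made up for illustration, and the one rule modelled
is the assumption that a new reader is not allowed to pass a writer
that is already waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model of a fair reader/writer semaphore; NOT the kernel's rwsem. */
struct toy_rwsem {
	int active_readers;	/* readers currently holding the lock */
	bool writer_active;	/* a writer currently holds the lock */
	int queued_writers;	/* writers waiting for the readers to drain */
	int queued_readers;	/* readers parked behind a waiting writer */
};

/* Returns true if the read lock is granted, false if the caller queues. */
static bool toy_down_read(struct toy_rwsem *sem)
{
	/* Assumed fairness rule: a new reader may not pass a waiting writer. */
	if (sem->writer_active || sem->queued_writers > 0) {
		sem->queued_readers++;
		return false;
	}
	sem->active_readers++;
	return true;
}

/* Returns true if the write lock is granted, false if the caller queues. */
static bool toy_down_write(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->active_readers > 0) {
		sem->queued_writers++;
		return false;
	}
	sem->writer_active = true;
	return true;
}

/*
 * Replay the sequence from the report and return how many tasks end up
 * blocked. The first reader never releases, standing in for the thread
 * stuck in get_unlocked_entry().
 */
static int toy_replay(struct toy_rwsem *sem)
{
	int blocked = 0;
	bool got_read = toy_down_read(sem);	/* fault path, holds for read */

	assert(got_read);
	(void)got_read;

	blocked += !toy_down_write(sem);	/* rocksdb thread calling mmap */
	blocked += !toy_down_read(sem);		/* ps touching a remote vma */
	blocked += !toy_down_read(sem);		/* top touching a remote vma */

	printf("blocked=%d active_readers=%d queued_writers=%d queued_readers=%d\n",
	       blocked, sem->active_readers, sem->queued_writers,
	       sem->queued_readers);
	return blocked;
}
```

Calling toy_replay() on a zero-initialized struct toy_rwsem leaves one
active reader with the writer and both later readers queued. In this
model nothing makes progress until the first reader drops the lock,
which matches the symptom whether the root cause is a missed unlock or
a missed wakeup.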