I noticed this path while I was doing the 4.19 backport of
"dax: Avoid losing wakeup in dax_lock_mapping_entry":

	xa_unlock_irq(&mapping->i_pages);
	revalidate = wait_fn();
	finish_wait(wq, &ewait.wait);
	xa_lock_irq(&mapping->i_pages);

It's not safe to call xa_lock_irq() if the mapping may have been freed
while we slept.  We'll probably get away with it; most filesystems use
a unique slab for their inodes, so you'll likely get either a freed
inode or an inode which is now the wrong inode.  But if that page has
been freed back to the page allocator, that pointer could now be
pointing at anything.

Fixing this in the current codebase is no easier than fixing it in the
4.19 codebase.  This is the best I've come up with.  Could we do better
by not using the _exclusive form of prepare_to_wait()?  I'm not
familiar with all the things that need to be considered when using this
family of interfaces.  (A sketch of that alternative follows the diff.)

diff --git a/fs/dax.c b/fs/dax.c
index 9bcce89ea18e..154b592b18eb 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -232,6 +232,24 @@ static void *get_unlocked_entry(struct xa_state *xas)
 	}
 }
 
+static void wait_unlocked_entry(struct xa_state *xas, void *entry)
+{
+	struct wait_exceptional_entry_queue ewait;
+	wait_queue_head_t *wq;
+
+	init_wait(&ewait.wait);
+	ewait.wait.func = wake_exceptional_entry_func;
+
+	wq = dax_entry_waitqueue(xas, entry, &ewait.key);
+	prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
+	xas_unlock_irq(xas);
+	/* We can no longer look at xas */
+	schedule();
+	finish_wait(wq, &ewait.wait);
+	if (waitqueue_active(wq))
+		__wake_up(wq, TASK_NORMAL, 1, &ewait.key);
+}
+
 static void put_unlocked_entry(struct xa_state *xas, void *entry)
 {
 	/* If we were the only waiter woken, wake the next one */
@@ -389,9 +407,7 @@ bool dax_lock_mapping_entry(struct page *page)
 		entry = xas_load(&xas);
 		if (dax_is_locked(entry)) {
 			rcu_read_unlock();
-			entry = get_unlocked_entry(&xas);
-			xas_unlock_irq(&xas);
-			put_unlocked_entry(&xas, entry);
+			wait_unlocked_entry(&xas, entry);
 			rcu_read_lock();
 			continue;
 		}
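
For discussion, here is a minimal sketch of the non-exclusive
alternative mentioned above.  It is an untested illustration, not part
of the patch: wait_unlocked_entry_nonexcl() is a hypothetical name, and
it assumes the waker side keeps calling __wake_up() the way
dax_wake_entry() does today.  With prepare_to_wait(), a wake on the
queue wakes every matching non-exclusive waiter regardless of
nr_exclusive, so the hand-off re-wake after finish_wait() would not be
needed; the cost is a thundering herd when many threads wait on the
same entry.

	static void wait_unlocked_entry_nonexcl(struct xa_state *xas, void *entry)
	{
		struct wait_exceptional_entry_queue ewait;
		wait_queue_head_t *wq;

		init_wait(&ewait.wait);
		ewait.wait.func = wake_exceptional_entry_func;

		wq = dax_entry_waitqueue(xas, entry, &ewait.key);
		/* Non-exclusive: one __wake_up() wakes all matching waiters */
		prepare_to_wait(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
		xas_unlock_irq(xas);
		/* We can no longer look at xas */
		schedule();
		finish_wait(wq, &ewait.wait);
		/* No re-wake needed: no wakeup was consumed exclusively */
	}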