On 2021/7/19 18:14, Boqun Feng wrote: > On Mon, Jul 19, 2021 at 03:43:00AM +0100, Matthew Wilcox wrote: >> On Mon, Jul 19, 2021 at 10:24:18AM +0800, Zhouyi Zhou wrote: >>> Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious place >>> that could possibly trigger that problem: >>> >>> struct swap_info_struct *get_swap_device(swp_entry_t entry) >>> { >>> struct swap_info_struct *si; >>> unsigned long offset; >>> >>> if (!entry.val) >>> goto out; >>> si = swp_swap_info(entry); >>> if (!si) >>> goto bad_nofile; >>> >>> rcu_read_lock(); >>> if (data_race(!(si->flags & SWP_VALID))) >>> goto unlock_out; >>> offset = swp_offset(entry); >>> if (offset >= si->max) >>> goto unlock_out; >>> >>> return si; >>> bad_nofile: >>> pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val); >>> out: >>> return NULL; >>> unlock_out: >>> rcu_read_unlock(); >>> return NULL; >>> } >>> I guess the function "return si" without a rcu_read_unlock. >> >> Yes, but the caller is supposed to call put_swap_device() which >> calls rcu_read_unlock(). See commit eb085574a752. > > Right, but we need to make sure there is no sleepable function called > before put_swap_device() called, and the call trace showed the following > happened: > > do_swap_page(): > si = get_swap_device(): > rcu_read_lock(); > lock_page_or_retry(): > might_sleep(); // call a sleepable function inside RCU read-side c.s. > __lock_page_or_retry(): > wait_on_page_bit_common(): > schedule(): > rcu_note_context_switch(); > // Warn here > put_swap_device(); > rcu_read_unlock(); > > , which introduced by commit 2799e77529c2a When in the commit 2799e77529c2a, we're using the percpu_ref to serialize against concurrent swapoff, i.e. there's percpu_ref inside get_swap_device() instead of rcu_read_lock(). Please see commit 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against concurrent swapoff") for detail. Thanks. > > [Copy the author] > > Regards, > Boqun > > . >