On Mon, Jul 19, 2021 at 03:43:00AM +0100, Matthew Wilcox wrote: > On Mon, Jul 19, 2021 at 10:24:18AM +0800, Zhouyi Zhou wrote: > > Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious place > > that could possibly trigger that problem: > > > > struct swap_info_struct *get_swap_device(swp_entry_t entry) > > { > > struct swap_info_struct *si; > > unsigned long offset; > > > > if (!entry.val) > > goto out; > > si = swp_swap_info(entry); > > if (!si) > > goto bad_nofile; > > > > rcu_read_lock(); > > if (data_race(!(si->flags & SWP_VALID))) > > goto unlock_out; > > offset = swp_offset(entry); > > if (offset >= si->max) > > goto unlock_out; > > > > return si; > > bad_nofile: > > pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val); > > out: > > return NULL; > > unlock_out: > > rcu_read_unlock(); > > return NULL; > > } > > I guess the function "return si" without a rcu_read_unlock. > > Yes, but the caller is supposed to call put_swap_device() which > calls rcu_read_unlock(). See commit eb085574a752. Right, but we need to make sure there is no sleepable function called before put_swap_device() called, and the call trace showed the following happened: do_swap_page(): si = get_swap_device(): rcu_read_lock(); lock_page_or_retry(): might_sleep(); // call a sleepable function inside RCU read-side c.s. __lock_page_or_retry(): wait_on_page_bit_common(): schedule(): rcu_note_context_switch(); // Warn here put_swap_device(); rcu_read_unlock(); , which introduced by commit 2799e77529c2a [Copy the author] Regards, Boqun