On Mon, Jul 19, 2021 at 9:53 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Sun, Jul 18, 2021 at 11:51:36PM +0100, Matthew Wilcox wrote:
> > On Sun, Jul 18, 2021 at 02:59:14PM -0700, Paul E. McKenney wrote:
> > > > > https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1kEHLHA@xxxxxxxxxxxxxx/
> > >
> > > But this one does show this warning in v5.12.17:
> > >
> > >	WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0);
> > >
> > > This is in rcu_note_context_switch(), and could be caused by something
> > > like a schedule() within an RCU read-side critical section. This would
> > > of course be RCU-usage bugs, given that you are not permitted to block
> > > within an RCU read-side critical section.
> > >
> > > I suggest checking the functions in the stack trace to see where the
> > > rcu_read_lock() is hiding. CONFIG_PROVE_LOCKING might also be helpful.
> >
> > I'm not sure I see it in this stack trace.
> >
> > Is it possible that there's something taking the rcu read lock in an
> > interrupt handler, then returning from the interrupt handler without
> > releasing the rcu lock? Do we have debugging that would fire if
> > somebody did this?
>
> Lockdep should complain, but in the absence of lockdep I don't know
> that anything would gripe in this situation.

I think Lockdep should complain.
Meanwhile, I examined 5.12.17 by eye and found a suspicious place
that could possibly trigger that problem:

struct swap_info_struct *get_swap_device(swp_entry_t entry)
{
	struct swap_info_struct *si;
	unsigned long offset;

	if (!entry.val)
		goto out;
	si = swp_swap_info(entry);
	if (!si)
		goto bad_nofile;

	rcu_read_lock();
	if (data_race(!(si->flags & SWP_VALID)))
		goto unlock_out;
	offset = swp_offset(entry);
	if (offset >= si->max)
		goto unlock_out;

	return si;
bad_nofile:
	pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val);
out:
	return NULL;
unlock_out:
	rcu_read_unlock();
	return NULL;
}

I guess the problem is that the function reaches "return si" without
calling rcu_read_unlock(), so the success path returns with the RCU
read lock still held. However, get_swap_device() has changed in the
mainline tree, and there is no rcu_read_lock() there anymore.

>
> Also, this is a preemptible kernel, so it is possible to trace
> __rcu_read_lock(), if that helps.
>
>							Thanx, Paul

Thanx
Zhouyi
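
For illustration, the pattern discussed above can be sketched as a small
userspace model. This is not kernel code: every model_* name below is
hypothetical, the counter merely stands in for rcu_preempt_depth(), and
the check only mirrors the condition in the quoted WARN_ON_ONCE(). It
shows that a lookup with the same shape as get_swap_device() leaves the
read-side critical section open on success, so a voluntary context
switch before the matching release would satisfy the warning condition.

```c
/* Userspace model only: none of the model_* names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stands in for rcu_preempt_depth(); an assumption of this sketch. */
static int model_rcu_depth;

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);	/* catch an unbalanced unlock */
	model_rcu_depth--;
}

struct model_swap_info {
	bool valid;
};

/* Same shape as get_swap_device(): failure paths unlock, success does not. */
static struct model_swap_info *model_get_swap_device(struct model_swap_info *si)
{
	if (!si)
		return NULL;

	model_rcu_read_lock();
	if (!si->valid) {
		model_rcu_read_unlock();
		return NULL;
	}

	return si;	/* read-side critical section is still open here */
}

/* Release helper of this model; the caller must use it before blocking. */
static void model_put_swap_device(struct model_swap_info *si)
{
	(void)si;
	model_rcu_read_unlock();
}

/*
 * Mirrors the condition !preempt && rcu_preempt_depth() > 0:
 * true when a voluntary context switch would trip the warning.
 */
static bool model_context_switch_would_warn(bool preempt)
{
	return !preempt && model_rcu_depth > 0;
}
```

In this model, calling model_context_switch_would_warn(false) between a
successful model_get_swap_device() and model_put_swap_device() returns
true, while the same call after the release returns false.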