[CC RCU and Frederic and Matthew]

Hi Paul

After several days of experiments on my Thinkpad P1 gen 4 (Intel 11800H,
8 cores, 16 threads), I found that it is CONFIG_NFS_V4 that makes the
difference. If I remove CONFIG_NFS_V4 from TASKS01, the probability of
triggering the warning is significantly increased!
And yes, there is no CONFIG_NFS_V4 in Matthew's original email.

This is very interesting. After debugging the kernel, I found that
init_nfs_v4 does a lot of work; does it lend a grace period to the test?

I am very happy to continue researching this topic ;-)

Thank you for your guidance!

Thanks
Zhouyi

On Tue, Mar 22, 2022 at 8:31 PM Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote:
>
> Dear Frederic
>
> I may not be right, please correct me if so
>
> On Tue, Mar 22, 2022 at 6:49 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
> >
> > On Mon, Mar 21, 2022 at 08:57:46AM -0700, Paul E. McKenney wrote:
> > > On Mon, Mar 21, 2022 at 11:46:28PM +0800, Zhouyi Zhou wrote:
> > > > Hi Paul and Willy
> > > >
> > > > I can reproduce the bug. Following is what I do:
> > > > 1.1 git clone https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git
> > > > 1.2 cd linux
> > > > 1.3 cp http://154.223.142.244/20220321/config-20220321 to .config
> > > > 1.4 make vmlinux -j 16
> > > > 1.5 kvm -smp 4 -net none -serial file:/tmp/console.log -m 512
> > > > -kernel vmlinux -append "console=ttyS0"
> > > > 1.6 the /tmp/console.log is uploaded to
> > > > http://154.223.142.244/20220321/console.log
> > > >
> > > > 2.1 wget https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.17-rc6.tar.gz
> > > > 2.2 - 2.6 the result is similar.
> > > >
> > > > I am very interested in this topic.
> > > > Could you please give me about a week to make a full understand of the
> > > > meaning of the warning, and try the fixes one by one, and
> > > > find out what happens?
> > >
> > > Works for me! The eventual fix likely involves some version of Valentin
> > > Schneider's patchset that provides APIs that allow RCU to detect the
> > > current preemption state. Which can change at runtime. A prototype of
> > > this patch is on -rcu here:
> > >
> > > 2436ee0b4cea ("EXP preempt/dynamic: Introduce preempt mode accessors")
> > >
> > > It is entirely possible that the fix might need to go to mainline sooner
> > > rather than later.
> >
> > I guess it's possible to do that but that patch alone shouldn't fix anything.
> > Also, what is the issue exactly? :-)
> The issue is with certain kernel configuration
> (http://154.223.142.244/20220321/config-2022032), the mainline (and
> -rcu ) kernel will warn
> "call_rcu_tasks() has been failed" in rcu_tasks_verify_self_tests.
> (http://154.223.142.244/20220321/console.log)
>
> Kind Regard
> Zhouyi
> > I can't rewind far enough the conversation.
> >
> > Thanks.