Re: Fw: rc6 splat

[CC RCU and Frederic and Matthew]

Hi Paul

After several days of experiments on my ThinkPad P1 Gen 4 (Intel 11800H,
8 cores, 16 threads), I found that CONFIG_NFS_V4 is what makes the
difference.
If I remove CONFIG_NFS_V4 from TASKS01, the probability of triggering
the warning increases significantly!
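
One way to put a number on "probability of triggering" is to boot each
configuration many times and count how many console logs contain the
warning. The following is only a minimal sketch under assumptions: the
function names are hypothetical, and the matched string is taken from the
warning text quoted later in this thread, so the exact wording in a given
console log may differ.

```python
# Substring of the self-test warning as quoted in this thread
# (assumption: the console log contains it verbatim).
WARNING_TEXT = "call_rcu_tasks() has been failed"


def log_has_warning(console_text):
    """Return True if one boot's console output contains the warning."""
    return WARNING_TEXT in console_text


def trigger_rate(console_logs):
    """Fraction of boots whose console output contains the warning.

    console_logs is an iterable of strings, one per boot.
    """
    logs = list(console_logs)
    if not logs:
        raise ValueError("need at least one console log")
    hits = sum(1 for text in logs if log_has_warning(text))
    return hits / len(logs)
```

Comparing trigger_rate() for logs collected with and without
CONFIG_NFS_V4 would make the difference concrete, provided each
configuration is booted enough times.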

And yes, there is no CONFIG_NFS_V4 in Matthew's original email.

This is very interesting. After debugging the kernel, I found that
init_nfs_v4 does a lot of work. Did it lend the test a grace period?
I am very happy to continue researching this topic ;-)

Thank you for your guidance!

Thanks
Zhouyi

On Tue, Mar 22, 2022 at 8:31 PM Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote:
>
> Dear Frederic
>
> I may not be right; please correct me if I am wrong.
>
> On Tue, Mar 22, 2022 at 6:49 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
> >
> > On Mon, Mar 21, 2022 at 08:57:46AM -0700, Paul E. McKenney wrote:
> > > On Mon, Mar 21, 2022 at 11:46:28PM +0800, Zhouyi Zhou wrote:
> > > > Hi Paul and Willy
> > > >
> > > > I can reproduce the bug. Here is what I did:
> > > > 1.1 git clone https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git
> > > > 1.2 cd linux
> > > > 1.3 wget -O .config http://154.223.142.244/20220321/config-20220321
> > > > 1.4 make vmlinux -j 16
> > > > 1.5 kvm -smp 4 -net none     -serial file:/tmp/console.log -m 512
> > > > -kernel vmlinux -append "console=ttyS0"
> > > > 1.6 the /tmp/console.log is uploaded to
> > > > http://154.223.142.244/20220321/console.log
> > > >
> > > > 2.1 wget https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.17-rc6.tar.gz
> > > > 2.2 - 2.6 the result is similar.
> > > >
> > > > I am very interested in this topic.
> > > > Could you please give me about a week to fully understand the
> > > > meaning of the warning, try the fixes one by one, and
> > > > find out what happens?
> > >
> > > Works for me!  The eventual fix likely involves some version of Valentin
> > > Schneider's patchset that provides APIs that allow RCU to detect the
> > > current preemption state.  Which can change at runtime.  A prototype of
> > > this patch is on -rcu here:
> > >
> > > 2436ee0b4cea ("EXP preempt/dynamic: Introduce preempt mode accessors")
> > >
> > > It is entirely possible that the fix might need to go to mainline sooner
> > > rather than later.
> >
> > I guess it's possible to do that but that patch alone shouldn't fix anything.
> > Also, what is the issue exactly? :-)
> The issue is that, with a certain kernel configuration
> (http://154.223.142.244/20220321/config-2022032), the mainline (and
> -rcu) kernel warns
> "call_rcu_tasks() has been failed" in rcu_tasks_verify_self_tests.
> (http://154.223.142.244/20220321/console.log)
>
> Kind Regards
> Zhouyi
> > I can't rewind the conversation far enough.
> >
> > Thanks.


