Re: Selftest failures related to kern_sync_rcu()

On Wed, Apr 14, 2021 at 09:18:09PM +0200, Toke Høiland-Jørgensen wrote:
> "Paul E. McKenney" <paulmck@xxxxxxxxxx> writes:
> 
> > On Wed, Apr 14, 2021 at 08:39:04PM +0200, Toke Høiland-Jørgensen wrote:
> >> "Paul E. McKenney" <paulmck@xxxxxxxxxx> writes:
> >> 
> >> > On Wed, Apr 14, 2021 at 10:59:23AM -0700, Alexei Starovoitov wrote:
> >> >> On Wed, Apr 14, 2021 at 10:52 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >> >> >
> >> >> > > > > >                 if (num_online_cpus() > 1)
> >> >> > > > > >                         synchronize_rcu();
> >> >> >
> >> >> > In CONFIG_PREEMPT_NONE=y and CONFIG_PREEMPT_VOLUNTARY=y kernels, this
> >> >> > synchronize_rcu() will be a no-op anyway due to there only being the
> >> >> > one CPU.  Or are these failures all happening in CONFIG_PREEMPT=y kernels,
> >> >> > and in tests where preemption could result in the observed failures?
> >> >> >
> >> >> > Could you please send your .config file, or at least the relevant portions
> >> >> > of it?
> >> >> 
> >> >> That's my understanding as well. I assumed Toke has preempt=y.
> >> >> Otherwise the whole thing needs to be root caused properly.
> >> >
> >> > Given that there is only a single CPU, I am still confused about what
> >> > the tests are expecting the membarrier() system call to do for them.
> >> 
> >> It's basically a proxy for waiting until the objects are freed on the
> >> kernel side, as far as I understand...
> >
> > There are in-kernel objects that are freed via call_rcu(), and the idea
> > is to wait until these objects really are freed?  Or am I still missing
> > out on what is going on?
> 
> Something like that? Although I'm not actually sure these are using
> call_rcu()? One of them needs __put_task_struct() to run, and the other
> waits for map freeing, with this comment:
> 
> 
> 	/* we need to either wait for or force synchronize_rcu(), before
> 	 * checking for "still exists" condition, otherwise map could still be
> 	 * resolvable by ID, causing false positives.
> 	 *
> 	 * Older kernels (5.8 and earlier) freed map only after two
> 	 * synchronize_rcu()s, so trigger two, to be entirely sure.
> 	 */
> 	CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");
> 	CHECK(kern_sync_rcu(), "sync_rcu", "failed\n");

OK, so the issue is that the membarrier() system call is designed to force
ordering only within a user process, and you need it in the kernel.

Setting aside my puzzlement as to why the membarrier() system call
doesn't do it for you on a CONFIG_PREEMPT_NONE=y system, this brings
us back to the question Alexei asked me in the first place: what is the
best way to invoke an in-kernel synchronize_rcu() from userspace?

You guys gave some reasonable examples.  Here are a few others:

o	Bring a CPU online, then force it offline, or vice versa.
	But in this case, sys_membarrier() would do what you need
	given more than one CPU.

o	Use the membarrier() system call, but require that the tests
	run on systems with at least two CPUs.

o	Create a kernel module whose init function does a
	synchronize_rcu() and then returns failure.  This will
	avoid the overhead of removing that kernel module.

o	Create a sysfs or debugfs interface that does a
	synchronize_rcu().
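
As a rough sketch of the second option above (not the selftests'
actual helper), a userspace test could check the online CPU count
before relying on membarrier().  The helper name sync_rcu_or_skip()
and the use of ENOTSUP as the "skip" signal are assumptions made purely
for illustration.  MEMBARRIER_CMD_SHARED is used on the understanding,
taken from the code quoted earlier in this thread, that it only reaches
synchronize_rcu() when more than one CPU is online.  Linux only.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Number of CPUs currently online, as seen from userspace. */
static long online_cpus(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}

/*
 * Hypothetical helper: ask the kernel for a grace period via
 * membarrier(), but refuse to pretend it worked on a single-CPU
 * system, where the quoted num_online_cpus() check skips
 * synchronize_rcu().
 *
 * Returns 0 on success.  Returns -1 with errno == ENOTSUP (our own
 * convention, not the kernel's) when fewer than two CPUs are online,
 * or -1 with the syscall's errno if membarrier() itself fails.
 */
static int sync_rcu_or_skip(void)
{
	if (online_cpus() < 2) {
		errno = ENOTSUP;
		return -1;
	}
	return (int)syscall(__NR_membarrier, MEMBARRIER_CMD_SHARED, 0, 0);
}
```

A test would then treat ENOTSUP as "skip this check" rather than as
a failure, which keeps single-CPU runs from reporting false positives.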

But I am still concerned that you are needing more than synchronize_rcu()
can do.  Otherwise, the membarrier() system call would work just fine
on a single CPU on your CONFIG_PREEMPT_VOLUNTARY=y kernel.

							Thanx, Paul