Re: [PATCH 1/2] torture: use for_each_present() loop in torture_online_all()

On Thu, Nov 17, 2022 at 07:06:37AM -0800, Paul E. McKenney wrote:
> On Thu, Nov 17, 2022 at 07:30:32AM +0100, Sven Schnelle wrote:
> > Hi Paul,
> > 
> > "Paul E. McKenney" <paulmck@xxxxxxxxxx> writes:
> > 
> > >> > Yes, rcutorture has lower-level checks for CPUs being hotplugged
> > >> > behind its back.  Which might be sufficient.  But this patch is in
> > >> > response to something bad happening if the CPU is also not present in
> > >> > the cpu_present_mask.  Would that same bad thing happen if rcutorture saw
> > >> > the CPU in cpu_online_mask, but by the time it attempted to CPU-hotplug
> > >> > it, that CPU was gone not just from cpu_online_mask, but also from
> > >> > cpu_present_mask?
> > >> >
> > >> > Or are CPUs never removed from cpu_present_mask?
> > >> 
> > >> In the current implementation CPUs can only be added to the
> > >> cpu_present_mask, but never removed. This might change in the future
> > >> when we get support from firmware for that, but the current s390 code
> > >> doesn't do that.
> > >
> > > Very good!
> > >
> > > Then could the patch please check that bits are never removed?
> > > That way the code will complain should firmware support be added.
> > >
> > > 							Thanx, Paul
> > 
> > I'm not sure whether I fully understand that. If the CPU could
> > be removed from the system and the cpu_present_mask, that could
> > happen at any time. So I don't see how we should check for that?
> 
> Well, that is my question to you.  ;-)
> 
> Suppose we have the following sequence of events:
> 
> o	rcutorture sees that CPU 5 is in cpu_present_mask, but offline.
> 
> o	rcutorture therefore decides to online CPU 5.
> 
> o	s390 firmware removes CPU 5, and s390 architecture code then
> 	clears it from the cpu_present_mask.
> 
> o	rcutorture proceeds with onlining CPU 5.
> 
> Don't we then get the same problem that prompted you to change from
> cpu_possible_mask to cpu_present_mask?  If not, why can't the rcutorture
> code continue to use cpu_possible_mask?
> 
> If it really is bad to try to online or offline a CPU that is in
> cpu_possible_mask but not in cpu_present_mask, and if CPUs can be removed
> from cpu_present_mask, then we need some way to synchronize the removal
> of CPUs from cpu_present_mask.  There are of course a lot of possible
> ways to do that synchronization, for example, protecting cpu_present_mask
> with a mutex or similar.
> 
> Alternatively, s390 could restrict things.  One way to do that would
> be to turn off rcutorture's use of CPU hotplug when running on s390,
> for example, by using the module parameters provided for that purpose.
> Another way to do that would be to refrain from removing CPUs from
> cpu_present_mask while rcutorture is running.
> 
> Are there other approaches?

For the near term, why not have rcutorture keep a snapshot of
cpu_present_mask, and splat if a CPU is ever removed from that mask?

That would catch any issues, and defer any synchronization decisions to
a time at which we actually have some chance of knowing what is going on.
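
The snapshot idea can be sketched in userspace C roughly as follows.
This is only an illustration, not the rcutorture code: the mask is
modeled as a 64-bit integer, and present_snapshot, snapshot_check() and
cpu_bit() are hypothetical names.  An in-kernel version would presumably
operate on cpu_present_mask with the cpumask helpers and use something
like WARN_ON_ONCE() for the splat.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for a cpumask: bit N set means CPU N is present. */
typedef uint64_t mask_t;

/* Hypothetical state kept by the checker. */
struct present_snapshot {
	mask_t last;		/* present mask as of the previous check */
	unsigned int splats;	/* number of checks that found a removal */
};

/* Helper to build a single-CPU mask (hypothetical, for illustration). */
mask_t cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (mask_t)1 << cpu;
}

/*
 * Compare the current present mask against the snapshot.  Returns the
 * set of CPUs that were present at the previous check but are gone now,
 * complaining if that set is non-empty.  Newly added CPUs are accepted
 * silently, and the snapshot is then updated to the current mask.
 */
mask_t snapshot_check(struct present_snapshot *s, mask_t present)
{
	mask_t removed = s->last & ~present;

	if (removed) {
		s->splats++;
		fprintf(stderr, "splat: CPUs removed from present mask: %#llx\n",
			(unsigned long long)removed);
	}
	s->last = present;
	return removed;
}
```

Calling snapshot_check() before each online or offline attempt would
report a removal at most once per CPU, because the snapshot is refreshed
on every call.  It detects that a removal happened since the previous
check; it does not close the race window itself.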

							Thanx, Paul


