Re: [PATCHv2 3/3] rcu: coordinate tick dependency during concurrent offlining

On Sun, Oct 02, 2022 at 12:11:07PM -0400, Joel Fernandes wrote:
> 
> 
> On 10/2/2022 10:06 AM, Pingfan Liu wrote:
> > On Fri, Sep 30, 2022 at 9:04 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >> On Thu, Sep 29, 2022 at 4:21 AM Pingfan Liu <kernelfans@xxxxxxxxx> wrote:
> >>>
> >>> On Thu, Sep 29, 2022 at 4:19 PM Pingfan Liu <kernelfans@xxxxxxxxx> wrote:
> >>>>
> >>> [...]
> >>>> "
> >>>>
> >>>> I have no idea whether this is related to the reverted commit.
> >>>>
> >>>
> >>> I have started another test against clean v6.0-rc7 to see whether this
> >>> is an issue with the mainline.
> >>
> >> I am not sure what exactly you are reverting (you could clarify that),
> > 
> > commit 96926686deab ("rcu: Make CPU-hotplug removal operations enable tick").
> > But because of conflicts, "git revert" does not apply directly, so I
> > applied the revert by hand.
> > 
> >> but if you are just removing the entire TICK_DEP_BIT_RCU, I do
> >> remember (and mentioned on IRC to others recently) that without this
> >> NOHZ_FULL has a hard time ending grace-periods because the forcing of
> >> tick is needed for this configuration if we are spinning in the kernel
> >> with the tick turned off. That seems to align with your TREE04
> >> (NOHZ_FULL) configuration.
> >>
> > 
> > Yes, that is the scenario.
> > 
> >> Also, the commit Frederic suggested to revert seems to be a cosmetic
> >> optimization in the interrupt-entry path. That should not change
> >> functionality I believe. So I did not fully follow why reverting that
> >> is relevant (maybe Frederic can clarify?).
> >>
> > 
> > Leave this question to Frederic.
> 
> I take this comment back, sorry. Indeed, the commits Frederic mentioned do
> make a functional change to the CPU hotplug path.
> 
> Sorry for the noise.
> 
> Excited to see the exact reason why TICK_DEP_BIT_RCU matters in the hotplug
> paths. I might jump into the investigation with you guys, but I have to make
> time for Lazy-RCU v7 next :)

One historical reason was that a nohz_full CPU could enter the kernel with
the tick still disabled, and stay that way indefinitely.  Among other
things, this can interfere with the grace-period wait in the offlining
code path.

							Thanx, Paul
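
For readers following the thread, the interaction described above can be
pictured with a small userspace model. This is only a sketch under stated
assumptions: it is not kernel code and it is not taken from the patch under
discussion. Every toy_* name is hypothetical. The model assumes just what the
thread states, namely that the tick should stay on while an offline operation
is waiting on a grace period, and it uses a counter so that overlapping
offline operations share one dependency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model, not kernel code.
 * Counts offline operations that are currently in flight. While the
 * count is nonzero, the modelled nohz_full CPU must keep its tick.
 */
static atomic_int toy_offline_in_flight;

/* True when no offline operation holds the (modelled) tick dependency. */
static bool toy_can_stop_tick(void)
{
	return atomic_load(&toy_offline_in_flight) == 0;
}

/* Call before the offline path starts waiting for a grace period. */
static void toy_offline_begin(void)
{
	atomic_fetch_add(&toy_offline_in_flight, 1);
}

/* Call once the offline path has finished its grace-period wait. */
static void toy_offline_end(void)
{
	int prev = atomic_fetch_sub(&toy_offline_in_flight, 1);

	/* An end without a matching begin would be a bug in the caller. */
	assert(prev > 0);
}
```

With two overlapping offline operations, toy_can_stop_tick() stays false
after the first toy_offline_end() and becomes true only after the second.
A single set/clear flag would instead drop the dependency as soon as the
first operation finished, which is the kind of hazard the subject line of
this series refers to.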


