Re: [PATCHv2 3/3] rcu: coordinate tick dependency during concurrent offlining

> On Oct 2, 2022, at 12:57 PM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> 
> On Sun, Oct 02, 2022 at 12:30:52PM -0400, Joel Fernandes wrote:
>> 
>> 
>>>> On Oct 2, 2022, at 12:24 PM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>>> 
>>> On Sun, Oct 02, 2022 at 12:11:07PM -0400, Joel Fernandes wrote:
>>>> 
>>>> 
>>>>> On 10/2/2022 10:06 AM, Pingfan Liu wrote:
>>>>> On Fri, Sep 30, 2022 at 9:04 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>>>>>> 
>>>>>> On Thu, Sep 29, 2022 at 4:21 AM Pingfan Liu <kernelfans@xxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> On Thu, Sep 29, 2022 at 4:19 PM Pingfan Liu <kernelfans@xxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>> [...]
>>>>>>>> "
>>>>>>>> 
>>>>>>>> I have no idea whether this is related to the reverted commit.
>>>>>>>> 
>>>>>>> 
>>>>>>> I have started another test against clean v6.0-rc7 to see whether this
>>>>>>> is an issue with the mainline.
>>>>>> 
>>>>>> I am not sure what exactly you are reverting (you could clarify that),
>>>>> 
>>>>> commit 96926686deab ("rcu: Make CPU-hotplug removal operations enable tick").
>>>>> But due to a conflict, "git revert" does not work directly, so I applied
>>>>> the revert by hand.
>>>>> 
>>>>>> but if you are just removing the entire TICK_DEP_BIT_RCU, I do
>>>>>> remember (and mentioned on IRC to others recently) that without this
>>>>>> NOHZ_FULL has a hard time ending grace-periods because the forcing of
>>>>>> tick is needed for this configuration if we are spinning in the kernel
>>>>>> with the tick turned off. That seems to align with your TREE04
>>>>>> (NOHZ_FULL) configuration.
>>>>>> 
>>>>> 
>>>>> Yes, that is the scenario.
>>>>> 
>>>>>> Also, the commit Frederic suggested to revert seems to be a cosmetic
>>>>>> optimization in the interrupt-entry path. That should not change
>>>>>> functionality I believe. So I did not fully follow why reverting that
>>>>>> is relevant (maybe Frederic can clarify?).
>>>>>> 
>>>>> 
>>>>> I will leave this question to Frederic.
>>>> 
>>>> I take this comment back, sorry. Indeed, the commits Frederic mentioned do make
>>>> a functional change to the CPU hotplug path.
>>>> 
>>>> Sorry for the noise.
>>>> 
>>>> Excited to see the exact reason why TICK_DEP_BIT_RCU matters in the hotplug
>>>> paths. I might jump into the investigation with you guys, but I have to make
>>>> time for Lazy-RCU v7 next :)
>>> 
>>> One historical reason was that a nohz_full CPU could enter the kernel with
>>> the tick still disabled, and stay that way indefinitely.  Among other
>>> things, this can interfere with the grace-period wait in the offlining
>>> code path.
>> 
>> Thanks for the clarification/confirmation, that was my hunch as well. Does
>> the emergency forcing-on of the tick not suffice for that? If I understand
>> correctly, you are sending an IPI to set the same tick dependency if grace
>> periods are being held up for too long. Another potential fix, instead of
>> setting the tick dependency in hotplug, could then be to reduce the delay
>> before emergency IPIs are sent to free up nohz_full CPUs, I think.
> 
> Why not try modifying it to work this way and seeing what (if anything)
> breaks?  If nothing breaks, then there is still the option of digging
> into the code.

Yes, happy to try :)

> 
> In both cases, after Lazy-RCU v7, of course.  ;-)

Sounds good and thanks for understanding ;)

Thanks,

 - Joel

> 
>                                Thanx, Paul



