Re: The rdp->gpwrap behavior

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 14, 2025 at 01:03:40PM -0500, Joel Fernandes wrote:
> Hi Paul,
> 
> On 2/14/2025 3:20 AM, Paul E. McKenney wrote:
> > On Thu, Feb 13, 2025 at 01:30:33PM -0500, Joel Fernandes wrote:
> >> +rcu for archives
> >>
> >> Hi,
> >>
> >> Following up about our gpwrap discussions, I did some tracing of rdp->gpwrap
> >> with just boot testing.
> >>
> >> I actually never see it set because rdp->gp_seq and rnp->gp_seq are usually
> >> very close.
> >>
> >> Example trace:
> >>
> >> # rnp->gp_seq starts with -1200 on boot and then right around the wrap:
> >>
> >> 178.096329: rcu_gpnum_ovf: CPU: 2, rnp->gp_seq: -3 (0xfffffffffffffffd), rdp->gp_seq: -3 (0xfffffffffffffffd), wrap before: 0
> >> 178.096330: rcu_gpnum_ovf: CPU: 2, rnp->gp_seq: -3 (0xfffffffffffffffd), rdp->gp_seq: -3 (0xfffffffffffffffd), wrap after: 0
> >> 178.100327: rcu_gpnum_ovf: CPU: 2, rnp->gp_seq: 1 (0x1), rdp->gp_seq: 1 (0x1), wrap before: 0
> >> 178.100328: rcu_gpnum_ovf: CPU: 2, rnp->gp_seq: 1 (0x1), rdp->gp_seq: 1 (0x1), wrap after: 0
> >>
> >> The wrap "before" and "after" are the value of gpwrap before and after
> >> calling rcu_gpnum_ovf().
> >>
> >> Closer look at the _ovf() shows it should be only set if rdp->gp_seq and
> >> rnp->gp_seq are quite a distance apart (say if the CPU was idle for too long
> >> and many GPs have passed. This can happen both with/without wrap. So imminent
> >> wrapping seems not necessary for ->gpwrap to be even set AFAICS.
> >>
> >> I think the problem is ULONG_CMP_LT is not really "<" so even after wrap, it
> >> can false. i.e if rdp->gpseq + ULONG/4 wraps, it could still return false.
> > 
> > Exactly, and all of that is in fact the intended behavior.
> 
> Ah I think I get it now, so the wrap isn't really about the counter wrapping
> causing an issue, but about the design of the ULONG_CMP APIs which are designed
> for comparisons that wrap around. Due to this design, there is a side effect
> that the numbers being compared have to be at a distance that is less than half
> the number wheel. Otherwise the comparisons of numbers yields incorrect results.
> Is that accurate to say?  IOW, ->gpwrap == true means, don't compare the numbers
> anymore till we sort these huge deltas out.

But "incorrect" is such a harsh and judgmental word.  ;-)

Somewhat more seriously, ULONG_CMP_{GE,LT}() are defined to return the
sign (or its complement) of differences between a pair of 32-bit or 64-bit
numbers, as sizeof(unsigned long) dictates.  This is, mathematically,
how you do signed differences.  The result coming out negative isn't
incorrect, but instead the expected result given the numbers in question.

Current RCU use cases might interpret a negative result as "no one did
it for us yet", as in rcu_barrier(), "this CPU has not completed the
current grace period", as in __note_gp_changes(), and so on.

Now, these can get a false-positive/-negative result.  In the case of
rcu_barrier(), we get a needless rcu_barrier() computation.  In the case
of __note_gp_changes(), the false negative is OK, but a false positive
from wrapping all the way around, could be fatal for a CPU that was idle
for a long time (hence ->gpwrap).  And so on.

> >> >From a testing POV, the example shows it is not set even when around when a
> >> wrap actually happened.  So may be, we are not testing gpwrap much, or at
> >> all, with rcutorture?
> > 
> > True enough, and that would be well worth fixing.
> > 
> > One simple (and maybe too simple) fix is to add a Kconfig option to
> > kernel/rcu/Kconfig.debug that makes rcu_gpnum_ovf() use something like
> > (32 << RCU_SEQ_CTR_SHIFT).  This, or some similar value, should cause
> > rcutorture's stuttering and/or shuffling to trigger setting of ->gpwrap
> > within a few minutes.
> > 
> > And I would not be surprised if doing this located bugs.
> 
> Ok cool! So I will work on this next then. I am curious to see how often it
> triggers both with/without such a change.

If you can make it trigger without this change, you are far more patient
than I am.  ;-)

> >> But before that, I am feeling it is not behaving correctly. I'd expect it to
> >> be set even if rnp and rdp values are close but we are actually about to
> >> wrap.  So most likely I am missing something.
> > 
> > If I understand correctly, you would like ->gpwrap to be set
> > unconditionally at various points in the cycle, for example, at zero,
> > ULONG_MAX/4, ULONG_MAX/2, and 3*ULONG_MAX/4.  But this could easily
> > have the effect of discarding each CPU's quiescent states that showed
> > up before the first call to rcu_watching_snap_recheck(), even if that
> > CPU was active.  Our users might not consider this to be a friendly act.
> 
> Got it.
> 
> > In contrast, the current setup only discards quiescent states for CPUs
> > that just came back from idle prior to call to rcu_watching_snap_save().
> 
> Do you mean the check of ->gpwrap in rcu_report_qs_rdp() which returns early?

Well, it also resets the CPU to collect quiescent states for the current
grace period, but yes, it does return early.

> > In addition, use of the aforementioned Kconfig option has the benefit of
> > making any ->gpwrap-related bugs appear during testing, not in production.
> > On the same grounds, one could argue that absence of this Kconfig option
> > should also cause gp_seq to zero rather than the current;
> > 
> > 	.gp_seq = (0UL - 300UL) << RCU_SEQ_CTR_SHIFT,
> 
> Hm, I think it still make sense to the do -300 even without debug options, just
> so we have a true wrap early on, regardless of which APIs we use for comparison.

It has certainly been as it is for a long time, so agreed.

							Thanx, Paul




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux