On Sat, 2024-07-13 at 22:28 +0530, neeraj.upadhyay@xxxxxxxxxx wrote:
> From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> 
> The CSD-lock-hold diagnostics from CONFIG_CSD_LOCK_WAIT_DEBUG are
> printed in nanoseconds as unsigned long longs, which is a bit obtuse
> for human readers when timing bugs result in negative CSD-lock hold
> times. Yes, there are some people to whom it is immediately obvious
> that 18446744073709551615 is really -1, but for the rest of us...
> 

To clarify the report a little bit: it appears that, on some CPU
models, sched_clock() values occasionally jump backward on the same
CPU.

Looking at the number of systems where this happened over time,
leaving out the exact numbers, the distribution looks something
like this:

- 1 day: N systems
- 3 days: 3N systems
- 1 week: 7N systems
- 1 month: 26N systems
- 90 days: 72N systems

This does not appear to be a case of a few systems with bad
hardware, where it happens constantly to the same systems, but
something that many systems experience occasionally, and then
not again for months.

The systems in question advertise CONSTANT_TSC, NONSTOP_TSC,
and generally seem to have stable, nonstop, monotonic TSC
values, but sometimes the values go back in time a little bit.
The cycles_2_ns data does not appear to change during the
episodes of sched_clock() going backward.

The csd_lock code is not the only thing that breaks when the
sched_clock() values go backward, but it seems to be the best
thing we have right now to detect it.

I don't know whether adding more detection of this issue would
increase the number of systems where backwards sched_clock()
is observed.

Many of the systems with backwards-going TSC values seem to
encounter a bunch of them across some time period, end up
getting rebooted, and then behave well for months after.

> Reported-by: Rik van Riel <riel@xxxxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> Cc: Imran Khan <imran.f.khan@xxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Leonardo Bras <leobras@xxxxxxxxxx>
> Cc: "Peter Zijlstra (Intel)" <peterz@xxxxxxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxxx>
> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@xxxxxxxxxx>
> 

Reviewed-by: Rik van Riel <riel@xxxxxxxxxxx>

-- 
All Rights Reversed.
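
For illustration, the quoted point about 18446744073709551615 really
being -1 can be reproduced in userspace. This is only a minimal sketch
and not the kernel patch itself: the helper names hold_ns_unsigned(),
hold_ns_signed() and format_hold() are hypothetical, and the timestamps
are made-up values standing in for two sched_clock() readings where the
second one is a nanosecond earlier than the first.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: hold time as an unsigned 64-bit difference.
 * If the clock went backward (end_ns < start_ns), this wraps around
 * to a value close to 2^64.
 */
static uint64_t hold_ns_unsigned(uint64_t start_ns, uint64_t end_ns)
{
	return end_ns - start_ns;
}

/*
 * Hypothetical helper: the same difference viewed as a signed value,
 * so a backward step shows up as a small negative number.  The
 * conversion assumes the usual two's-complement representation.
 */
static int64_t hold_ns_signed(uint64_t start_ns, uint64_t end_ns)
{
	return (int64_t)(end_ns - start_ns);
}

/* Format both views of the same hold time into buf for comparison. */
static int format_hold(char *buf, size_t len,
		       uint64_t start_ns, uint64_t end_ns)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"unsigned: %" PRIu64 " ns, signed: %" PRId64 " ns",
			hold_ns_unsigned(start_ns, end_ns),
			hold_ns_signed(start_ns, end_ns));
}
```

With start_ns = 100 and end_ns = 99, the unsigned view is
18446744073709551615 ns while the signed view is -1 ns, which makes
the backward step visible at a glance.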