Hi Thomas, On Thu, Oct 5, 2017 at 11:01 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote: >> > Which SoC/clocksource driver are you talking about? >> >> NXP i.MX 6SoloX >> drivers/clocksource/timer-imx-gpt.c > > So that clocksource driver looks correct. Do you have an idea in which > context this time jump happens? Does it happen when you exercise your high > frequency suspend/resume dance or is that happening just when you let the > machine run forever as well? We couldn't devise any reproduction steps. We observed it happening at unexpected times in a fleet of devices -- and we couldn't find any patterns to clue us in. > > The timekeeping_resume() path definitely has an issue: > > cycle_now = tk_clock_read(&tk->tkr_mono); > if ((clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP) && > cycle_now > tk->tkr_mono.cycle_last) { > > This works nice for clocksources which wont wrap across suspend/resume but > not for those which can. That cycle_now -> cycle_last check should take > cs-mask into account ... > > Of course for clocksources which can wrap within realistic suspend times, > which 36 hours might be accounted for, this would need an extra sanity > check against a RTC whether wrap time has been exceeded. > > I haven't thought it through whether that buggered check fully explains > what you are observing, but it's wrong nevertheless. John? Nah. It looks like the consequence is that you'll either fail to inject the sleep time or you'll fall back to having the RTC inject the sleep time. In our case, we never sleep for more than a couple of minutes so the error would be seconds rather than hours. -gabe