On Wed, Oct 4, 2017 at 4:10 PM, Gabriel Beddingfield <gabe@xxxxxxxxxxxx> wrote: > Hi Thomas, > > Thanks for your reply! > > On Wed, Oct 4, 2017 at 11:22 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote: >> Calling things a hack which might have been thought out carefully does not >> make people more receptive. > > My bad... sorry! You're right. The code in question is better than a "hack." > >>> Many ARM systems provide a "persistent clock." Most of them are backed >>> by a 32kHz clock that gives good precision and makes the delta_delta >>> hack unnecessary. However, devices that only have single-second >>> precision for the persistent clock and/or are forced to use the RTC >>> (whose API only allows for single-second precision) -- they still need >>> this hack. >> >> I have no idea what you are trying to tell me. We know that there are >> systems which have a clocksource which continues to tick across >> suspend/resume. > > I'm referring to read_persistent_clock64() API... which is distinct from the > CLOCK_SOURCE_SUSPEND_NONSTOP flag. > >> Such clocksources can be flagged with CLOCK_SOURCE_SUSPEND_NONSTOP and the >> timekeeping resume code uses them when available instead of using the >> RTC. There are a few of them flagged as such in the kernel, but it might > ... >> It's neither a problem of the timekeeping core code to select the >> appropriate clocksource for a system. That's up to the developer who >> implemented the SoC/CPU support and made a decision about rating the >> clocksource(s), which might end up selecting one which stops in suspend. > > This was our first approach. However, because of some hardware > limitations we couldn't > move the system's monotonic clock to the persistent clock. They had to > be two different > clocks. (Details below.) > >> Without looking at what it does, I can tell you that making it a config >> option is not going to fly. It has to be runtime discoverable as we have to >> support multi platform kernels. > > OK. Here's a couple ideas... > > APPROACH ONE: Use a heuristic > > If read_persistent_clock64() ever returns fractional seconds (as > apposed to whole seconds), then > permanently disable the compensation. This seems reasonable to me as well. >> Though I still want to know exactly why you think that you need some extra >> magic in the timekeeping core code. If your system has a clocksource which >> is not stopping on suspend and it lacks the flag, then this is a one liner >> patch. If there is something else, then please elaborate. > > Long story short: you can't always have your low-power clock be your > monotonic/sched > clock. > > The SoC we use backs the monotonic clock (sched_clock_register()) with > a counter that is > high frequency (>10 MHz) in their reference implementation. But it > does not count when the > system is in low-power mode. However, it can be configured to use a > 32kHz clock that *does* > count when the system is in low-power mode. So, we started by using > this clock and setting the > CLOCK_SOURCE_SUSPEND_NONSTOP flag. It worked great... at first. > > Then we found that devices would randomly experience a 36-hour time jump. > While we don't have a definitive root cause, the current theory is > that we are getting > non-atomic reads because the clock source isn't synchronized with the > the high frequency > clock (which is used for most of the digital logic on the SoC). > > Therefore, we moved the monotonic/sched clock back to the high-frequency source. > > Meanwhile, we can directly read the RTC clock on this system, and it > will give us 32kHz > resolution and also runs non-stop. Since reads are non-atomic, we have > to read the > registers in a loop. We used this register to implement > read_persistent_clock64(). > Because we have to read the registers in a loop, it seemed unfit for use as the > monotonic/sched clock. Yea. I thought arm devices often had read_persistent_clock64() backed by the 32k timer (which is poor for time initialization but works well for suspend timing). Maybe I misunderstood on the first read. Is it then that the relatively fine-grained read_persistent_clock64() is colliding with the delta_delta logic that assumes we get coarse 1sec resolution? In that case the huristic above seems sane. thanks -john