On 8/14/2013 2:17 PM, John Stultz wrote:
> So a long while back we had tried to adapt for clock frequency changes
> on things like the TSC, but it resulted in *terrible* timekeeping as
> the latency between the frequency change and the handling of the
> notifications caused lots of clock drift, making it impossible for NTP
> or other synchronization methods to work properly.

We've done quite a bit of testing to show that our current
implementation doesn't have any clock drift over time.  Basically, we
take a machine running some workload, sync its time via ntpdate, and
then run a script that continually steps the CPU speed up or down, with
a delay of a couple of seconds in between so that we spend a reasonable
amount of time at each speed.  Every five minutes or so, the script runs
"ntpdate -q" to measure the offset from real time.  The skew we see
doing that for a couple of days is identical to what we see when we
_aren't_ changing the CPU frequency.

A key part of making this work, as noted in the comments at the head of
timekeeping_chfreq_prep(), is that we perform the frequency change under
stop_machine(), so that no CPU gets an opportunity to sample the clock
while it's being changed.  (There's a simplified sketch of this appended
below my signature.)

However, I'm wondering whether you're talking about some other sort of
much more local clock skew or frequency effect that perhaps we haven't
tested for.  (For instance, we haven't actually run this code on an NTP
server.)  Can you give a bit more detail on exactly what sorts of bad
behavior you saw with the previous implementation, and what one might do
to detect them?

> So early on we made a requirement that all clocksources have a
> constant frequency and provided a way to disqualify any clocksources
> that change frequency.
>
> So I'd be very hesitant to try to add any such behavior into the
> timekeeping core.  You may want to try to add some logic in the
> clocksource driver itself to allow for the variable freq clocksource
> to output what seems to be a fixed freq,

So, just to be clear, you're suggesting that we claim our clocksource
runs at some lower virtual speed (say, 1 MHz), and that internally to
our clocksource driver we divide down the real frequency to the virtual
one?  (I've also appended a sketch of what I think you mean, below.)

> and if we get some time on it to prove that it can be made to work
> well, then we can see about making it more generic.
>
> Does that sound ok?

That seems possible, although it would make the whole process a bit less
efficient (e.g., our clocksource would have to maintain its own
multiplier and offset to convert from real ticks to virtual ticks, and
then the core code would do the same operation again to convert to
wall-clock time).

Obviously, we're not really anxious to re-test and re-qualify a new
implementation of this, but if our current version is, or might be,
incompatible with other code in the kernel, perhaps that's the safer
approach.  What sort of eventual more-generic support were you thinking
of?

--
Chris Metcalf, Tilera Corp.
http://www.tilera.com
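
To make the stop_machine() point a little more concrete, here is a
much-simplified sketch of the shape of our change path.  This is not the
actual patch: the chfreq_args struct, do_change_freq(), and
tile_set_cpu_khz() names are just illustrative, and the argument lists
on the timekeeping_chfreq_*() calls are abbreviated.  The point is only
that everything between "prep" and the mult/shift recomputation runs
with every CPU held in stop_machine():

#include <linux/stop_machine.h>
#include <linux/cpumask.h>

/* Illustrative names only; not the real driver code. */
struct chfreq_args {
	unsigned long new_khz;
};

static int do_change_freq(void *data)
{
	struct chfreq_args *args = data;

	/* Fold time accumulated at the old rate into the timekeeper. */
	timekeeping_chfreq_prep();

	/* Reprogram the hardware clock to the new rate. */
	tile_set_cpu_khz(args->new_khz);

	/* Recompute the timekeeping mult/shift for the new rate. */
	timekeeping_chfreq(args->new_khz * 1000);

	return 0;
}

static void change_cpu_freq(unsigned long new_khz)
{
	struct chfreq_args args = { .new_khz = new_khz };

	/*
	 * Every online CPU is held in stop_machine() while
	 * do_change_freq() runs, so none of them can call the
	 * clocksource read routine mid-change.
	 */
	stop_machine(do_change_freq, &args, cpu_online_mask);
}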
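
And, so that we're talking about the same thing, here is a rough sketch
of what I understand the fixed-virtual-frequency clocksource to look
like.  The names are made up, get_cycles() just stands in for however we
read the raw counter, and the exclusion of readers around the rebase
(stop_machine(), a seqlock, whatever) is hand-waved:

#include <linux/clocksource.h>
#include <linux/timex.h>

#define VIRT_HZ	1000000UL	/* advertised fixed rate: 1 MHz */

static u64 raw_base;		/* raw cycles at the last rebase */
static u64 virt_base;		/* virtual (1 MHz) cycles at the last rebase */
static u32 cur_mult, cur_shift;	/* raw -> virtual scaling for the current rate */

static cycle_t virt_cs_read(struct clocksource *cs)
{
	u64 delta = get_cycles() - raw_base;

	/* Scale raw ticks at the current CPU rate down to 1 MHz. */
	return virt_base + ((delta * cur_mult) >> cur_shift);
}

static struct clocksource virt_clocksource = {
	.name	= "tile-virt",
	.rating	= 300,
	.read	= virt_cs_read,
	.mask	= CLOCKSOURCE_MASK(64),
	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
};

/*
 * Called once at boot before registration, and again on every CPU
 * frequency change, with readers excluded.
 */
static void virt_cs_change_freq(u32 new_hz)
{
	u64 now = get_cycles();

	/* Rebase so the virtual counter stays continuous and monotonic. */
	virt_base += ((now - raw_base) * cur_mult) >> cur_shift;
	raw_base = now;

	/*
	 * Private mult/shift for new_hz -> 1 MHz; the 600s figure bounds
	 * the time between rebases so the multiply above can't overflow.
	 */
	clocks_calc_mult_shift(&cur_mult, &cur_shift, new_hz, VIRT_HZ, 600);
}

The driver would then register with
clocksource_register_hz(&virt_clocksource, VIRT_HZ), so the core only
ever sees a fixed 1 MHz counter.  The double scaling I mentioned above
is the cur_mult/cur_shift step here, followed by the timekeeper's own
mult/shift on the result.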