Hi Thomas,

On Mon, Nov 30, 2020 at 08:21:03PM +0100, Thomas Gleixner wrote:
> Feng,
>
> On Fri, Nov 27 2020 at 14:11, Feng Tang wrote:
> > On Fri, Nov 27, 2020 at 12:27:34AM +0100, Thomas Gleixner wrote:
> >> On Thu, Nov 26 2020 at 09:24, Feng Tang wrote:
> >> Yes, that can happen. But OTOH, we should start to think about the
> >> requirements for using the TSC watchdog.
> >>
> >> I'm inclined to lift that requirement when the CPU has:
> >>
> >>    1) X86_FEATURE_CONSTANT_TSC
> >>    2) X86_FEATURE_NONSTOP_TSC
> >
> >>    3) X86_FEATURE_NONSTOP_TSC_S3
> > IIUC, this feature exists for several generations of Atom platforms,
> > and it is always coupled with 1) and 2), so it could be skipped for
> > the checking.
>
> Yes, we can ignore that bit as it's not widely available and not
> required to solve the problem.
>
> >>    4) X86_FEATURE_TSC_ADJUST
> >>
> >>    5) At max. 4 sockets
> >>
> >> The only reason I hate to disable HPET upfront at least during boot is
> >> that HPET is the best mechanism for the refined TSC calibration. PMTIMER
> >> sucks because it's slow and wraps around pretty quick.
> >>
> >> So we could do the following even on platforms where HPET stops in some
> >> magic PC? state:
> >>
> >>   - Register it during early boot as clocksource
> >>
> >>   - Prevent the enablement as clockevent and the chardev hpet timer muck
> >>
> >>   - Prevent the magic PC? state up to the point where the refined
> >>     TSC calibration is finished.
> >>
> >>   - Unregister it once the TSC has taken over as system clocksource and
> >>     enable the magic PC? state in which HPET gets disfunctional.
> >
> > This looks reasonable to me.
> >
> > I have thought about lowering the hpet rating to lower than PMTIMER, so it
> > still contributes in early boot phase, and fades out after PMTIMER is
> > initialised.
>
> Not a good idea. pm_timer is initialized before the refined calibration
> finishes.

Yes, you are right. I missed that part.

I dug through some old notes and found another old case (kernel 3.4) in
which a broken PMTIMER, used as the watchdog clocksource, wrongly judged
the TSC to be unstable and disabled it. That makes me agree even more with
the idea of "lift that requirement when the CPU has ...".

If the TSC has those bits to guarantee its reliability, then there is no
need to use a less reliable clocksource to monitor it.

Thanks,
Feng
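
For illustration only, the conditions discussed in this thread (items 1, 2,
4 and 5, with item 3 dropped as agreed) could be expressed as a single
predicate. This is a userspace sketch, not kernel code and not a proposed
patch: struct cpu_caps, its fields, MAX_TRUSTED_SOCKETS and
tsc_watchdog_can_be_skipped() are all hypothetical names, and the socket
limit of 4 is simply the value mentioned above.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the CPU properties named in the thread.
 * In the kernel these would come from the CPU feature bits, not from
 * a struct like this one.
 */
struct cpu_caps {
	bool constant_tsc;	/* X86_FEATURE_CONSTANT_TSC */
	bool nonstop_tsc;	/* X86_FEATURE_NONSTOP_TSC */
	bool tsc_adjust;	/* X86_FEATURE_TSC_ADJUST */
	unsigned int sockets;	/* number of CPU sockets in the system */
};

/* Assumed limit, taken from "At max. 4 sockets" in the thread. */
#define MAX_TRUSTED_SOCKETS 4

/*
 * Return true only if every condition holds, i.e. the TSC would be
 * considered reliable enough that no watchdog clocksource is needed.
 */
bool tsc_watchdog_can_be_skipped(const struct cpu_caps *caps)
{
	return caps->constant_tsc &&
	       caps->nonstop_tsc &&
	       caps->tsc_adjust &&
	       caps->sockets <= MAX_TRUSTED_SOCKETS;
}
```

With a check of this shape, a system that lacks any one of the feature bits,
or that has more than 4 sockets, would keep the watchdog as it does today.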