On Fri, Sep 17, 2021 at 11:11:49AM +0200, Peter Zijlstra wrote:
> On Thu, Sep 16, 2021 at 10:07:07AM -0500, Bjorn Helgaas wrote:
> > This seems to be an ongoing issue, not just a point defect in a single
> > product, and I really hate the onesy-twosy nature of this.  Is there
> > really no way to detect this issue automatically or fix whatever Linux
> > bug makes us trip over this?  I am no clock expert, so I have
> > absolutely no idea whether this is possible.
>
> X86 is gifted with the grand total of _0_ reliable clocks. Given no
> accurate time, it is impossible to tell which one of them is broken
> worst. Although I suppose we could attempt to synchronize against the
> PMU or MPERF..
>
> We could possibly disable the tsc watchdog for
> X86_FEATURE_TSC_KNOWN_FREQ && X86_FEATURE_TSC_ADJUST I suppose.
>
> And then have people with 'creative' BIOS get to keep the pieces.

Alternatively, we can change what the TSC watchdog does for
X86_FEATURE_TSC_ADJUST machines. Instead of checking time against HPET
it can check if TSC_ADJUST changes. That should make it more resilient
vs HPET time itself being off.
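
To make the idea concrete, here is a rough userspace sketch of what such a
check could look like. It is not kernel code and not a proposed patch: the
names (sim_tsc_adjust, tsc_adjust_wd_arm, tsc_adjust_wd_check) and NR_CPUS
value are made up for illustration, and the per-CPU array merely stands in
for reading the TSC_ADJUST MSR on each CPU. The point is only that the check
compares TSC_ADJUST against a recorded baseline and never consults a second
clock such as HPET.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count, for this sketch only */

/* Stand-in for the per-CPU TSC_ADJUST MSR; real code would read the MSR. */
static int64_t sim_tsc_adjust[NR_CPUS];

/* Per-CPU TSC_ADJUST value recorded when the watchdog was armed. */
static int64_t baseline_adjust[NR_CPUS];

/* Hypothetical accessor: return the (simulated) TSC_ADJUST of @cpu. */
static int64_t read_tsc_adjust(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return sim_tsc_adjust[cpu];
}

/* Record the current TSC_ADJUST of every CPU as the trusted baseline. */
static void tsc_adjust_wd_arm(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		baseline_adjust[cpu] = read_tsc_adjust(cpu);
}

/*
 * Periodic check: no reference clock is involved, so a misbehaving HPET
 * cannot cause a false positive. Returns -1 if nothing changed, or the
 * first CPU whose TSC_ADJUST differs from the baseline.
 */
static int tsc_adjust_wd_check(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (read_tsc_adjust(cpu) != baseline_adjust[cpu])
			return cpu;
	}
	return -1;
}
```

Usage would be to call tsc_adjust_wd_arm() once after the TSC has been
validated at boot, then call tsc_adjust_wd_check() from the periodic
watchdog and only mark the TSC unstable when it returns a CPU number.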