Feng, On Thu, Nov 26 2020 at 09:24, Feng Tang wrote: > On Wed, Nov 25, 2020 at 01:46:23PM +0100, Thomas Gleixner wrote: >> Now the more interesting question is why this needs to be a PCI quirk in >> the first place. Can't we just disable the HPET based on family/model >> quirks? >> >> e0748539e3d5 ("x86/intel: Disable HPET on Intel Ice Lake platforms") >> f8edbde885bb ("x86/intel: Disable HPET on Intel Coffee Lake H platforms") >> fc5db58539b4 ("x86/quirks: Disable HPET on Intel Coffe Lake platforms") >> 62187910b0fc ("x86/intel: Add quirk to disable HPET for the Baytrail platform") > I added this commit, and I can explain some for Baytrail. There was > some discussion about the way to disable it: > https://lore.kernel.org/lkml/20140328073718.GA12762@feng-snb/t/ > > It uses PCI ID early quirk in the hope that later Baytrail stepping > doesn't have the problem. And later on, there was official document > (section 18.10.1.3 http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/atom-z8000-datasheet-vol-1.pdf) > stating Baytrail's HPET halts in deep idle. So I think your way of > using CPUID to disable Baytrail HPET makes more sense. Correct. >> I might be missing something here, but in general on anything modern >> HPET is mostly useless. > > IIUC, nowdays HPET's main use is as a clocksource watchdog monitor. Plus for the TSC refined calibration, where it is really better than anything else we have _if_ it is functional. > And in one debug case, we found it still useful. The debug platform has > early serial console which prints many messages in early boot phase, > when the HPET is disabled, the software 'jiffies' clocksource will > be used as the monitor. Early printk will disable interrupt will > printing message, and this could be quite long for a slow 115200 > device, and cause the periodic HW timer interrupt get missed, and > make the 'jiffies' clocksource not accurate, which will in turn > judge the TSC clocksrouce inaccurate, and disablt it. (Adding Rui, > Len for more details) Yes, that can happen. But OTOH, we should start to think about the requirements for using the TSC watchdog. I'm inclined to lift that requirement when the CPU has: 1) X86_FEATURE_CONSTANT_TSC 2) X86_FEATURE_NONSTOP_TSC 3) X86_FEATURE_NONSTOP_TSC_S3 4) X86_FEATURE_TSC_ADJUST 5) At max. 4 sockets After two decades of horrors we're finally at a point where TSC seems to be halfways reliable and less abused by BIOS tinkerers. TSC_ADJUST was really key as we can now detect even small modifications reliably and the important point is that we can cure them as well (not pretty but better than all other options). The only nasty one in the list above is #5 because AFAIK there is still no architecural guarantee for TSCs being synchronized on machines with more than 4 sockets. I might be wrong, but then nobody told me. The only reason I hate to disable HPET upfront at least during boot is that HPET is the best mechanism for the refined TSC calibration. PMTIMER sucks because it's slow and wraps around pretty quick. So we could do the following even on platforms where HPET stops in some magic PC? state: - Register it during early boot as clocksource - Prevent the enablement as clockevent and the chardev hpet timer muck - Prevent the magic PC? state up to the point where the refined TSC calibration is finished. - Unregister it once the TSC has taken over as system clocksource and enable the magic PC? state in which HPET gets disfunctional. Hmm? Thanks, tglx