On Thu, Sep 16, 2021 at 08:30:42AM -0700, Jakub Kicinski wrote:
> On Thu, 16 Sep 2021 10:07:07 -0500 Bjorn Helgaas wrote:
> > On Thu, Sep 16, 2021 at 06:17:39AM -0700, Jakub Kicinski wrote:
> > > My Lenovo T490s with i7-8665U had been marking TSC as unstable
> > > since v5.13, resulting in very sluggish desktop experience...
> >
> > Including the actual dmesg log line here might help others locate this
> > fix.
>
> Good point, will add in v2.
>
> clocksource: timekeeping watchdog on CPU3: hpet read-back delay of 316000ns, attempt 4, marking unstable
> tsc: Marking TSC unstable due to clocksource watchdog
> TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
> sched_clock: Marking unstable (14539801827657, -530891666)<-(14539319241737, -48307500)
> clocksource: Checking clocksource tsc synchronization from CPU 3 to CPUs 0-2,6-7.
> clocksource: Switched to clocksource hpet
>
> > > I have a 8086:3e34 bridge, also known as "Host bridge: Intel
> > > Corporation Coffee Lake HOST and DRAM Controller (rev 0c)".
> > > Add it to the list.
> > >
> > > We should perhaps consider applying this quirk more widely.
> > > The Intel documentation does not list my device [1], but
> > > linuxhw [2] does, and it seems to list a few more bridges
> > > we do not currently cover (3e31, 3ecc, 3e35, 3e0f).
> >
> > In the fine tradition of:
> >
> >   e0748539e3d5 ("x86/intel: Disable HPET on Intel Ice Lake platforms")
> >   f8edbde885bb ("x86/intel: Disable HPET on Intel Coffee Lake H platforms")
> >   fc5db58539b4 ("x86/quirks: Disable HPET on Intel Coffe Lake platforms")
> >   62187910b0fc ("x86/intel: Add quirk to disable HPET for the Baytrail platform")
> >
> > This seems to be an ongoing issue, not just a point defect in a single
> > product, and I really hate the onesy-twosy nature of this.
>
> Indeed. Or at least cover all Coffee Lakes in one fell swoop.
>
> > Is there really no way to detect this issue automatically or fix
> > whatever Linux bug makes us trip over this? I am no clock expert, so
> > I have absolutely no idea whether this is possible.
>
> I'm deferring to clock experts. Paul mentioned he has some prototype
> patches that may help.
>
> > > [1] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/8th-gen-core-family-datasheet-vol-2.pdf
> > > [2] https://github.com/linuxhw/DevicePopulation/blob/master/README.md
> > >
> > > Cc: stable@xxxxxxxxxxxxxxx # v5.13+
> >
> > How did you pick v5.13? force_disable_hpet() was added by
> > 62187910b0fc ("x86/intel: Add quirk to disable HPET for the Baytrail
> > platform"), which appeared in v3.15.
>
> Erm, good question, it started happening for me (and others with the
> same laptop) with v5.13. I just sort of assumed it was 2e27e793e280
> ("clocksource: Reduce clocksource-skew threshold").
>
> It usually takes a day to repro (4 hours was the quickest repro I've
> seen) so bisection was kind of out of question.

OK, so this is an intermittent condition where HPET is sometimes slow
to access for a short period of time?

If that is the case, my thought is to set the clocksource to be
reinitialized (without a splat and without marking the clocksource
unstable), and to splat (and mark the clocksource unstable) if it
does not get a good read after 100 subsequent attempts. So as long as
the period of slowness lasts for less than 50 seconds, things would
work fine.

Seem reasonable?

							Thanx, Paul
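
For illustration only, the retry policy proposed above can be modeled as a small state machine. This is a rough sketch, not the prototype patches mentioned in the thread and not kernel code. The names WatchdogState, check(), MAX_RETRIES and WATCHDOG_INTERVAL_S are hypothetical, and the 0.5 second check interval is an assumption inferred from the message equating 100 attempts with 50 seconds.

```python
from dataclasses import dataclass

# From the message: give up after 100 subsequent attempts without a good read.
MAX_RETRIES = 100
# Assumption: one watchdog check every 0.5 s, so 100 attempts span 50 s.
WATCHDOG_INTERVAL_S = 0.5


@dataclass
class WatchdogState:
    """Hypothetical model of the watchdog's view of one clocksource."""
    reinit_pending: bool = False  # a slow read was seen, quietly retrying
    retries_left: int = 0         # attempts remaining before giving up
    unstable: bool = False        # set only once the retries are exhausted


def check(state: WatchdogState, read_ok: bool) -> WatchdogState:
    """Apply the outcome of one watchdog read to the state."""
    if state.unstable:
        # Once marked unstable, the clocksource stays that way.
        return state
    if read_ok:
        # A good read ends the retry window without any splat.
        state.reinit_pending = False
        state.retries_left = 0
    elif not state.reinit_pending:
        # First slow read: schedule a reinitialization, no splat yet.
        state.reinit_pending = True
        state.retries_left = MAX_RETRIES
    else:
        # Another slow read while already retrying.
        state.retries_left -= 1
        if state.retries_left == 0:
            # No good read in 100 subsequent attempts: splat and mark unstable.
            state.unstable = True
    return state
```

Under these assumptions, a slow period that ends before the 100th retry leaves the clocksource in use, while an uninterrupted run of one slow read plus 100 failed retries marks it unstable.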