On Fri, May 29, 2020 at 2:21 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Miklos,
>
> Miklos Szeredi <miklos@xxxxxxxxxx> writes:
> > On Fri, May 29, 2020 at 11:51 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >> On Thu, May 28, 2020 at 10:43 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >> >
> >> > Miklos Szeredi <miklos@xxxxxxxxxx> writes:
> >> > > Bisected it to:
> >> > >
> >> > > b95a8a27c300 ("x86/vdso: Use generic VDSO clock mode storage")
> >> > >
> >> > > The effect observed is that after the host is resumed, the clock in
> >> > > the guest is somewhat in the future and is stopped. I.e. repeated
> >> > > date(1) invocations show the same time.
> >> >
> >> > TBH, the bisect does not make any sense at all. It's renaming the
> >> > constants and moving the storage space and I just read it line for line
> >> > again that the result is equivalent. I'll have a look once the merge
> >> > window dust settles a bit.
> >>
> >> Yet, reverting just that single commit against latest linus tree fixes
> >> the issue. Which I think is a pretty good indication that that commit
> >> *is* doing something.
>
> A revert on top of Linus latest surely does something, it disables VDSO
> because clocksource.vdso_clock_mode becomes NONE.
>
> That's a data point maybe, but it clearly does not restore the situation
> _before_ that commit.
>
> >> The jump forward is around 35 minutes; that seems to be consistent as
> >> well.
> >
> > Oh, and here's a dmesg extract for the good case:
> >
> > [ 26.402239] clocksource: timekeeping watchdog on CPU0: Marking
> > clocksource 'tsc' as unstable because the skew is too large:
> > [ 26.407029] clocksource: 'kvm-clock' wd_now:
> > 635480f3c wd_last: 3ce94a718 mask: ffffffffffffffff
> > [ 26.407632] clocksource: 'tsc' cs_now:
> > 92d2e5d08 cs_last: 81305ceee mask: ffffffffffffffff
> > [ 26.409097] tsc: Marking TSC unstable due to clocksource watchdog
> >
> > and the bad one:
> >
> > [ 36.667576] clocksource: timekeeping watchdog on CPU1: Marking
> > clocksource 'tsc' as unstable because the skew is too large:
> > [ 36.690441] clocksource: 'kvm-clock' wd_now:
> > 89885027c wd_last: 3ea987282 mask: ffffffffffffffff
> > [ 36.690994] clocksource: 'tsc' cs_now:
> > 95666ec22 cs_last: 84e747930 mask: ffffffffffffffff
> > [ 36.691901] tsc: Marking TSC unstable due to clocksource watchdog
>
> And the difference is? It's 10 seconds later and the detection happens
> on CPU1 and not on CPU0. I really don't see what you are reading out of
> this.

I didn't even try to interpret this. Just reporting what I'm seeing.

> Can you please describe the setup of this test?
>
>  - Host kernel version
>  - Guest kernel version
>  - Is the revert done on the host or guest or both?
>  - Test flow is:
>
>    Boot host, start guest, suspend host, resume host, guest is screwed
>
>    correct?

Yep.

Thanks,
Miklos