Changes since v1:
- Invalidate TSC page from kvm_gen_update_masterclock() instead of calling
  kvm_hv_setup_tsc_page() for all vCPUs [Paolo]
- Set hv->hv_tsc_page_status = HV_TSC_PAGE_UNSET when TSC page is disabled
  with MSR write. Check both HV_TSC_PAGE_BROKEN/HV_TSC_PAGE_UNSET states in
  kvm_hv_setup_tsc_page()/kvm_hv_invalidate_tsc_page().
- Check for HV_TSC_PAGE_SET state instead of '!hv->tsc_ref.tsc_sequence' in
  get_time_ref_counter().

Original description:

I'm investigating an issue where a Linux guest on nested Hyper-V on KVM
(WSL2 on Win10 on KVM to be precise) hangs after L1 KVM is migrated. The
trace shows that L2 is trying to set L1's Synthetic Timer and, reacting to
this, Hyper-V sets a Synthetic Timer in KVM, but the target value it sets is
always slightly in the past. This causes the timer to expire immediately,
and an interrupt storm is thus observed. L2 is not making much forward
progress.

The issue is only observed when re-enlightenment is exposed to L1. KVM
doesn't really support re-enlightenment notifications upon migration;
userspace is supposed to expose it only when TSC scaling is supported on
the destination host. Without re-enlightenment exposed, Hyper-V will not
expose stable TSC page clocksource to its L2s. The issue is observed when
migration happens between hosts supporting TSC scaling. Rumor has it that
it is possible to reproduce the problem even when migrating locally to the
same host, though I wasn't really able to.

The current speculation is that when Hyper-V is migrated, it uses stale
(cached) TSC page values to compute the difference between its own
clocksource (provided by KVM) and its guests' TSC pages to program
synthetic timers and, in some cases, when the TSC page is updated, this
puts all stimer expirations in the past. This, in turn, causes interrupt
storms (both L0->L1 and L1->L2, as Hyper-V mirrors stimer expirations into
L2).

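To illustrate the speculation above, here is a minimal user-space sketch
(not KVM code). It assumes the Hyper-V reference TSC page formula,
ref_time = ((tsc * scale) >> 64) + offset, and models the suspected
failure: a deadline computed from cached parameters is compared against
"now" computed from updated ones. The struct and function names are
hypothetical and exist only for this example; it also relies on the
GCC/Clang unsigned __int128 extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical copy of the scale/offset pair published in a TSC page. */
struct tsc_page_params {
	uint64_t scale;		/* 64.64 fixed-point multiplier */
	int64_t offset;		/* added after scaling, 100ns units */
};

/* Reference time in 100ns units: ((tsc * scale) >> 64) + offset. */
static uint64_t ref_time(const struct tsc_page_params *p, uint64_t tsc)
{
	unsigned __int128 prod;

	assert(p);
	prod = (unsigned __int128)tsc * p->scale;
	return (uint64_t)(prod >> 64) + (uint64_t)p->offset;
}

/*
 * Assumed failure model: the deadline is derived from stale (cached)
 * parameters, while the expiry check uses the fresh ones. Returns 1 if
 * the timer would already be expired at the moment it is programmed.
 */
static int timer_already_expired(const struct tsc_page_params *stale,
				 const struct tsc_page_params *fresh,
				 uint64_t tsc, uint64_t delay)
{
	uint64_t deadline = ref_time(stale, tsc) + delay;
	uint64_t now = ref_time(fresh, tsc);

	return deadline <= now;
}
```

With scale = 1ULL << 63 (a multiplier of 0.5), tsc = 2000000 and
delay = 10000, a stale offset of 0 against a fresh offset of 50000 gives a
deadline of 1010000 versus a current time of 1050000, so the timer is
already expired when it is set. With identical parameters on both sides
the deadline stays in the future.
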
The proposed fix is to skip updating TSC page clocksource when guest opted
for re-enlightenment notifications (PATCH4). Patches 1-3 are slightly
related fixes to the (mostly theoretical) issues I've stumbled upon while
working on the problem.

Vitaly Kuznetsov (4):
  KVM: x86: hyper-v: Limit guest to writing zero to
    HV_X64_MSR_TSC_EMULATION_STATUS
  KVM: x86: hyper-v: Prevent using not-yet-updated TSC page by secondary
    CPUs
  KVM: x86: hyper-v: Track Hyper-V TSC page status
  KVM: x86: hyper-v: Don't touch TSC page values when guest opted for
    re-enlightenment

 arch/x86/include/asm/kvm_host.h | 10 ++++
 arch/x86/kvm/hyperv.c           | 91 +++++++++++++++++++++++++++++----
 arch/x86/kvm/hyperv.h           |  1 +
 arch/x86/kvm/x86.c              |  2 +
 4 files changed, 94 insertions(+), 10 deletions(-)

-- 
2.30.2