On Thu, Apr 16, 2020 at 11:29:59PM -0700, Dexuan Cui wrote:
> Unlike the other CPUs, CPU0 is never offlined during hibernation. So in the
> resume path, the "new" kernel's VP assist page is not suspended (i.e.
> disabled), and later when we jump to the "old" kernel, the page is not
> properly re-enabled for CPU0 with the allocated page from the old kernel.
>
> So far, the VP assist page is only used by hv_apic_eoi_write(). When the
> page is not properly re-enabled, hvp->apic_assist is always 0, so the
> HV_X64_MSR_EOI MSR is always written. This is not ideal with respect to
> performance, but Hyper-V can still correctly handle this.
>
> The issue is: the hypervisor can corrupt the old kernel memory, and hence
> sometimes cause unexpected behaviors, e.g. when the old kernel's non-boot
> CPUs are being onlined in the resume path, the VM can hang or be killed
> due to virtual triple fault.

I don't quite follow here. The first sentence is rather alarming -- why
would Hyper-V corrupt guest's memory (kernel or not)?

Secondly, code below only specifies cpu0. What does it do with non-boot
cpus on the resume path?

Wei.

>
> Fix the issue by calling hv_cpu_die()/hv_cpu_init() in the syscore ops.
>
> Without the fix, hibernation can fail at a rate of 1/300 ~ 1/500.
> With the fix, hibernation can pass a long-haul test of 2000 rounds.
>
> Fixes: 05bd330a7fd8 ("x86/hyperv: Suspend/resume the hypercall page for hibernation")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> ---
>  arch/x86/hyperv/hv_init.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index b0da5320bcff..4d3ce86331a3 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -72,7 +72,8 @@ static int hv_cpu_init(unsigned int cpu)
>  	struct page *pg;
>
>  	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
> -	pg = alloc_page(GFP_KERNEL);
> +	/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
> +	pg = alloc_page(GFP_ATOMIC);
>  	if (unlikely(!pg))
>  		return -ENOMEM;
>  	*input_arg = page_address(pg);
> @@ -253,6 +254,7 @@ static int __init hv_pci_init(void)
>  static int hv_suspend(void)
>  {
>  	union hv_x64_msr_hypercall_contents hypercall_msr;
> +	int ret;
>
>  	/*
>  	 * Reset the hypercall page as it is going to be invalidated
> @@ -269,12 +271,17 @@ static int hv_suspend(void)
>  	hypercall_msr.enable = 0;
>  	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
>
> -	return 0;
> +	ret = hv_cpu_die(0);
> +	return ret;
>  }
>
>  static void hv_resume(void)
>  {
>  	union hv_x64_msr_hypercall_contents hypercall_msr;
> +	int ret;
> +
> +	ret = hv_cpu_init(0);
> +	WARN_ON(ret);
>
>  	/* Re-enable the hypercall page */
>  	rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> @@ -287,6 +294,7 @@ static void hv_resume(void)
>  	hv_hypercall_pg_saved = NULL;
>  }
>
> +/* Note: when the ops are called, only CPU0 is online and IRQs are disabled. */
>  static struct syscore_ops hv_syscore_ops = {
>  	.suspend = hv_suspend,
>  	.resume = hv_resume,
> --
> 2.19.1
>
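
One possible reading of the commit message, sketched as a user-space toy
model rather than kernel code: if the VP assist page registered by the "new"
kernel stays enabled for CPU0 across the jump, the hypervisor may keep
writing to a page that the "old" kernel now uses for something else.
Everything named model_* below, struct vp_assist_msr and
resume_corrupts_old_kernel() are hypothetical names invented for this
illustration; only the ordering (hv_cpu_die() in suspend, hv_cpu_init() in
resume) comes from the patch, and the corruption mechanism itself is an
assumption, not something the patch text states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the per-CPU VP assist MSR: an enable bit and
 * the guest page the "hypervisor" will write through. */
struct vp_assist_msr {
	bool enabled;
	int *page;
};

static struct vp_assist_msr msr[NR_CPUS];

/* Models hv_cpu_init(): register this kernel's page for the CPU. */
static void model_cpu_init(unsigned int cpu, int *page)
{
	assert(cpu < NR_CPUS);
	msr[cpu].page = page;
	msr[cpu].enabled = true;
}

/* Models hv_cpu_die(): disable the page for the CPU. */
static void model_cpu_die(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	msr[cpu].enabled = false;
	msr[cpu].page = NULL;
}

/* Models the hypervisor updating whichever page is currently registered. */
static void model_hypervisor_write(unsigned int cpu, int value)
{
	assert(cpu < NR_CPUS);
	if (msr[cpu].enabled)
		*msr[cpu].page = value;
}

/* Returns true if memory owned by the old kernel was overwritten. */
bool resume_corrupts_old_kernel(bool apply_fix)
{
	int new_kernel_page = 0;	/* page allocated by the new kernel */
	int old_kernel_page = 0;	/* page allocated by the old kernel */

	model_cpu_die(0);			/* start from a clean state */
	model_cpu_init(0, &new_kernel_page);	/* new kernel boots on CPU0 */

	if (apply_fix)
		model_cpu_die(0);		/* syscore suspend before the jump */

	/* After the jump, the same memory holds unrelated old-kernel data. */
	new_kernel_page = 1234;

	if (apply_fix)
		model_cpu_init(0, &old_kernel_page);	/* syscore resume */

	model_hypervisor_write(0, 1);
	return new_kernel_page != 1234;
}
```

In this model, resume_corrupts_old_kernel(false) reports an overwrite and
resume_corrupts_old_kernel(true) does not. It says nothing about how the
non-boot CPUs are handled on the resume path, which is the second question
raised above.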