From: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> Sent: Thursday, November 3, 2022 12:06 PM > > Commit e5d9b714fe40 ("x86/hyperv: fix root partition faults when writing > to VP assist page MSR") moved 'wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE)' under > 'if (*hvp)' condition. This works for root partition as hv_cpu_die() > does memunmap() and sets 'hv_vp_assist_page[cpu]' to NULL but breaks > non-root partitions as hv_cpu_die() doesn't free 'hv_vp_assist_page[cpu]' > for them. Did you consider having hv_cpu_die() free the VP assist page and set hv_vp_assist_page[cpu] to NULL in the non-root case? That would make the root and non-root cases more consistent, and it would make hv_cpu_init() and hv_cpu_die() more symmetrical. The hv_cpu_die() path frees up pretty much all the other per-CPU resources. I don't know why it keeps the VP assist page for re-use if the CPU comes back online later. You added the original code for allocating the vp_assist_page in commit a46d15cc1a, so maybe you remember if there are any gotchas. :-) Michael > This causes VP assist page to remain unset after CPU > offline/online cycle: > > $ rdmsr -p 24 0x40000073 > 10212f001 > $ echo 0 > /sys/devices/system/cpu/cpu24/online > $ echo 1 > /sys/devices/system/cpu/cpu24/online > $ rdmsr -p 24 0x40000073 > 0 > > Fix the issue by always writing to HV_X64_MSR_VP_ASSIST_PAGE in > hv_cpu_init(). Note, checking 'if (!*hvp)', for root partition is > pointless as hv_cpu_die() always sets 'hv_vp_assist_page[cpu]' to > NULL (and it's also NULL initially). > > Note: the fact that 'hv_vp_assist_page[cpu]' is reset to NULL may > present a (potential) issue for KVM. While Hyper-V uses > CPUHP_AP_ONLINE_DYN stage in CPU hotplug, KVM uses > CPUHP_AP_KVM_STARTING > which comes earlier in CPU teardown sequence. It is theoretically > possible that Enlightened VMCS is still in use. It is unclear if the > issue is real and if using KVM with Hyper-V root partition is even > possible. > > While on it, drop the unneeded smp_processor_id() call from hv_cpu_init(). > > Fixes: e5d9b714fe40 ("x86/hyperv: fix root partition faults when writing to VP assist > page MSR") > Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> > --- > arch/x86/hyperv/hv_init.c | 54 +++++++++++++++++++-------------------- > 1 file changed, 26 insertions(+), 28 deletions(-) > > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c > index f49bc3ec76e6..a269049a43ce 100644 > --- a/arch/x86/hyperv/hv_init.c > +++ b/arch/x86/hyperv/hv_init.c > @@ -77,7 +77,7 @@ static int hyperv_init_ghcb(void) > static int hv_cpu_init(unsigned int cpu) > { > union hv_vp_assist_msr_contents msr = { 0 }; > - struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()]; > + struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu]; > int ret; > > ret = hv_common_cpu_init(cpu); > @@ -87,34 +87,32 @@ static int hv_cpu_init(unsigned int cpu) > if (!hv_vp_assist_page) > return 0; > > - if (!*hvp) { > - if (hv_root_partition) { > - /* > - * For root partition we get the hypervisor provided VP assist > - * page, instead of allocating a new page. > - */ > - rdmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64); > - *hvp = memremap(msr.pfn << > - > HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT, > - PAGE_SIZE, MEMREMAP_WB); > - } else { > - /* > - * The VP assist page is an "overlay" page (see Hyper-V TLFS's > - * Section 5.2.1 "GPA Overlay Pages"). Here it must be zeroed > - * out to make sure we always write the EOI MSR in > - * hv_apic_eoi_write() *after* the EOI optimization is disabled > - * in hv_cpu_die(), otherwise a CPU may not be stopped in the > - * case of CPU offlining and the VM will hang. > - */ > + if (hv_root_partition) { > + /* > + * For root partition we get the hypervisor provided VP assist > + * page, instead of allocating a new page. > + */ > + rdmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64); > + *hvp = memremap(msr.pfn << > HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT, > + PAGE_SIZE, MEMREMAP_WB); > + } else { > + /* > + * The VP assist page is an "overlay" page (see Hyper-V TLFS's > + * Section 5.2.1 "GPA Overlay Pages"). Here it must be zeroed > + * out to make sure we always write the EOI MSR in > + * hv_apic_eoi_write() *after* the EOI optimization is disabled > + * in hv_cpu_die(), otherwise a CPU may not be stopped in the > + * case of CPU offlining and the VM will hang. > + */ > + if (!*hvp) > *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO); > - if (*hvp) > - msr.pfn = vmalloc_to_pfn(*hvp); > - } > - WARN_ON(!(*hvp)); > - if (*hvp) { > - msr.enable = 1; > - wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64); > - } > + if (*hvp) > + msr.pfn = vmalloc_to_pfn(*hvp); > + > + } > + if (!WARN_ON(!(*hvp))) { > + msr.enable = 1; > + wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64); > } > > return hyperv_init_ghcb(); > -- > 2.38.1