From: Dexuan Cui <decui@xxxxxxxxxxxxx> Sent: Monday, December 21, 2020 5:04 PM
>
> > From: Michael Kelley
> > Sent: Monday, December 21, 2020 3:33 PM
> > From: Dexuan Cui
> > Sent: Sunday, December 20, 2020 2:30 PM
> > >
> > > Currently the kexec kernel can panic or hang due to 2 causes:
> > >
> > > 1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
> > > VP Assist Pages when the kexec kernel runs. We ever fixed the same issue
> >
> > Spurious word "ever".  And avoid first person "we".  Perhaps:
> >
> > The same issue is fixed for hibernation in commit ..... .  Now fix it for
> > kexec.
>
> Thanks! Will use this in v2.
>
> > > for hibernation in
> > > commit 421f090c819d ("x86/hyperv: Suspend/resume the VP assist page for
> > > hibernation")
> > > and now let's fix it for kexec.
> >
> > Is the VP Assist Page getting cleared anywhere on the panic path?  We can
>
> It's not.
>
> > only do it for the CPU that runs panic(), but I don't think it is getting cleared
> > even for that CPU.  It is cleared only in hv_cpu_die(), and that's not called on
> > the panic path.
>
> IMO kdump is different from the non-kdump kexec in that the kdump kernel
> runs without depending on the memory used by the first kernel, so it looks
> unnecessary to clear the first kernel's VP Assist Page (and the hypercall page).
> According to my test, the second kernel can re-enable the VP Assist Page and
> the hypercall page using different GPAs, without disabling the old pages first.

Ah yes. You are right. The kdump kernel must be using a disjoint area of
physical memory, so not clearing these per-CPU overlay pages shouldn't put
the kdump kernel at risk.

> Of course, in the future Hyper-V may require the guest to disable the pages
> first before trying to re-enable them, so I agree we'd better clear the pages
> in the first kernel like this:
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 4638a52d8eae..8022f51c9c05 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -202,7 +202,7 @@ void clear_hv_tscchange_cb(void)
>  }
>  EXPORT_SYMBOL_GPL(clear_hv_tscchange_cb);
>
> -static int hv_cpu_die(unsigned int cpu)
> +int hv_cpu_die(unsigned int cpu)
>  {
>  	struct hv_reenlightenment_control re_ctrl;
>  	unsigned int new_cpu;
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 30f76b966857..d090e781d216 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -76,6 +76,8 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
>  #if IS_ENABLED(CONFIG_HYPERV)
>  extern int hyperv_init_cpuhp;
>
> +int hv_cpu_die(unsigned int cpu);
> +
>  extern void *hv_hypercall_pg;
>  extern void __percpu **hyperv_pcpu_input_arg;
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 43b54bef5448..e54f8262bfe0 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -156,6 +156,9 @@ static void hv_machine_crash_shutdown(struct pt_regs *regs)
>  	if (hv_crash_handler)
>  		hv_crash_handler(regs);
>
> +	/* Only call this on the faulting CPU. */
> +	hv_cpu_die(raw_smp_processor_id());
> +
>  	/* The function calls crash_smp_send_stop(). */
>  	native_machine_crash_shutdown(regs);

Since we don't *need* to do this, I think it might be less risky to just
leave the code "as is".  But I'm OK either way.
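
For completeness, if we did want to clear the page on the crashing CPU
without exporting hv_cpu_die(), I think the minimal alternative is just the
VP Assist Page MSR write that hv_cpu_die() does.  Untested sketch only (the
helper name is made up, and it skips the per-cpu input page and
reenlightenment cleanup that hv_cpu_die() also handles):

/*
 * Untested sketch, not part of the patch: disable the VP Assist Page on
 * the crashing CPU only, mirroring the wrmsrl() done in hv_cpu_die().
 */
static void hv_crash_clear_vp_assist(void)
{
	int cpu = raw_smp_processor_id();

	if (hv_vp_assist_page && hv_vp_assist_page[cpu])
		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
}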

> > > 2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
> > > are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
> > > between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
> > > can still try to access the hypercall page and cause panic. The workaround
> > > "hv_hypercall_pg = NULL;" in hyperv_cleanup() can't work reliably.
> >
> > I would note that the comment in hv_suspend() is also incorrect on this
> > point.  Setting hv_hypercall_pg to NULL does not cause subsequent
> > hypercalls to fail safely.  The fast hypercalls don't test for it, and even if they
> > did test like hv_do_hypercall(), the test just creates a race condition.
>
> The comment in hv_suspend() should be correct because hv_suspend()
> is only called during hibernation from the syscore_ops path where only
> one CPU is active, e.g. for the suspend operation, it's called from
> state_store
>  hibernate
>   hibernation_snapshot
>    create_image
>     suspend_disable_secondary_cpus
>     syscore_suspend
>      hv_suspend
>
> It's similar for the resume operation:
> resume_store
>  software_resume
>   load_image_and_restore
>    hibernation_restore
>     resume_target_kernel
>      hibernate_resume_nonboot_cpu_disable
>      syscore_suspend
>       hv_suspend

I agree the hv_suspend() code is correct.  I read the second sentence of
the comment as being a more general statement that hypercalls could be
cleanly stopped by setting hv_hypercall_pg to NULL, which isn't true.

> > >  static void hv_machine_crash_shutdown(struct pt_regs *regs)
> > >  {
> > >  	if (hv_crash_handler)
> > >  		hv_crash_handler(regs);
> > > +
> > > +	/* The function calls crash_smp_send_stop(). */
> >
> > Actually, crash_smp_send_stop() or smp_send_stop() has already been
> > called earlier by panic(),
>
> This is true only when the Hyper-V host supports the feature
> HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE. On an old Hyper-V host
> without the feature, ms_hyperv_init_platform() doesn't set
> crash_kexec_post_notifiers, so crash_kexec_post_notifiers keeps its
> initial value "false", and panic() calls smp_send_stop() *after*
> __crash_kexec() (which calls machine_crash_shutdown() ->
> hv_machine_crash_shutdown()).

OK, I see your point.

> > so there's already only a single CPU running at
> > this point.  crash_smp_send_stop() is called again in
> > native_machine_crash_shutdown(), but it has a flag to prevent it from
> > running again.
> >
> > >  	native_machine_crash_shutdown(regs);
> > > +
> > > +	/* Disable the hypercall page when there is only 1 active CPU. */
> > > +	hyperv_cleanup();
> >
> > Moving the call to hyperv_cleanup() in the panic path is OK, and it makes
> > the panic and kexec() paths more similar, but I don't think it is necessary.
> > As noted above, the other CPUs have already been stopped, so they shouldn't
> > be executing any hypercalls.
>
> As I explained above, it's necessary for old Hyper-V hosts. :-)

Agreed.
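
Just to double-check that we end up in the same place: pulling the quoted
hunks together, my understanding is that hv_machine_crash_shutdown() in v2
will read roughly like this (leaving aside the optional hv_cpu_die() call
discussed above):

static void hv_machine_crash_shutdown(struct pt_regs *regs)
{
	if (hv_crash_handler)
		hv_crash_handler(regs);

	/* The function calls crash_smp_send_stop(). */
	native_machine_crash_shutdown(regs);

	/* Disable the hypercall page when there is only 1 active CPU. */
	hyperv_cleanup();
}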