RE: [PATCH] x86/hyperv: Fix kexec panic/hang issues

Michael Kelley <mikelley@xxxxxxxxxxxxx> · Mon, 21 Dec 2020 23:33:13 +0000

From: Dexuan Cui <decui@xxxxxxxxxxxxx> Sent: Sunday, December 20, 2020 2:30 PM
> 
> Currently the kexec kernel can panic or hang due to 2 causes:
> 
> 1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
> VP Assist Pages when the kexec kernel runs. We ever fixed the same issue

Spurious word "ever".  And avoid first person "we".  Perhaps:

     The same issue is fixed for hibernation in commit ..... .  Now fix it for kexec.

> for hibernation in
> commit 421f090c819d ("x86/hyperv: Suspend/resume the VP assist page for hibernation")
> and now let's fix it for kexec.

Is the VP Assist Page getting cleared anywhere on the panic path?  We can
only do it for the CPU that runs panic(), but I don't think it is getting cleared
even for that CPU.   It is cleared only in hv_cpu_die(), and that's not called on
the panic path.

> 
> 2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
> are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
> between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
> can still try to access the hypercall page and cause panic. The workaround
> "hv_hypercall_pg = NULL;" in hyperv_cleanup() can't work reliably.

I would note that the comment in hv_suspend() is also incorrect on this
point.  Setting hv_hypercall_pg to NULL does not cause subsequent
hypercalls to fail safely.  The fast hypercalls don't test for it, and even if they
did test like hv_do_hypercall(), the test just creates a race condition.

> Move hyperv_cleanup() to the right place.
> 
> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> ---
>  arch/x86/hyperv/hv_init.c       |  4 ++++
>  arch/x86/include/asm/mshyperv.h |  2 ++
>  arch/x86/kernel/cpu/mshyperv.c  | 18 ++++++++++++++++++
>  drivers/hv/vmbus_drv.c          |  2 --
>  4 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index e04d90af4c27..4638a52d8eae 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -16,6 +16,7 @@
>  #include <asm/hyperv-tlfs.h>
>  #include <asm/mshyperv.h>
>  #include <asm/idtentry.h>
> +#include <linux/kexec.h>
>  #include <linux/version.h>
>  #include <linux/vmalloc.h>
>  #include <linux/mm.h>
> @@ -26,6 +27,8 @@
>  #include <linux/syscore_ops.h>
>  #include <clocksource/hyperv_timer.h>
> 
> +int hyperv_init_cpuhp;
> +
>  void *hv_hypercall_pg;
>  EXPORT_SYMBOL_GPL(hv_hypercall_pg);
> 
> @@ -401,6 +404,7 @@ void __init hyperv_init(void)
> 
>  	register_syscore_ops(&hv_syscore_ops);
> 
> +	hyperv_init_cpuhp = cpuhp;
>  	return;
> 
>  remove_cpuhp_state:
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index ffc289992d1b..30f76b966857 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -74,6 +74,8 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
> 
> 
>  #if IS_ENABLED(CONFIG_HYPERV)
> +extern int hyperv_init_cpuhp;
> +
>  extern void *hv_hypercall_pg;
>  extern void  __percpu  **hyperv_pcpu_input_arg;
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index f628e3dc150f..43b54bef5448 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -135,14 +135,32 @@ static void hv_machine_shutdown(void)
>  {
>  	if (kexec_in_progress && hv_kexec_handler)
>  		hv_kexec_handler();
> +
> +	/*
> +	 * Call hv_cpu_die() on all the CPUs, otherwise later the hypervisor
> +	 * corrupts the old VP Assist Pages and can crash the kexec kernel.
> +	 */
> +	if (kexec_in_progress && hyperv_init_cpuhp > 0)
> +		cpuhp_remove_state(hyperv_init_cpuhp);
> +
> +	/* The function calls stop_other_cpus(). */
>  	native_machine_shutdown();
> +
> +	/* Disable the hypercall page when there is only 1 active CPU. */
> +	if (kexec_in_progress)
> +		hyperv_cleanup();
>  }
> 
>  static void hv_machine_crash_shutdown(struct pt_regs *regs)
>  {
>  	if (hv_crash_handler)
>  		hv_crash_handler(regs);
> +
> +	/* The function calls crash_smp_send_stop(). */

Actually, crash_smp_send_stop() or smp_send_stop() has already been
called earlier by panic(), so there's already only a single CPU running at
this point.  crash_smp_send_stop() is called again in
native_machine_crash_shutdown(), but it has a flag to prevent it from
running again.

>  	native_machine_crash_shutdown(regs);
> +
> +	/* Disable the hypercall page when there is only 1 active CPU. */
> +	hyperv_cleanup();

Moving the call to hyperv_cleanup() in the panic path is OK, and it makes
the panic and kexec() paths more similar, but I don't think it is necessary.
As noted above, the other CPUs have already been stopped, so they shouldn't
be executing any hypercalls.  

>  }
>  #endif /* CONFIG_KEXEC_CORE */
>  #endif /* CONFIG_HYPERV */
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 502f8cd95f6d..d491fdcee61f 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2550,7 +2550,6 @@ static void hv_kexec_handler(void)
>  	/* Make sure conn_state is set as hv_synic_cleanup checks for it */
>  	mb();
>  	cpuhp_remove_state(hyperv_cpuhp_online);
> -	hyperv_cleanup();
>  };
> 
>  static void hv_crash_handler(struct pt_regs *regs)
> @@ -2566,7 +2565,6 @@ static void hv_crash_handler(struct pt_regs *regs)
>  	cpu = smp_processor_id();
>  	hv_stimer_cleanup(cpu);
>  	hv_synic_disable_regs(cpu);
> -	hyperv_cleanup();
>  };
> 
>  static int hv_synic_suspend(void)
> --
> 2.19.1