Michael Kelley <mikelley@xxxxxxxxxxxxx> writes: > These commits > > a494aef23dfc ("PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg") > 2c6ba4216844 ("PCI: hv: Enable PCI pass-thru devices in Confidential VMs") > > update the Hyper-V virtual PCI driver to use the hyperv_pcpu_input_arg > because that memory will be correctly marked as decrypted or encrypted > for all VM types (CoCo or normal). But problems ensue when CPUs in the > VM go online or offline after virtual PCI devices have been configured. > > When a CPU is brought online, the hyperv_pcpu_input_arg for that CPU is > initialized by hv_cpu_init() running under state CPUHP_AP_ONLINE_DYN. > But this state occurs after state CPUHP_AP_IRQ_AFFINITY_ONLINE, which > may call the virtual PCI driver and fault trying to use the as yet > uninitialized hyperv_pcpu_input_arg. A similar problem occurs in a CoCo > VM if the MMIO read and write hypercalls are used from state > CPUHP_AP_IRQ_AFFINITY_ONLINE. > > When a CPU is taken offline, IRQs may be reassigned in state > CPUHP_TEARDOWN_CPU. Again, the virtual PCI driver may fault trying to > use the hyperv_pcpu_input_arg that has already been freed by a > higher state. > > Fix the onlining problem by adding state CPUHP_AP_HYPERV_ONLINE > immediately after CPUHP_AP_ONLINE_IDLE (similar to CPUHP_AP_KVM_ONLINE) > and before CPUHP_AP_IRQ_AFFINITY_ONLINE. Use this new state for > Hyper-V initialization so that hyperv_pcpu_input_arg is allocated > early enough. > > Fix the offlining problem by not freeing hyperv_pcpu_input_arg when > a CPU goes offline. Retain the allocated memory, and reuse it if > the CPU comes back online later. > > Signed-off-by: Michael Kelley <mikelley@xxxxxxxxxxxxx> Very minor nitpick: I would've changed the Subject of the patch to something a bit more meaningful, e.g. "x86/hyperv: Avoid freeing per-CPU hyperv_pcpu_{input,output}_arg upon CPU offlining" or something like that. > --- > > Changes in v2: > * Put CPUHP_AP_HYPERV_ONLINE before CPUHP_AP_KVM_ONLINE [Vitaly > Kuznetsov] > > arch/x86/hyperv/hv_init.c | 2 +- > drivers/hv/hv_common.c | 48 +++++++++++++++++++++++----------------------- > include/linux/cpuhotplug.h | 1 + > 3 files changed, 26 insertions(+), 25 deletions(-) > > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c > index a5f9474..6c04b52 100644 > --- a/arch/x86/hyperv/hv_init.c > +++ b/arch/x86/hyperv/hv_init.c > @@ -416,7 +416,7 @@ void __init hyperv_init(void) > goto free_vp_assist_page; > } > > - cpuhp = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online", > + cpuhp = cpuhp_setup_state(CPUHP_AP_HYPERV_ONLINE, "x86/hyperv_init:online", > hv_cpu_init, hv_cpu_die); > if (cpuhp < 0) > goto free_ghcb_page; > diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c > index 64f9cec..542a1d5 100644 > --- a/drivers/hv/hv_common.c > +++ b/drivers/hv/hv_common.c > @@ -364,13 +364,20 @@ int hv_common_cpu_init(unsigned int cpu) > flags = irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL; > > inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg); > - *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags); > - if (!(*inputarg)) > - return -ENOMEM; > > - if (hv_root_partition) { > - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg); > - *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE; > + /* > + * hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory is already > + * allocated if this CPU was previously online and then taken offline > + */ > + if (!*inputarg) { > + *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags); Nitpick: I see this was in the original code but we could probably switch to kcalloc() here. > + if (!(*inputarg)) > + return -ENOMEM; > + > + if (hv_root_partition) { > + outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg); > + *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE; > + } > } > > msr_vp_index = hv_get_register(HV_REGISTER_VP_INDEX); > @@ -385,24 +392,17 @@ int hv_common_cpu_init(unsigned int cpu) > > int hv_common_cpu_die(unsigned int cpu) > { > - unsigned long flags; > - void **inputarg, **outputarg; > - void *mem; > - > - local_irq_save(flags); > - > - inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg); > - mem = *inputarg; > - *inputarg = NULL; > - > - if (hv_root_partition) { > - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg); > - *outputarg = NULL; > - } > - > - local_irq_restore(flags); > - > - kfree(mem); > + /* > + * The hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory > + * is not freed when the CPU goes offline as the hyperv_pcpu_input_arg > + * may be used by the Hyper-V vPCI driver in reassigning interrupts > + * as part of the offlining process. The interrupt reassignment > + * happens *after* the CPUHP_AP_HYPERV_ONLINE state has run and > + * called this function. > + * > + * If a previously offlined CPU is brought back online again, the > + * originally allocated memory is reused in hv_common_cpu_init(). > + */ > > return 0; > } > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > index 0f1001d..3ceb9df 100644 > --- a/include/linux/cpuhotplug.h > +++ b/include/linux/cpuhotplug.h > @@ -200,6 +200,7 @@ enum cpuhp_state { > > /* Online section invoked on the hotplugged CPU from the hotplug thread */ > CPUHP_AP_ONLINE_IDLE, > + CPUHP_AP_HYPERV_ONLINE, > CPUHP_AP_KVM_ONLINE, > CPUHP_AP_SCHED_WAIT_EMPTY, > CPUHP_AP_SMPBOOT_THREADS, Reviewed-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> -- Vitaly