> From: Michael Kelley (LINUX) <mikelley@xxxxxxxxxxxxx> > Sent: Tuesday, May 23, 2023 10:14 AM > To: KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang > <haiyangz@xxxxxxxxxxxxx>; wei.liu@xxxxxxxxxx; Dexuan Cui > <decui@xxxxxxxxxxxxx>; catalin.marinas@xxxxxxx; will@xxxxxxxxxx; > tglx@xxxxxxxxxxxxx; mingo@xxxxxxxxxx; bp@xxxxxxxxx; > dave.hansen@xxxxxxxxxxxxxxx; hpa@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; > linux-hyperv@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; > x86@xxxxxxxxxx > Cc: Michael Kelley (LINUX) <mikelley@xxxxxxxxxxxxx> > Subject: [PATCH v2 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling > when CPUs go online/offline > > These commits > > a494aef23dfc ("PCI: hv: Replace retarget_msi_interrupt_params with > hyperv_pcpu_input_arg") > 2c6ba4216844 ("PCI: hv: Enable PCI pass-thru devices in Confidential VMs") > > update the Hyper-V virtual PCI driver to use the hyperv_pcpu_input_arg > because that memory will be correctly marked as decrypted or encrypted > for all VM types (CoCo or normal). But problems ensue when CPUs in the > VM go online or offline after virtual PCI devices have been configured. > > When a CPU is brought online, the hyperv_pcpu_input_arg for that CPU is > initialized by hv_cpu_init() running under state CPUHP_AP_ONLINE_DYN. > But this state occurs after state CPUHP_AP_IRQ_AFFINITY_ONLINE, which > may call the virtual PCI driver and fault trying to use the as yet > uninitialized hyperv_pcpu_input_arg. A similar problem occurs in a CoCo > VM if the MMIO read and write hypercalls are used from state > CPUHP_AP_IRQ_AFFINITY_ONLINE. > > When a CPU is taken offline, IRQs may be reassigned in state > CPUHP_TEARDOWN_CPU. Again, the virtual PCI driver may fault trying to > use the hyperv_pcpu_input_arg that has already been freed by a > higher state. > > Fix the onlining problem by adding state CPUHP_AP_HYPERV_ONLINE > immediately after CPUHP_AP_ONLINE_IDLE (similar to > CPUHP_AP_KVM_ONLINE) > and before CPUHP_AP_IRQ_AFFINITY_ONLINE. Use this new state for > Hyper-V initialization so that hyperv_pcpu_input_arg is allocated > early enough. > > Fix the offlining problem by not freeing hyperv_pcpu_input_arg when > a CPU goes offline. Retain the allocated memory, and reuse it if > the CPU comes back online later. > > Signed-off-by: Michael Kelley <mikelley@xxxxxxxxxxxxx> > --- > > Changes in v2: > * Put CPUHP_AP_HYPERV_ONLINE before CPUHP_AP_KVM_ONLINE [Vitaly > Kuznetsov] Reviewed-by: Dexuan Cui <decui@xxxxxxxxxxxxx>