Re: [PATCH v1] iommu/amd: Don't block updates to GATag if guest mode is already on

Missing Signed-off-by.

Also adding Vasant from AMD for review.

On Wed, Feb 08, 2023 at 01:19:38PM +0000, Joao Martins wrote:
> On KVM GSI routing table updates, especially for guests that have vIOMMUs
> with interrupt remapping enabled (e.g. to boot guests with >255 vCPUs
> without relying on KVM_FEATURE_MSI_EXT_DEST_ID), a VMM may update the
> backing VF MSIs with new vCPU affinities.
> 
> On AMD this translates to calls to amd_ir_set_vcpu_affinity() and
> eventually to amd_iommu_{de}activate_guest_mode() with a new GATag
> outlining the VM ID and (new) vCPU ID. On the vCPU blocking and unblocking
> paths KVM disables AVIC and relies on GALog to convey the wakeups to any
> sleeping vCPUs. KVM stores a list of GA-mode IR entries for each
> running/blocked vCPU. So any vCPU affinity update to a VF interrupt happens
> via KVM, and it changes already-configured guest-mode IRTEs with a new
> GATag.
> 
> The issue is that amd_iommu_activate_guest_mode() only changes IRTE
> fields on transitions from non-guest-mode to guest-mode and otherwise
> returns *with no changes to the IRTE* for interrupts already configured
> in guest mode. To the guest this means that the VF interrupts remain
> affined to the vCPU they were first configured for, and the guest will
> miss VF interrupts and log messages like these from spurious interrupts
> (e.g. from waking the wrong vCPU in GALog):
> 
> [  167.759472] __common_interrupt: 3.34 No irq handler for vector
> [  230.680927] mlx5_core 0000:00:02.0: mlx5_cmd_eq_recover:247:(pid 3122): Recovered 1 EQEs on cmd_eq
> [  230.681799] mlx5_core 0000:00:02.0: wait_func_handle_exec_timeout:1113:(pid 3122): cmd[0]: CREATE_CQ(0x400) recovered after timeout
> [  230.683266] __common_interrupt: 3.34 No irq handler for vector
> 
> Given that amd_ir_set_vcpu_affinity() uses amd_iommu_activate_guest_mode()
> underneath, vCPU affinity changes of IRTEs are no-ops if it was already
> called once for the IRTE (on VMENTER). Fix it by dropping the check for
> guest-mode in amd_iommu_activate_guest_mode(). The same applies to
> amd_iommu_deactivate_guest_mode() although, even if the IRTE doesn't
> change the underlying DestID on the host, the VFIO IRQ handler will still
> be able to poke at the right guest vCPU.
> 
> Fixes: b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
> Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
> ---
> Some notes on other related flaws as I looked at this:
> 
> 1) amd_iommu_deactivate_guest_mode() suffers from the same issue as the one
> fixed by this patch, but it should only matter when you rely on
> irqbalance-like daemons balancing VFIO IRQs in the hypervisor. It doesn't
> translate into guest failures though, more like performance "misdirection".
> Happy to fix it if folks also deem it a problem.
> 
> 2) This patch doesn't attempt to change the semantics of what
> amd_iommu_activate_guest_mode() has been doing for a long time [since v5.4]
> (i.e. clearing the whole IRTE and then setting its fields). As such, when
> updating the IRTEs the interrupts get isRunning and DestId cleared, thus
> we rely on the GALog to inject IRQs into vCPUs /until/ the vCPUs block
> and unblock again (which is when they update the IOMMU affinity), or the
> AVIC gets momentarily disabled. I have patches that improve this part as a
> follow-up, but I thought that this patch has value on its own for fixing
> what has been broken since v5.4 ... and that it could be easily carried
> to stable trees.
> 
> ---
>  drivers/iommu/amd/iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index cbeaab55c0db..afe1f35a4dd9 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -3476,7 +3476,7 @@ int amd_iommu_activate_guest_mode(void *data)
>  	u64 valid;
>  
>  	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
> -	    !entry || entry->lo.fields_vapic.guest_mode)
> +	    !entry)
>  		return 0;
>  
>  	valid = entry->lo.fields_vapic.valid;
> -- 
> 2.17.2
> 
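
For readers following the thread, the effect of the dropped guest_mode check
can be sketched with a toy model. This is only an illustration under stated
assumptions: struct toy_irte and both toy_activate_*() helpers are
hypothetical stand-ins, not the kernel's IRTE layout or the real
amd_iommu_activate_guest_mode(), which also handles the valid bit and the
other IRTE fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a GA-mode IRTE. */
struct toy_irte {
	bool guest_mode;	/* entry already configured for guest mode */
	uint32_t ga_tag;	/* encodes VM ID and vCPU ID in the real IRTE */
};

/* Models the pre-patch guard: an already-guest-mode entry is left untouched. */
static int toy_activate_old(struct toy_irte *entry, uint32_t new_tag)
{
	if (!entry || entry->guest_mode)
		return 0;

	entry->guest_mode = true;
	entry->ga_tag = new_tag;
	return 0;
}

/* Models the patched guard: only a NULL entry short-circuits the update. */
static int toy_activate_new(struct toy_irte *entry, uint32_t new_tag)
{
	if (!entry)
		return 0;

	entry->guest_mode = true;
	entry->ga_tag = new_tag;
	return 0;
}
```

Calling toy_activate_old() twice with different tags leaves the first tag in
place, which mirrors the stale vCPU affinity described in the commit message;
toy_activate_new() ends up with the second tag.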


