Re: [bug report] KVM: SVM: Set target pCPU during IRTE update if target vCPU is running

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



+Suravee and Joerg

On Fri, Jul 19, 2024, Dan Carpenter wrote:
> Hello Sean Christopherson,
> 
> Commit f3cebc75e742 ("KVM: SVM: Set target pCPU during IRTE update if
> target vCPU is running") from Aug 8, 2023 (linux-next), leads to the
> following Smatch static checker warning:
> 
> 	arch/x86/kvm/svm/avic.c:841 svm_ir_list_add()
> 	error: we previously assumed 'pi->ir_data' could be null (see line 804)
> 
> arch/x86/kvm/svm/avic.c
>     792 static int svm_ir_list_add(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
>     793 {
>     794         int ret = 0;
>     795         unsigned long flags;
>     796         struct amd_svm_iommu_ir *ir;
>     797         u64 entry;
>     798 
>     799         /**
>     800          * In some cases, the existing irte is updated and re-set,
>     801          * so we need to check here if it's already been * added
>     802          * to the ir_list.
>     803          */
>     804         if (pi->ir_data && (pi->prev_ga_tag != 0)) {
>                     ^^^^^^^^^^^
> The old code checks for NULL
> 
>     805                 struct kvm *kvm = svm->vcpu.kvm;
>     806                 u32 vcpu_id = AVIC_GATAG_TO_VCPUID(pi->prev_ga_tag);
>     807                 struct kvm_vcpu *prev_vcpu = kvm_get_vcpu_by_id(kvm, vcpu_id);
>     808                 struct vcpu_svm *prev_svm;
>     809 
>     810                 if (!prev_vcpu) {
>     811                         ret = -EINVAL;
>     812                         goto out;
>     813                 }
>     814 
>     815                 prev_svm = to_svm(prev_vcpu);
>     816                 svm_ir_list_del(prev_svm, pi);
>     817         }
>     818 
>     819         /**
>     820          * Allocating new amd_iommu_pi_data, which will get
>     821          * add to the per-vcpu ir_list.
>     822          */
>     823         ir = kzalloc(sizeof(struct amd_svm_iommu_ir), GFP_KERNEL_ACCOUNT);
>     824         if (!ir) {
>     825                 ret = -ENOMEM;
>     826                 goto out;
>     827         }
>     828         ir->data = pi->ir_data;
>     829 
>     830         spin_lock_irqsave(&svm->ir_list_lock, flags);
>     831 
>     832         /*
>     833          * Update the target pCPU for IOMMU doorbells if the vCPU is running.
>     834          * If the vCPU is NOT running, i.e. is blocking or scheduled out, KVM
>     835          * will update the pCPU info when the vCPU awkened and/or scheduled in.
>     836          * See also avic_vcpu_load().
>     837          */
>     838         entry = READ_ONCE(*(svm->avic_physical_id_cache));
>     839         if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK)
>     840                 amd_iommu_update_ga(entry & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK,
> --> 841                                     true, pi->ir_data);
>                                                   ^^^^^^^^^^^
> The patch adds an unchecked dereference.  It could be a false positive if
> AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK implies that ->ir_data is non-NULL.  In
> that case could you just send an email saying "this is a false positive" and
> I'll ignore this warning going forward.

Hmm, yes and no.  AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK does kinda sorta gurantee
->ir_data is non-NULL, because avic_vcpu_load() will pass the same pointer to
amd_iommu_update_ga(), i.e. if ->ir_data is NULL, a crash would already have
happened.

However, I don't understand why _this_ code checks for a non-NULL pi->ir_data.
Or rather, I don't understand why that isn't considered a WARN-able failure.

Generally speaking, pi->ir_data is guaranteed to be valid, as the sole caller
checks that setting affinity succeeded:

			ret = irq_set_vcpu_affinity(host_irq, &pi);
			if (!ret && pi.is_guest_mode)
				svm_ir_list_add(svm, &pi);

but there is one path in amd_ir_set_vcpu_affinity() when it returns "success"
but doesn't set ir_data.

  static int amd_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info)
  {
	int ret;
	struct amd_iommu_pi_data *pi_data = vcpu_info;
	struct vcpu_data *vcpu_pi_info = pi_data->vcpu_data;
	struct amd_ir_data *ir_data = data->chip_data;
	struct irq_2_irte *irte_info = &ir_data->irq_2_irte;
	struct iommu_dev_data *dev_data;

	if (ir_data->iommu == NULL)
		return -EINVAL;

	dev_data = search_dev_data(ir_data->iommu, irte_info->devid);

	/* Note:
	 * This device has never been set up for guest mode.
	 * we should not modify the IRTE
	 */
	if (!dev_data || !dev_data->use_vapic) <===
		return 0;

	ir_data->cfg = irqd_cfg(data);
	pi_data->ir_data = ir_data;


That seems like it should be an error.  And then KVM should WARN if svm_ir_list_add()
is called with a NULL pi->ir_data.

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b19e8c0f48fa..e08d28f133d3 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3687,7 +3687,7 @@ static int amd_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info)
         * we should not modify the IRTE
         */
        if (!dev_data || !dev_data->use_vapic)
-               return 0;
+               return -EINVAL;
 
        ir_data->cfg = irqd_cfg(data);
        pi_data->ir_data = ir_data;




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux