Re: pci_enable_msix() fails with ENOMEM/EINVAL

Hi,
Following advice to turn off irqbalance, I reproduced the issue again, but
the failure has moved: now request_threaded_irq() fails with EBUSY.
According to the code, this can only come from the path
request_threaded_irq() -> __setup_irq().
In __setup_irq(), the only place where EBUSY can arise for us is here:
...
	raw_spin_lock_irqsave(&desc->lock, flags);
	old_ptr = &desc->action;
	old = *old_ptr;
	if (old) {
		/*
		 * Can't share interrupts unless both agree to and are
		 * the same type (level, edge, polarity). So both flag
		 * fields must have IRQF_SHARED set and the bits which
		 * set the trigger type must match. Also all must
		 * agree on ONESHOT.
		 */
		if (!((old->flags & new->flags) & IRQF_SHARED) ||
		    ((old->flags ^ new->flags) & IRQF_TRIGGER_MASK) ||
		    ((old->flags ^ new->flags) & IRQF_ONESHOT)) {
			old_name = old->name;
			goto mismatch;
		}

		/* All handlers must agree on per-cpuness */
		if ((old->flags & IRQF_PERCPU) !=
		    (new->flags & IRQF_PERCPU))
			goto mismatch;
...

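KVM calls request_threaded_irq() with flags == 0, so IRQF_SHARED is not set
and any second request for an IRQ that already has a handler takes the
mismatch path. Can it be that different KVM processes request the same IRQ?
How do KVM processes that are spawned simultaneously agree among themselves
on IRQ numbers?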
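
To make the reasoning concrete, here is a minimal userspace sketch of the check quoted above. It is not kernel code: can_share_irq() is a hypothetical helper I made up for illustration, and the flag values are what I believe include/linux/interrupt.h defines, so treat them as assumptions and check your tree.

```c
#include <errno.h>

/* Assumed values, as I read them in include/linux/interrupt.h. */
#define IRQF_TRIGGER_MASK 0x0000000fUL
#define IRQF_SHARED       0x00000080UL
#define IRQF_PERCPU       0x00000400UL
#define IRQF_ONESHOT      0x00002000UL

/*
 * Hypothetical helper mirroring the sharing test in __setup_irq():
 * returns 0 if a new handler may join an existing one on the same IRQ,
 * or -EBUSY if the "mismatch" path would be taken.
 */
static int can_share_irq(unsigned long old_flags, unsigned long new_flags)
{
	/* Both sides need IRQF_SHARED; trigger type and ONESHOT must match. */
	if (!((old_flags & new_flags) & IRQF_SHARED) ||
	    ((old_flags ^ new_flags) & IRQF_TRIGGER_MASK) ||
	    ((old_flags ^ new_flags) & IRQF_ONESHOT))
		return -EBUSY;

	/* All handlers must agree on per-cpuness. */
	if ((old_flags & IRQF_PERCPU) != (new_flags & IRQF_PERCPU))
		return -EBUSY;

	return 0;
}
```

With both sides passing flags == 0, as KVM does, can_share_irq(0, 0) returns -EBUSY because IRQF_SHARED is missing. So EBUSY here would mean the IRQ already had an action installed when the second request arrived.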

Could any of the KVM developers please look at this and comment, rather
than ignore this email?

This is starting to look like a real KVM issue.

Thanks,
Alex.


On Mon, Nov 19, 2012 at 5:18 PM, Alex Lyakas <alex@xxxxxxxxxxxxxxxxx> wrote:
> Greetings all,
> I am running Ubuntu-Precise 3.2.0-29-generic #46, with stock KVM ("QEMU
> emulator version 1.0 (qemu-kvm-1.0)") on a Dell R510 server. I have one
> dual-port Intel 82599 NIC, and I create 32 VFs on each port. I spawn
> virtual machines with KVM; each VM has 4 VFs attached (two from each PF).
>
> Once in a while, in particular when I spawn multiple VMs in parallel, I hit
> an issue where one of the VFs does not have an IRQ assigned to it. I check
> this in /proc/interrupts, looking for entries like
> "kvm:0000:03:14.6". In some cases, the entry for a particular VF is missing.
> As a result, the VF within the VM is non-functional.
>
> I debugged this issue further, by adding prints to kvm.ko code. I see that
> the failure happens in kvm_vm_ioctl_assigned_device/KVM_ASSIGN_DEV_IRQ path,
> which calls assigned_device_enable_host_msix() function, which calls
> pci_enable_msix(), which fails with EINVAL or with ENOMEM. This path is
> called twice for each VF.
>
> For the ENOMEM failure, I see that first pci_enable_msix() returns -12, and
> when kvm_vm_ioctl_set_msix_nr() is called again, it sees that
> adev->entries_nr != 0 and fails the call with EINVAL.
>
> I can reproduce it only when spawning roughly 8 to 10 VMs in parallel, and
> it doesn't happen every time. So this does not seem to be a resource
> shortage problem, but a race somewhere.
>
> I tested this with several versions of the ixgbe driver, including the
> in-tree version that comes with Precise. It reproduces with all of them.
>
> Can anybody advise on how to proceed debugging this issue?
>
> Thanks,
> Alex.
>
>