Hi Alex, Jan,

I forgot to mention that in the MSI-X failure case I do not see any
interrupts being allocated by the host (kvm module):
"grep kvm /proc/interrupts" is empty.

-Shashidhar

On Sat, Jan 14, 2012 at 12:16 PM, Shashidhar Patil <shashidhar.patil@xxxxxxxxx> wrote:
> Hi Alex, Jan,
>    I collected logs of kvm's processing of the PCI updates (attached to
> this mail). (I will try your suggestion soon.)
>
> The Linux kernel source below shows the MSI-X allocation being done with
> the MSI-X enable flag (PCI_MSIX_FLAGS_ENABLE) cleared, which works fine
> with kvm.
>
> static int msix_capability_init(struct pci_dev *dev,
>                                 struct msix_entry *entries, int nvec)
> {
>         int pos, ret;
>         u16 control;
>         void __iomem *base;
>
>         pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
>         pci_read_config_word(dev, pos + PCI_MSIX_FLAGS, &control);
>
>         /* Ensure MSI-X is disabled while it is set up */
>         control &= ~PCI_MSIX_FLAGS_ENABLE;
>         pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);
>
>         /* Request & Map MSI-X table region */
>         base = msix_map_region(dev, pos, multi_msix_capable(control));
>         if (!base)
>                 return -ENOMEM;
>
>         ret = msix_setup_entries(dev, pos, base, entries, nvec);
>         if (ret)
>                 return ret;
>
>         ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
>         if (ret)
>                 goto error;
>
>         /*
>          * Some devices require MSI-X to be enabled before we can touch the
>          * MSI-X registers. We need to mask all the vectors to prevent
>          * interrupts coming in before they're fully set up.
>          */
>         control |= PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE;
>         pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);
>
> On Sat, Jan 14, 2012 at 3:45 AM, Jan Kiszka <jan.kiszka@xxxxxx> wrote:
>> On 2012-01-13 22:56, Alex Williamson wrote:
>>> On Fri, 2012-01-13 at 22:33 +0100, Jan Kiszka wrote:
>>>> On 2012-01-13 22:05, Alex Williamson wrote:
>>>>> On Fri, 2012-01-13 at 22:00 +0100, Jan Kiszka wrote:
>>>>>> On 2012-01-04 04:21, Alex Williamson wrote:
>>>>>>> On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
>>>>>>>> Hi,
>>>>>>>>   I am running Ubuntu 10.10 (amd64) on a 2-socket Nehalem-based
>>>>>>>> server with IOH 5520. The 5520 supports VT-d.
>>>>>>>> I enabled DMAR with intel_iommu=on. The box has an Intel 82599
>>>>>>>> adapter, which I assigned through VT-d to FreeBSD 8.2 running
>>>>>>>> as the guest OS. The ixgbe driver detects the device and
>>>>>>>> successfully configures it, but the link never comes up. It looks
>>>>>>>> like link up/down interrupts are not delivered. Then I checked
>>>>>>>> the kvm interrupt assignment and, as expected, kvm could not make
>>>>>>>> MSI-X entries for the VT-d guest, so there is no output from
>>>>>>>> "grep kvm /proc/interrupts". By enabling some debug output in
>>>>>>>> qemu-kvm I figured out that the MSI-X updates are not received
>>>>>>>> properly. It looks like Linux updates the MSI-X table in a batch
>>>>>>>> fashion, which qemu-kvm gets in one shot, and everything works
>>>>>>>> fine in the case of Linux. In the case of FreeBSD, the PCIe
>>>>>>>> updates come per MSI-X entry, which qemu-kvm can't make use of.
>>>>>>>
>>>>>>> That's right, Linux and Windows both seem to set up the MSI-X table
>>>>>>> then enable it in one shot, so we only trigger the interrupt
>>>>>>> programming when the enable bit is set. We don't trigger changes on
>>>>>>> writes to the MSI-X table... not very accurate emulation of mask bits.
>>>>>>
>>>>>> According to the PCI spec, updates that happen while a vector is
>>>>>> unmasked need not be considered by the hardware (thus the hypervisor
>>>>>> here). Is that the scenario here?
>>>>>
>>>>> I'm assuming the vector is masked in the MSI-X table. So Linux/Windows
>>>>> do:
>>>>>
>>>>>  a) program MSI-X table
>>>>>  b) enable MSI-X in capability register
>>>>>
>>>>> Whereas FreeBSD does:
>>>>>
>>>>>  a) enable MSI-X in capability register (vectors masked in table)
>>>>>  b) program and unmask individual vectors
>>>>
>>>> That should work with the current code. It checks the number of vectors
>>>> on each config write, iterates the whole table, and then updates the
>>>  ^^^^^^^^^^^^^^^^^^^^
>>>> kernel configuration accordingly. It even requires the enable bit in the
>>>> cap register to be set before doing this.
>>>
>>> That's the problem, we only do it on config writes overlapping the MSI-X
>>> flags. We don't do anything for writes to the MSI-X table. It might be
>>> as simple as calling assigned_dev_update_msix() from msix_mmio_writel()
>>> when the mask bit is toggled. I'm not sure what might fall out of that
>>> though.
>>
>> Ah indeed. Now I recall having fixed this in my MSI-X refactoring
>> series. I introduced config notifiers that are triggered by the MSI-X
>> layer on every relevant modification, and the device assignment code
>> hooks the update function into this. I really need to dig into that
>> series soon again and refresh it.
>>
>> In the meantime, we could try what you suggest (if the cap enable bit is
>> set).
>>
>> Jan
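
For reference, the ordering difference discussed in this thread can be
illustrated with a small standalone C model. This is only a sketch under
stated assumptions, not qemu-kvm code: every toy_* name and the struct
below are hypothetical, toy_update_msix() merely stands in for the role
assigned_dev_update_msix() plays in the discussion, and the "Linux" and
"FreeBSD" sequences follow Alex's two-step summary rather than the real
drivers. The update_on_table_write flag models the suggested change of
rescanning the table when a mask bit is toggled while MSI-X is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NVEC 4

/* Toy model of an assigned device's MSI-X state (hypothetical layout). */
struct toy_msix_dev {
    bool enabled;                /* enable bit in the capability register */
    bool masked[TOY_NVEC];       /* per-vector mask bit in the MSI-X table */
    uint32_t data[TOY_NVEC];     /* per-vector message data, 0 = unprogrammed */
    int host_irqs;               /* vectors the host side has set up */
    bool update_on_table_write;  /* models the suggested fix */
};

static void toy_init(struct toy_msix_dev *d, bool fix)
{
    d->enabled = false;
    d->host_irqs = 0;
    d->update_on_table_write = fix;
    for (int i = 0; i < TOY_NVEC; i++) {
        d->masked[i] = true;     /* vectors start out masked */
        d->data[i] = 0;
    }
}

/* Rescan the whole table; only meaningful once MSI-X is enabled. */
static void toy_update_msix(struct toy_msix_dev *d)
{
    d->host_irqs = 0;
    if (!d->enabled)
        return;
    for (int i = 0; i < TOY_NVEC; i++)
        if (!d->masked[i] && d->data[i] != 0)
            d->host_irqs++;
}

/* Config-space write that sets the enable bit: always triggers a rescan. */
static void toy_config_write_enable(struct toy_msix_dev *d)
{
    d->enabled = true;
    toy_update_msix(d);
}

/* Write to one MSI-X table entry: rescans only if the fix is modelled. */
static void toy_table_write(struct toy_msix_dev *d, unsigned vec,
                            uint32_t data, bool masked)
{
    assert(vec < TOY_NVEC);
    bool toggled = (d->masked[vec] != masked);
    d->data[vec] = data;
    d->masked[vec] = masked;
    if (d->update_on_table_write && d->enabled && toggled)
        toy_update_msix(d);
}

/* a) program the table, b) enable MSI-X. Returns host vectors set up. */
int toy_linux_order(bool fix)
{
    struct toy_msix_dev d;
    toy_init(&d, fix);
    for (unsigned v = 0; v < TOY_NVEC; v++)
        toy_table_write(&d, v, 0x4000u + v, false);
    toy_config_write_enable(&d);
    return d.host_irqs;
}

/* a) enable MSI-X with vectors masked, b) program and unmask each vector. */
int toy_freebsd_order(bool fix)
{
    struct toy_msix_dev d;
    toy_init(&d, fix);
    toy_config_write_enable(&d);
    for (unsigned v = 0; v < TOY_NVEC; v++)
        toy_table_write(&d, v, 0x4000u + v, false);
    return d.host_irqs;
}
```

In this model, the enable-first ordering ends with no host vectors set up
unless table writes also trigger a rescan, which matches the empty
"grep kvm /proc/interrupts" output reported above. Whether the real change
has side effects is exactly the open question Alex raises.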