Hi John,

On 03/09/2019 15:09, John Garry wrote:
> Hi Marc, Bjorn, Thomas,
>
> We've come across a conflict with the kernel/pci msi code and GIC ITS
> driver on our arm64 system, whereby we can't unbind and re-bind a PCI
> device driver under special conditions. I'll explain...
>
> Our PCI device support 32 MSIs. The driver attempts to allocate msi
> vectors with min msi=17, max msi = 32, and affd.pre_vectors = 16. For
> our test we make nr_cpus = 1 (just anything less than 16).

Just to confirm: this PCI device is requiring Multi-MSI, right? As
opposed to MSI-X?

> We find that the pci/kernel msi code gives us 17 vectors, but the GIC
> ITS code reserves 32 lpi maps in its_irq_domain_alloc(). The problem
> then occurs when unbinding the driver in its_irq_domain_free() call,
> where we only clear bits for 17 vectors. So if we unbind the driver and
> then attempt to bind again, it fails.

Is this device, by any chance, sharing its requester-id with another
device? By being behind a bridge of some sort? There is some code to
deal with it, but I'm not sure it has ever been verified in anger...

> Where the fault lies, I can't say. Maybe the kernel msi code should
> always give power of 2 vectors - as I understand, the PCI spec mandates
> this. Or maybe the GIC ITS driver has a problem in the free path, as
> above. Or maybe the PCI driver should not be allowed to request !power
> of 2 min/max vectors.
>
> Opinion?

My hunch is that it is an ITS driver bug: the PCI layer is allowed to
give any number of MSIs to an endpoint driver, as long as they match the
requirements of the allocation for Multi-MSI. That's the responsibility
of the ITS driver.

If unbind/bind fails, it means that somehow we've missed the freeing of
the LPIs, which isn't good. Is the device common enough that I can try
and reproduce the issue? If there's a Linux driver somewhere, I can
always hack something in emulation and find out...

Thanks,

	M.
-- 
Jazz is not dead, it just smells funny...
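
For reference, the symptom described in the quoted report (32 LPIs reserved, only 17 bits cleared on free, second bind failing) can be modelled with a toy bitmap in userspace. This is only a sketch of the arithmetic, not the ITS driver's code: NR_LPIS, struct lpi_map and every function name below are made up for illustration, and it assumes the device's LPI range is exactly 32 entries and that regions are allocated power-of-2 aligned.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_LPIS 32	/* hypothetical: size of the device's LPI range */

/* Toy stand-in for a per-device LPI bitmap (not the kernel structure). */
struct lpi_map {
	bool used[NR_LPIS];
};

/* Round n up to the next power of two, as a Multi-MSI block must be. */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Find a free, naturally aligned region of 'count' entries (count must
 * be a power of two), mark it used and return its base, or -1 if no
 * such region exists.
 */
static int lpi_alloc_region(struct lpi_map *m, unsigned int count)
{
	for (unsigned int base = 0; base + count <= NR_LPIS; base += count) {
		bool is_free = true;

		for (unsigned int i = 0; i < count; i++)
			if (m->used[base + i])
				is_free = false;
		if (!is_free)
			continue;
		for (unsigned int i = 0; i < count; i++)
			m->used[base + i] = true;
		return (int)base;
	}
	return -1;
}

/* Clear 'count' entries starting at 'base'. */
static void lpi_free(struct lpi_map *m, unsigned int base, unsigned int count)
{
	for (unsigned int i = 0; i < count; i++)
		m->used[base + i] = false;
}

/*
 * Bind (reserve roundup_pow2(nvec) entries), unbind (free either nvec
 * or the rounded-up count), then bind again. Returns true if the second
 * bind finds a free region.
 */
static bool rebind_succeeds(unsigned int nvec, bool free_rounded)
{
	struct lpi_map m = { 0 };
	unsigned int reserved = roundup_pow2(nvec);
	int base = lpi_alloc_region(&m, reserved);

	if (base < 0)
		return false;
	lpi_free(&m, (unsigned int)base, free_rounded ? reserved : nvec);
	return lpi_alloc_region(&m, reserved) >= 0;
}
```

In this model rebind_succeeds(17, false) fails because entries 17..31 stay marked after the unbind, while rebind_succeeds(17, true) succeeds. That only shows where the two counts diverge; it does not settle which layer ought to change.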