On Thu, Jul 03, 2008 at 01:24:29PM +1000, Benjamin Herrenschmidt wrote:
...
> > Next, MSI requires that you assign a block of interrupts that is a power
> > of two in size (between 2^0 and 2^5), and aligned to at least that power
> > of two.
...
> > One thing I do want to be clear in the API is that the driver can ask
> > for any number of irqs, the pci layer will round up to the next power of
> > two if necessary.
>
> Well, that's where I'm not happy. The API shouldn't expose the
> "power-of-two" thing. The numbers shown to drivers aren't in the same
> space as the source numbers as seen by the HW on many architectures and
> thus don't need to have the same constraints.

The drivers have to deal with the limitations of the HW spec. In this
case it means they have to know they are getting a power-of-2 number of
interrupts. I think exposing this in the API is a requirement and not
optional.

> > I don't quite understand how IRQ affinity will work yet. Is it feasible
> > to redirect one interrupt from a block to a different CPU? I don't even
> > understand this on x86-64, let alone the other four architectures. I'm
> > OK with forcing all MSIs in the same block to move with the one that was
> > assigned a new affinity if that's the way it has to be done.
>
> It's very implementation specific. IE. On most powerpc implementations,
> MSI just route via a decoder to sources of the existing interrupt
> controller so we can control per-source affinity at that level.
> Some x86 seem to require different base addresses which makes it mostly
> impossible to spread them I believe (maybe that's why people came up
> with MSI-X ?)

Correct. MSI only has one address for multiple vectors and thus will
only target one CPU. MSI-X has address/vector pairs (1:1). If the
Local-APICs are able to redirect interrupts, then multiple CPUs can
process the interrupts.
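
To make the two constraints above concrete, here is a rough sketch in C.
It is only a model, not kernel code: the struct and function names
(msi_block, msix_vec, msi_round_up and so on) are made up for this mail,
and the MSI-X entry is simplified to just an address and a data word. It
shows the round-up to a power of two, the alignment of the block, and why
an MSI block has a single target while MSI-X vectors can differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model for illustration; not the kernel's structures. */

/* MSI: the whole block shares one address and one base data value. */
struct msi_block {
	uint64_t address;	/* single target for every vector in the block */
	uint16_t data;		/* base data, must be aligned to nvec */
	unsigned int nvec;	/* power of two between 1 and 32 */
};

/* MSI-X (simplified): every vector has its own address/data pair. */
struct msix_vec {
	uint64_t address;
	uint32_t data;
};

/* Round a request up to the next power of two; 0 if outside 1..32. */
unsigned int msi_round_up(unsigned int requested)
{
	unsigned int n = 1;

	if (requested == 0 || requested > 32)
		return 0;
	while (n < requested)
		n <<= 1;
	return n;
}

/* A block is usable only if its base data is aligned to its size. */
int msi_block_aligned(const struct msi_block *b)
{
	return (b->data & (b->nvec - 1)) == 0;
}

/* Data sent for vector i: the low log2(nvec) bits carry the index. */
uint16_t msi_data_for(const struct msi_block *b, unsigned int i)
{
	assert(i < b->nvec && msi_block_aligned(b));
	return (uint16_t)(b->data | i);
}

/* Every MSI vector goes to the same address, hence the same target. */
uint64_t msi_address_for(const struct msi_block *b, unsigned int i)
{
	(void)i;
	return b->address;
}

/* MSI-X vectors may each point somewhere different. */
uint64_t msix_address_for(const struct msix_vec *tbl, unsigned int i)
{
	return tbl[i].address;
}
```

With this model a driver asking for 3 interrupts gets a block of 4, and
all four share msi_address_for(); only the low bits of the data differ.
Whether a real API should hide or expose that rounding is exactly the
point under discussion here.
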
I expect this "HW interrupt redirection" is what the PCI committee
expected to be used. However, HP (and perhaps others) have HW which
didn't implement the "XTP" register (IIRC, that's the register required
to redirect interrupts by the Local-APIC), since one gets better
performance by "targeting" interrupts at specific CPUs.

hth,
grant