On Wed, Feb 11, 2009 at 03:37:43PM +1100, Michael Ellerman wrote: > > Would it be fair to say that those platforms are the same ones which > > implement arch_msi_check_device(), which already works for this case? > > Yes and yes but .. > > We can do some checking in arch_msi_check_device() and return a > "could-be-allocated" value from there[1]. > > But we still want to be able to return a "could-be-allocated" value from > arch_setup_msi_irqs(), without going through the business of setting up > msi_descs. > > Simply because when we do the firmware call to actually allocate the > MSIs, it might give us less than we ask for. At that point we can just > return that number. Hm, yes, OK. > We /could/ try to allocate in arch_msi_check_device(), see what we get, > deallocate, and return the number. But it's a) ugly as hell, and b) > still possible that when we come back to do the real allocation we get a > different number. Yes. I also considered that you could actually do the allocation in arch_msi_check_device() and then fill in during arch_setup_msi_irqs(), but you could leak if, say, ioremap_nocache() fails. So, I'm OK with this patch, given the change I suggested below is implemented. Now, onto more fun matters ... I've been working on support for systems with large numbers of MSI-X vectors. Here's the architecture interface I'm currently working with: extern int arch_msi_check_device(struct pci_dev *, int nvec, int type); extern int arch_setup_msi_irqs(struct pci_dev *, int nvec); extern int arch_setup_msix_irq(struct pci_dev *, int irq); extern void arch_teardown_msi_irqs(struct pci_dev *); and the MSI-X code looks like this: for (i = 0; i < nvec; i++) { unsigned irq, table_entry = entries[i].entry; struct msi_desc desc; ret = arch_setup_msix_irq(dev, table_entry); I now perceive this would throw a bit of a spanner in the works for ppc. How about the following API? /* * Set up nvec MSI interrupts. Returns 0 on success, a positive number if * it could have set up fewer interrupts than requested, or -Errno for other * errors */ int arch_msi_setup_irqs(struct pci_dev *, int nvec); void arch_msi_free_irqs(struct pci_dev *); /* * Allocate nvec MSI-X interrupts. Returns 0 on success, a positive number if * it could have set up fewer interrupts than requested, or -Errno for other * errors */ int arch_msix_alloc_irqs(struct pci_dev *, int nvec); /* * Program the @table_entry MSI-X interrupt on the card. Returns the Linux * interrupt number on success or -Errno if an error occurs. */ int arch_msix_setup_irq(struct pci_dev *, int table_entry); void arch_msix_free_irqs(struct pci_dev *); There's a lot more I can say about the heavily patched msi.c file I have here ... but perhaps I'll save that for another day. > > > @@ -43,7 +43,8 @@ int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) > > > list_for_each_entry(entry, &dev->msi_list, list) { > > > ret = arch_setup_msi_irq(dev, entry); > > > if (ret) > > > - return ret; > > > + /* Ensure the error code is always < 0 */ > > > + return min(ret, -ret); > > > } > > > > > > return 0; > > > > I really don't like this. If we're returning an error, we should return > > a sensible error value. Something like: > > > > if (ret < 0) > > return ret; > > if (ret > 0) > > return -ENOSPC; > > Fair enough. My reasoning was that negating it at least left some chance > that you could work out where it came from. But I'm not fussed, and I > admit it's a bit ugly. I have three points which contradict each other. Bear with me ;-) First, I just noticed that this was a patch to arch_setup_msi_irqs() so really we should return a count of how far we got through the list. Second, arch_setup_msi_irq() isn't permitted to return a number >0, so we don't really need this hunk at all. Third, I did push a git tree while on holiday that included eliminating arch_setup_msi_irqs() from this file and pushing it to all architectures. I told Jesse, but didn't send email about it (connectivity was intermittent and I couldn't really send email). I _intended_ to tell Jesse to pass it along, but clearly I didn't do that. Here's the git tree: http://git.kernel.org/?p=linux/kernel/git/willy/pci.git;a=shortlog;h=msi-20090126 There's several lines of development in there, so it's a bit messy. Oh well. -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html