> Sure. I know by the end of my patchset I have separated out hw numbers > from the linux irq numbers, so this should work for powerpc. Almost :) I have to look in more details but we have more nasty issues... Some of the things I need for powerpc is: - On Hypervisor machines, the firmware does it all... that is, it allocates hardware vectors, enables the stuff on the device, etc etc... all of it. The only step we might still add on top of it is mapping those hw vectors to linux irq numbers. (Note that my new core stuff for powerpc that I'll post has explcit mecanisms for doing just that ... mapping hw interrupts/controller pairs to linux irq numbers) - We need control on what is in the irq_desc->chip and irq_desc->chip->ops of MSIs. That should be filled by something in msi_ops and not always set to the functions that muck with the config space etc... (same goes for affinity, no need for an msi_ops callback for that, just let us set what is in the irq_desc->chip and possibly provide "helpers" that do the config space way that intel can use there). For all of these, on both the above (hypervisor), but also MPIC based platforms like the G5 etc..., we need to use the normal IRQ ops of the PIC for enable/disable/affinity, etc... - msi_ops shall be per PCI bus. We can have completely different MSI handling hardware on separate busses of a given machine. For example, an Apple quad G5 has a register that triggers any source on the main MPIC on the primary PCIe segment (the one used by the gfx card), and has a completely separate HT2000 bridge on hypertransport that generates HT interrupts. Those are some of the basic requirements. Plus of course it has to fit in what I'm currently coding of course :) > I would love to hear feedback on it though. > > So to be very clear what we mean, because I have gotten bitten in the > past. I understand the linux irq number to be: > a) An index in the irq_desc array. Yes. > b) An enumeration of the hardware interrupts sources. Hrm... > c) Human visible so ideally it is neither arbitrary, nor > very dynamic if the hardware is not. Hrm... I have neither b) nor c) nowadays on powerpc.... "linux" irq numbers are purely a virtual thing, that is an index in irq_desc array and something we give to drivers to do request_irq() from. They can map onto hw interrupts, MSI-like messages, environment interrupts, could be hypervisor messgaes, in fact, it could be anything that remotely looks like an interrupt and the concept of "hw vector" is very blurry here... every interrupt controller defines it's own hardware vector space. On pSeries, hardware vectors are fairly big numbers that can encode the geographical location of the slot where the device is connected to, on some other hypervisor, they are 64 bits "tokens" representing an hypervisor object that can send events, etc etc.... My remapper _tries_ to assign a linux irq number that is equal to the hardware number whenever possible (because it's indeed nicer that way) but there is no hard requirement nor anything like that here. I might even add a hook to /proc/interrupt to be able to display the HW infos for each interrupts. > Then there is the destination cookie (vector on x86) that is > available to the cpu when the interrupt is delivered. Not sure what you mean by "destination cookie". I suppose you talk about the hw vector. That is, whatever hardware (or hypervisor) number/object represents a given interrupt source. I call that the hardware number. > I think we are on a similar track but I'm not at all certain I like > the idea of a fully dynamic linux irq number except in cases like MSI > where your sources are dynamic. But I may be making the wrong > assumptions about what you are doing. I'm doing a fully dynamic numbering. However, on a machine with a static hardware setup (like a powermac with no MSIs), you'll always end up with the same virtual numbers after boot and they'll happen to match the hardware source numbers because I'm nice and my remapper "tries" to allocate the same number if possible. > I think implementations where we expose the hardware cookie instead > of an enumeration of irq sources like ia64 does, impedes debugging > because the irq number will be different between boots, or loads > of the kernel module. At the same time as long as we don't assume > the irq number is the hardware cookie I don't see any maintenance > problems with an implementation like that. I'm not sure I understood completely your above sentence but if you mean that doing full virtual makes debugging harder... well.. it might not help _some_ classes of problem, but mostly it can be solved by having a hook to display the HW infos on the same line in /proc/interrupts. My remapper also reserved linux numbers 0...15 to map then 1:1 to legacy interrupts when a 8259 is present in the machine and keeeps them reserved (un-requestable) if not. That avoids tons of problems with legacy drivers loading on machines like powermac with no legacy stuff. > What I do know is that on x86_64 the multiple levels of translation > from the firmwares notion of the irq number to linux different > notion of the irq number made the code overly complex and fragile. > Which is one of the things I address in this patchset. It was a bit dodgy on powerpc too, which is why I'm doing a nice layer to handle it in a way that should be solid enough. A hardware number can be anything in the context of a given controller (I call it "host", bad name maybe, could be "domain" instead, it's a bit different than the "chip" exposed by genirq because a given hw numbering domain may span several chips on some archs and a given hw controller may use several chip structures for different classes of interrupts). Essentially, when you "discover" an interrupt and needs to map it to a linux interrupt (for example, when you discover a PCI device and need to fill pci_dev->irq, or whatever else), you call irq_create_mapping(host, hw_number, trigger). I also provide additional helpers that automatically give you those 3 informations from the firmware for various sort of devices, that sort of thing. > But I would be happy to have a look, and I very much > hope we can our implementations working together. I'm trying to split the giant patch so I can post only the bit that adds the new stuff :) Due to the way I work, I always first remove the old stuff so that things don't build unless I've fixed them all but that sucks for posting patch so I need ot do a bit of cleanup :) Ben. - To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html