On Wed, Nov 16 2022 at 14:36, Jason Gunthorpe wrote:
> On Fri, Nov 11, 2022 at 02:56:50PM +0100, Thomas Gleixner wrote:
>> To support multiple MSI interrupt domains per device it is necessary to
>> segment the xarray MSI descriptor storage. Each domain gets up to
>> MSI_MAX_INDEX entries.
>
> This kinds of suggests that the new per-device MSI domains should hold
> this storage instead of per-device xarray?

No, really not. This would create random storage in random driver places
instead of having a central storage place which is managed by the core
code.

We've had that back in the days when every architecture had its own
magic place to store and manage interrupt descriptors. Seen that, mopped
it up and never want to go back.

> I suppose the reason to avoid this is because alot of the driver
> facing API is now built on vector index numbers that index this
> xarray?

That's one aspect, but as I demonstrate later, even for the IMS domains
which do not have a real requirement for 'index' you still need to have
a place to store the MSI descriptor and allocate storage space for it.

I really don't want to have random places doing that because then I
can't provide implicit MSI descriptor management, e.g. automatic
alloc/free, anymore and everything has to happen at the driver side.

The only reason why I still need to do that for PCI/MSI is to be able to
support the museum architectures which still depend on the arch_....()
interfaces from 20 years ago.

So if an IMS domain, which e.g. stores the MSI message in queue memory,
wants a new interrupt then it allocates it with MSI_ANY_INDEX, which
gives it the next free slot in the xarray section of the MSI domain.

This avoids having IDA, bitmap allocators or whatever at the driver
side. Having a virtual index number to track things does not affect the
flexibility of the driver side in any way. All the driver needs at the
very end is the interrupt number and the message itself.
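To illustrate the segmented storage and the MSI_ANY_INDEX behaviour
described above, here is a minimal userspace sketch. It is not the
kernel implementation: all sketch_* names, the tiny section size and the
plain bool array standing in for the xarray are assumptions made up for
this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values for illustration only, not the kernel's definitions. */
#define SKETCH_MAX_DOMAINS	2u
#define SKETCH_SECTION_SIZE	8u		/* slots per domain section */
#define SKETCH_ANY_INDEX	0xffffffffu	/* "give me the next free slot" */

/* One flat store per device, shared by all of its domains. */
struct sketch_store {
	bool used[SKETCH_MAX_DOMAINS * SKETCH_SECTION_SIZE];
};

/* Map (domain id, index) to a slot in the flat store. */
static unsigned int sketch_slot(unsigned int domid, unsigned int index)
{
	assert(domid < SKETCH_MAX_DOMAINS && index < SKETCH_SECTION_SIZE);
	return domid * SKETCH_SECTION_SIZE + index;
}

/*
 * Allocate a slot in the section of @domid. With SKETCH_ANY_INDEX the
 * first free slot of that section is used. Returns the index relative
 * to the section, or -1 if the request cannot be satisfied.
 */
static int sketch_alloc(struct sketch_store *s, unsigned int domid,
			unsigned int index)
{
	unsigned int i;

	if (domid >= SKETCH_MAX_DOMAINS)
		return -1;

	if (index != SKETCH_ANY_INDEX) {
		/* Caller asked for a fixed index, e.g. a hardware vector. */
		if (index >= SKETCH_SECTION_SIZE ||
		    s->used[sketch_slot(domid, index)])
			return -1;
		s->used[sketch_slot(domid, index)] = true;
		return (int)index;
	}

	/* Any index: search only within this domain's section. */
	for (i = 0; i < SKETCH_SECTION_SIZE; i++) {
		if (!s->used[sketch_slot(domid, i)]) {
			s->used[sketch_slot(domid, i)] = true;
			return (int)i;
		}
	}
	return -1;
}

/* Release a slot so a later "any index" request can reuse it. */
static void sketch_free(struct sketch_store *s, unsigned int domid,
			unsigned int index)
{
	if (domid < SKETCH_MAX_DOMAINS && index < SKETCH_SECTION_SIZE)
		s->used[sketch_slot(domid, index)] = false;
}
```

In this sketch the caller only ever passes a domain id and gets back a
small index. It needs neither its own allocator nor its own storage,
which is the point made above about keeping IDA or bitmap handling out
of drivers.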
> But on the other hand can we just say drivers using multiple domains
> are "new" and they should use some new style pointer based interface
> so we don't have to have arrays of things?

Then driver writers have to provide storage for the domain pointer and
care about teardown etc. Seriously? NO!

> At least, I'd like to understand a bit better the motivation for using
> a domain ID instead of a pointer.

The main motivation was to avoid device specific storage for the irq
domain pointers.

It would have started with PCI/MSI[X]: I would have had to add an
irqdomain pointer to struct pci_dev and then have the PCI core care
about it.

So we'd add that to everything and the world which utilizes per device
MSI domains, which is quite a few places outside of PCI in the ARM64
world and growing.

The msi_device_data struct, which is allocated on demand for MSI usage,
is the obvious point to store _and_ manage these things, i.e. managed
teardown etc.

Giving this up makes any change to the core code hard because you have
to chase all usage sites and mop them up. Just look at the ARM part of
this series, which is by now 40+ patches just to mop up the irqchip
core. There are still 25 PCI/MSI global irqdomains left.

> It feels like we are baking in several hard coded limits with this
> choice

Which ones?

The chosen array section size per domain is arbitrary and can be changed
at any given time. Though you have to exhaust 64k vectors per domain
first before we start debating that.

The number of irqdomains is not really hard limited either. It's trivial
enough to extend that number and once we hit 32 we can just stash them
away in the xarray. I pondered doing that right away, but that wastes
too much memory for now.

It really does not matter whether the domain creation results in a
number or in a pointer. Pointers are required for the inner workings of
the domain hierarchy but absolutely uninteresting for endpoint domains.
All you need there is a convenient way to create the domain and then
allocate/free interrupts as you see fit.

We agreed a year ago that we want to abstract most of these things away
for driver writers, that all they need is a simple way to create the
domains, and that the corresponding interrupt chip is mostly about
writing the MSI message to implementation defined storage and eventually
providing an implementation specific mask/unmask operation.

So what are you concerned about?

Thanks,

        tglx