On Fri, 24 Apr 2015 19:18:44 +0100
Stuart Yoder <stuart.yoder@xxxxxxxxxxxxx> wrote:

Hi Stuart,

>
> > -----Original Message-----
> > From: Marc Zyngier [mailto:marc.zyngier@xxxxxxx]
> > Sent: Friday, April 24, 2015 11:44 AM
> > To: Will Deacon; Yoder Stuart-B08248
> > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; linux-pci@xxxxxxxxxxxxxxx; Arnd Bergmann; Hu Mingkai-B21284;
> > Zang Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
> >
> > On 24/04/15 17:18, Will Deacon wrote:
> > > On Wed, Apr 22, 2015 at 08:41:02PM +0100, Stuart Yoder wrote:
> > >>>> However, there is an improvement we envision as possible due to
> > >>>> the limited number of SMMU contexts (i.e. 64). If there are
> > >>>> 64 SMMU context registers it means that there is a max of
> > >>>> 64 software contexts where things can be isolated. But, say I have
> > >>>> an SRIOV card with 64 VFs, and I want to assign 8 of the VFs
> > >>>> to a KVM VM. Those 8 PCI devices could share the same
> > >>>> streamID/ITS-device-ID since they all share the same isolation
> > >>>> context.
> > >>>>
> > >>>> What would be nice is at the time the 8 VFS are being added
> > >>>> to the IOMMU domain is for the pcidevid -> streamID mapping
> > >>>> table to be updated dynamically. It simply lets us make
> > >>>> more efficient use of the limited streamIDs we have.
> > >>>>
> > >>>> I think it is this improvement that Minghuan had in mind
> > >>>> in this patch.
> > >>>
> > >>> Ok, but in this case it should be possible to use a single context bank for
> > >>> all of the VF streamIDs by configuring the appropriate SMR, no?
> > >>
> > >> Yes, but there are limited SMRs. In our case there are only
> > >> 128 SMRs in SMMU-500 and we have potentially way more masters than
> > >> that.
> > >
> > > Right, but you still only have 64 context banks at the end of the day, so do
> > > you really anticipate having more than 128 masters concurrently using the
> > > SMMU? If so, then we have devices sharing context banks so we could consider
> > > reusing SMRs across masters, but historically that's not been something that
> > > we've managed to solve.
> > >
> > >>> Wouldn't
> > >>> that sort of thing be preferable to dynamic StreamID assignment? It would
> > >>> certainly make life easier for the MSIs.
> > >>
> > >> It would be preferable, but given only 128 total stream IDS and
> > >> 64 context registers it's potentially an issue. On our LS2085 SoC it is
> > >> PCI and the fsl-mc bus (see description here:
> > >> https://lkml.org/lkml/2015/3/5/795) that potentially have way
> > >> more masters than streamIDS. So, for those busses we would essentially
> > >> view a streamID as a "context ID"-- each SMR is associated with
> > >> 1 context bank register.
> > >>
> > >> For PCI we have a programmable "PCI req ID"-to-"stream ID"
> > >> mapping table in the PCI controller that is dynamically
> > >> programmable.
> > >>
> > >> Looking at it like that means that we could have
> > >> any number of masters but only 64 "contexts"
> > >> and since the masters are all programmable it
> > >> seems feasible to envision doing some bus/vendor
> > >> specific set up when a device is added to an
> > >> IOMMU domain. One thing that would need to be conveyed
> > >> to the SMMU driver if doing dynamic streamID setup
> > >> is what streamIDs are available to be used.
> > >
> > > Ok, but this is going to make life difficult for the MSI people, I suspect.
> > >
> > > Marc...?
> >
> > We're really facing two conflicting requirements: in order to minimize
> > SMR usage, we want to alias multiple ReqIDs to a single StreamID
>
> Another reason could be the isolation characteristics of the
> hardware...see comment below about PCI bridges.
>
> > but in
> > order to efficiently deal with MSIs, we want to see discrete DeviceIDs
> > (the actual ReqIDs). I don't easily see how we reconcile the two.
> >
> > We can deal with the aliasing, provided that we extend the level of
> > quirkiness that pci_for_each_dma_alias can deal with. But that
> > doesn't solve any form of hotplug/SR-IOV behaviour.
> >
> > Somehow, we're going to end-up with grossly oversized ITTs, just to
> > accommodate for the fact that we have no idea how many MSIs we're
> > going to end-up needing. I'm not thrilled with that prospect.
>
> How can we avoid that in the face of hotplug?

Fortunately, hotplug is not always synonymous with aliasing. The ITS is
built around the hypothesis that aliasing doesn't happen, and that you
know upfront how many LPIs the device will be allowed to generate.

> And what are we really worried about regarding over-sized ITTs...bytes
> of memory saved?

That's one thing, yes. But more fundamentally, how do you size your MSI
capacity for a single alias? Do you evenly split your LPI space among
all possible aliases? Assuming 64 aliases and 16 bits of interrupt ID
space, you end up with 10 bits per alias. Is that always enough? Or do
you need something more fine-grained?

> A fundamental thing built into the IOMMU subsystem in Linux is
> representing iommu groups that can represent things like
> multiple PCI devices that for hardware reasons cannot
> be isolated (and the example I've seen given relates to
> devices behind PCI bridges).
>
> So, I think the thing we are facing here is that while the
> IOMMU subsystem has accounted for representing the isolation
> characteristics of a system with iommu groups, there is
> no corresponding "msi group" concept.
>
> In the SMMU/GIC-500-ITS world the iommu isolation
> ID (the stream ID) and the GIC-ITS device ID are in
> fact the same ID.

The DeviceID is the "MSI group" you mention. This is what provides
isolation at the ITS level.
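
To put numbers on the even-split question above, here is a minimal C
sketch. It assumes the interrupt ID space is carved into equal,
power-of-two sized ranges, one per alias. The helper names
(ilog2_u32, bits_per_alias) are hypothetical and not taken from the
ITS driver.

```c
/* Hypothetical helper: floor(log2(n)) for n > 0. */
static unsigned int ilog2_u32(unsigned int n)
{
	unsigned int bits = 0;

	while (n >>= 1)
		bits++;
	return bits;
}

/*
 * Hypothetical helper: interrupt ID bits left for each alias when an
 * ID space of id_bits bits is split evenly among nr_aliases aliases.
 * The alias count is rounded up to a power of two (an assumption of
 * this sketch) so that every alias gets an aligned, equal range.
 */
static unsigned int bits_per_alias(unsigned int id_bits,
				   unsigned int nr_aliases)
{
	unsigned int alias_bits = ilog2_u32(nr_aliases);

	if ((1u << alias_bits) < nr_aliases)
		alias_bits++;	/* round up to the next power of two */

	/* Nothing left if the aliases alone consume the whole space. */
	return alias_bits >= id_bits ? 0 : id_bits - alias_bits;
}
```

With the figures quoted above, bits_per_alias(16, 64) gives 10, i.e.
1024 LPIs per alias. A non-power-of-two count such as 65 aliases
already drops that to 9 bits under this scheme, which is one way the
even split turns out to be too coarse.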

> Is there some way we could sanely correlate IOMMU group creation
> (which establishes isolation granularity) with the creation
> of an ITT for the GIC-ITS?

The problem you have is that your ITT already exists before you start
"hotplugging" new devices. Take the following (made up) example:

System boots, device X is discovered, claims 64 MSIs. An ITT for device
X is allocated, and sized for 64 LPIs. SR-IOV kicks in, creates a new X'
function that is aliased to X, claiming another 64 MSIs. Fail.

What do we do here? The ITT is live (X is generating interrupts), and
there is no provision to resize it (I've come up with a horrible
scheme, but that could fail as well). The only sane option would be to
guess how many MSIs a given alias could possibly use. How wrong is this
guess going to be?

The problem we have is that IOMMU groups are dynamic, while ITT
allocation is completely static for a given DeviceID. The architecture
doesn't give you any mechanism to resize it, and I have the ugly
feeling that static allocation of the ID space to aliases is too
rigid...

	M.
-- 
Jazz is not dead. It just smells funny.
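
For reference, the made-up example above can be modelled with a toy
C sketch. This is purely illustrative and is not the kernel's ITT
data structure: struct toy_itt and its helpers are hypothetical names,
and the only property modelled is that the table is sized once and
cannot grow while it is live.

```c
#include <stdbool.h>

/* Toy model of an ITT: sized once at creation, never resized. */
struct toy_itt {
	unsigned int nr_entries;	/* fixed when the table is allocated */
	unsigned int nr_used;		/* entries already claimed */
};

/* Size the table for the first function seen behind a DeviceID. */
static void toy_itt_init(struct toy_itt *itt, unsigned int nr_msis)
{
	itt->nr_entries = nr_msis;
	itt->nr_used = 0;
}

/*
 * Claim nr_msis entries for a function sharing this DeviceID.
 * There is no resize path: if the request does not fit in what is
 * left, the claim fails and the table is left untouched.
 */
static bool toy_itt_claim(struct toy_itt *itt, unsigned int nr_msis)
{
	if (nr_msis > itt->nr_entries - itt->nr_used)
		return false;

	itt->nr_used += nr_msis;
	return true;
}
```

Initialising the table for 64 MSIs and claiming 64 for X succeeds; the
second claim of 64 for the aliased X' fails, which is the "Fail" step
in the example.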