On Fri, Apr 29, 2022 at 09:54:42AM -0300, Jason Gunthorpe wrote:
> On Fri, Apr 29, 2022 at 04:00:14PM +1000, David Gibson wrote:
> > > But I don't have a use case in mind? The simplified things I know
> > > about want to attach their devices then allocate valid IOVA, they
> > > don't really have a notion about what IOVA regions they are willing to
> > > accept, or necessarily do hotplug.
> >
> > The obvious use case is qemu (or whatever) emulating a vIOMMU. The
> > emulation code knows the IOVA windows that are expected of the vIOMMU
> > (because that's a property of the emulated platform), and requests
> > them of the host IOMMU. If the host can supply that, you're good
> > (this doesn't necessarily mean the host windows match exactly, just
> > that the requested windows fit within the host windows). If not,
> > you report an error. This can be done at any point when the host
> > windows might change - so try to attach a device that can't support
> > the requested windows, and it will fail. Attaching a device which
> > shrinks the windows, but still fits the requested windows within, and
> > you're still good to go.
>
> We were just talking about this in another area - Alex said that qemu
> doesn't know the IOVA ranges? Is there some vIOMMU cases where it does?

Uh.. what? We certainly know (or, rather, choose) the IOVA ranges for
ppc. That is to say we set up the default IOVA ranges at machine
construction (those defaults have changed with machine version a couple
of times). If the guest uses dynamic DMA windows we then update those
ranges based on the hypercalls, but at any point we know what the IOVA
windows are supposed to be.

I don't really see how x86 or anything else could not know the IOVA
ranges. Who else *could* set the ranges when implementing a vIOMMU in
TCG mode?

For the non-vIOMMU case, IOVA==GPA, so everything qemu knows about the
GPA space it also knows about the IOVA space. Which, come to think of
it, means memory hotplug also complicates things.

> Even if yes, qemu is able to manage this on its own - it doesn't use
> the kernel IOVA allocator, so there is not a strong reason to tell the
> kernel what the narrowed ranges are.

I don't follow. The problem for the qemu case here is if you hotplug a
device which narrows down the range to something smaller than the guest
expects. If qemu has told the kernel the ranges it needs, that can just
fail (which is the best you can do). If the kernel adds the device but
narrows the ranges, then you may have just put the guest into a
situation where the vIOMMU cannot do what the guest expects it to. If
qemu can only query the windows, not specify them, then it won't know
that adding a particular device will conflict with its guest-side
requirements until after it's already added. That could mess up
concurrent guest-initiated map operations for existing devices in the
same guest-side domain, so I don't think reversing the hotplug after
the problem is detected is enough.
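
To make concrete what I mean by "told the kernel the ranges it needs",
here's the rough shape of the call I'd want qemu to be able to make.
To be clear, this is purely illustrative: the ioctl, struct names and
ioctl number below are all invented for this sketch, nothing like it
exists in the posted series.

	/* Hypothetical uAPI, invented for this sketch: let userspace
	 * pin down IOVA windows an IOAS must keep honouring. */
	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <linux/types.h>

	struct iova_window {
		__aligned_u64 start;
		__aligned_u64 last;
	};

	struct ioas_require_iovas {	/* made-up name and layout */
		__u32 size;
		__u32 ioas_id;
		__u32 num_windows;
		__u32 __reserved;
		__aligned_u64 windows;	/* pointer to struct iova_window[] */
	};

	#define IOAS_REQUIRE_IOVAS _IO(';', 0x88)	/* made-up number */

	static int declare_guest_windows(int iommufd, __u32 ioas_id)
	{
		/* Windows the emulated platform promises the guest;
		 * the values here are just example placeholders. */
		struct iova_window windows[] = {
			{ .start = 0x00000000ULL,  .last = 0x7fffffffULL  },
			{ .start = 0x800000000ULL, .last = 0xfffffffffULL },
		};
		struct ioas_require_iovas cmd = {
			.size = sizeof(cmd),
			.ioas_id = ioas_id,
			.num_windows = 2,
			.windows = (uintptr_t)windows,
		};

		/* Succeeds only if the kernel can guarantee these
		 * windows; any later attach that would shrink them
		 * must fail instead of narrowing the ranges behind
		 * qemu's back. */
		return ioctl(iommufd, IOAS_REQUIRE_IOVAS, &cmd);
	}

The point is that the constraint lives in the kernel: once the windows
are declared, a hotplug that can't honour them fails at attach time
with an error qemu can report sensibly, rather than succeeding and
narrowing ranges that qemu only discovers by re-querying.
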
> > > That is one possibility, yes. qemu seems to be using this to establish
> > > a clone ioas of an existing operational one which is another usage
> > > model.
> >
> > Right, for qemu (or other hypervisors) the obvious choice would be to
> > create a "staging" IOAS where IOVA == GPA, then COPY that into the various
> > emulated bus IOASes. For a userspace driver situation, I'm guessing
> > you'd map your relevant memory pool into an IOAS, then COPY to the
> > IOAS you need for whatever specific devices you're using.
>
> qemu seems simpler, it juggled multiple containers so it literally
> just copies when it instantiates a new container and does a map in
> multi-container.

I don't follow you. Are you talking about the vIOMMU or non-vIOMMU
case? In the vIOMMU case the different containers can be for different
guest-side iommu domains with different guest-IOVA spaces, so you can't
just copy from one to another.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you. NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson