On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote:
> On Fri, 21 Feb 2020 10:48:15 -0500
> "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
>
> > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:
> > > On Fri, 21 Feb 2020 14:27:27 +1100
> > > David Gibson <david@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:
> > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:
> > > > > > From a user's perspective it makes absolutely perfect sense to use the
> > > > > > bounce buffers when they are NEEDED.
> > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > > > really feels wrong. And obviously we have a severe performance issue
> > > > > > because of the indirections.
> > > > >
> > > > > The point is that the user should not have to specify iommu_platform.
> > > > > We need to make sure any new hypervisor (especially one that might require
> > > > > bounce buffering) always sets it,
> > > >
> > > > So, I have draft qemu patches which enable iommu_platform by default.
> > > > But that's really because of other problems with !iommu_platform, not
> > > > anything to do with bounce buffering or secure VMs.
> > > >
> > > > The thing is that the hypervisor *doesn't* require bounce buffering.
> > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > > > to know whether the guest needs bounce buffering.  As far as the
> > > > hypervisor and qemu are concerned that's a guest internal detail, it
> > > > just expects to get addresses it can access whether those are GPAs
> > > > (iommu_platform=off) or IOVAs (iommu_platform=on).
> > >
> > > I very much agree!
> > >
> > > >
> > > > > as was a rather bogus legacy hack
> > > >
> > > > It was certainly a bad idea, but it was a bad idea that went into a
> > > > public spec and has been widely deployed for many years.  We can't
> > > > just pretend it didn't happen and move on.
> > > >
> > > > Turning iommu_platform=on by default breaks old guests, some of which
> > > > we still care about.  We can't (automatically) do it only for guests
> > > > that need bounce buffering, because the hypervisor doesn't know that
> > > > ahead of time.
> > >
> > > Turning iommu_platform=on for virtio-ccw makes no sense whatsoever,
> > > because for CCW I/O there is no such thing as IOMMU and the addresses
> > > are always physical addresses.
> >
> > Fix the name then. The spec calls it ACCESS_PLATFORM now, which
> > makes much more sense.
>
> I don't quite get it. Sorry. Maybe I will revisit this later.

Halil, I think I can clarify this.

The "iommu_platform" flag doesn't necessarily have anything to do with
an iommu, although it often will.  Basically it means "access guest
memory via the bus's normal DMA mechanism" rather than "access guest
memory using GPA, because you're the hypervisor and you can do that".

For the case of ccw, both mechanisms end up being the same thing,
since CCW's normal DMA *is* untranslated GPA access.

For this reason, the flag in the spec was renamed to ACCESS_PLATFORM,
but the flag in qemu still has the old name.

AIUI, Michael is saying you could trivially change the name in qemu
(obviously you'd need to alias the old name to the new one for
compatibility).

Actually, the fact that ccw has no translation makes things easier for
you: you don't really have any impediment to turning ACCESS_PLATFORM
on by default, since it doesn't make any real change to how things
work.
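
To make the distinction above concrete, here is a rough sketch in C.
It is not qemu code: bus_dma_fn, ccw_dma, toy_iommu_dma, device_resolve
and demo are hypothetical names invented for illustration, and the
fixed-offset "IOMMU" mapping is made up.  The only thing taken from the
virtio spec is the feature bit number (33) for VIRTIO_F_ACCESS_PLATFORM.

```c
#include <assert.h>
#include <stdint.h>

/* Feature bit number of VIRTIO_F_ACCESS_PLATFORM in the virtio spec. */
#define VIRTIO_F_ACCESS_PLATFORM 33

/* Hypothetical: how a bus turns a DMA address into a guest physical address. */
typedef uint64_t (*bus_dma_fn)(uint64_t bus_addr);

/* ccw: normal DMA is untranslated GPA access, so this is the identity. */
static uint64_t ccw_dma(uint64_t addr)
{
    return addr;
}

/* Toy stand-in for a bus behind an IOMMU: a made-up fixed-offset mapping. */
static uint64_t toy_iommu_dma(uint64_t iova)
{
    return iova + 0x100000;
}

/*
 * Assumed device-side view: with the flag negotiated, addresses go through
 * the bus's normal DMA mechanism; without it, they are treated as GPAs.
 */
static uint64_t device_resolve(uint64_t features, bus_dma_fn bus_dma,
                               uint64_t addr)
{
    if (features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))
        return bus_dma(addr);
    return addr;
}

/* On ccw both settings resolve identically; on the toy IOMMU bus they differ. */
static void demo(void)
{
    const uint64_t on = 1ULL << VIRTIO_F_ACCESS_PLATFORM;
    const uint64_t off = 0;

    assert(device_resolve(on, ccw_dma, 0x2000) ==
           device_resolve(off, ccw_dma, 0x2000));
    assert(device_resolve(on, toy_iommu_dma, 0x2000) !=
           device_resolve(off, toy_iommu_dma, 0x2000));
}
```

With an identity bus translation, as assumed here for ccw, the flag
changes nothing about which memory is touched, which is why turning it
on by default should be harmless there.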

The remaining difficulty is that the virtio driver - since it can sit
on multiple buses - won't know this, and will reject the
ACCESS_PLATFORM flag, even though it could just do what it normally
does on ccw and it would work.

For that case, we could consider a hack in qemu where for virtio-ccw
devices *only* we allow the guest to nack the ACCESS_PLATFORM flag and
carry on anyway.  Normally we insist that the guest accept the
ACCESS_PLATFORM flag if offered, because on most platforms they
*don't* amount to the same thing.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization