On Thu, Sep 4, 2014 at 7:31 PM, Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:
> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>> On Sep 2, 2014 11:53 PM, "Rusty Russell" <rusty@xxxxxxxxxxxxxxx> wrote:
>>>
>>> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>>> > There really are virtio devices that are pieces of silicon and not
>>> > figments of a hypervisor's imagination [1].
>>>
>>> Hi Andy,
>>>
>>> As you're discovering, there's a reason no one has done the DMA
>>> API before.
>>>
>>> So the problem is that ppc64's IOMMU is a platform thing, not a bus
>>> thing. They really do carve out an exception for virtio devices,
>>> because performance (LOTS of performance). It remains to be seen if
>>> other platforms have the same performance issues, but in absence of
>>> other evidence, the answer is yes.
>>>
>>> It's a hack. But having specific virtual-only devices are an even
>>> bigger hack.
>>>
>>> Physical virtio devices have been talked about, but don't actually exist
>>> in Real Life. And someone a virtio PCI card is going to have serious
>>> performance issues: mainly because they'll want the rings in the card's
>>> MMIO region, not allocated by the driver. Being broken on PPC is really
>>> the least of their problems.
>>>
>>> So, what do we do? It'd be nice if Linux virtio Just Worked under Xen,
>>> though Xen's IOMMU is outside the virtio spec. Since virtio_pci can be
>>> a module, obvious hacks like having xen_arch_setup initialize a dma_ops pointer
>>> exposed by virtio_pci.c is out.
>>
>> Xen does expose dma_ops. The trick is knowing when to use it.
>>
>>>
>>> I think the best approach is to have a new feature bit (25 is free),
>>> VIRTIO_F_USE_BUS_MAPPING which indicates that a device really wants to
>>> use the mapping for the bus it is on. A real device would set this,
>>> or it won't work behind an IOMMU. A Xen device would also set this.
>>
>> The devices I care about aren't actually Xen devices. They're devices
>> supplied by QEMU/KVM, booting a Xen hypervisor, which in turn passes
>> the virtio device (along with every other PCI device) through to dom0.
>> So this is exactly the same virtio device that regular x86 KVM guests
>> would see. The reason that current code fails is that Xen guest
>> physical addresses aren't the same as the addresses seen by the outer
>> hypervisor.
>>
>> These devices don't know that physical addresses != bus addresses, so
>> they can't advertise that fact.
>
> Ah, I see. Then we will need a Xen-specific hack.
>
>> Grr. This is mostly a result of the fact that virtio_pci devices
>> aren't really PCI devices. I still think that virtio_pci shouldn't
>> have to worry about this; ideally this would all be handled higher up
>> in the device hierarchy. x86 already gets this right.
>
> Yes. Adding a feature to say "I am a real PCI device" is possible, but
> has other issues (particularly as Michael Tsirkin pointed out, what do
> you do if the driver doesn't understand the feature).
>
>> Are there any hypervisors except PPC that use virtio_pci, have IOMMUs
>> on the pci slot that virtio_pci lives in, and that use physical
>> addressing? If not, I think that just quirking PPC will work (at
>> least until someone wants IOMMU support in virtio_pci on PPC, in which
>> case doing something using devicetree seems like a reasonable
>> solution).
>
> We can either patch to make PPC weird or make Xen weird. I'm on the
> fence.
>
> Two questions for Paulo:
> 1) When QEMU support IOMMU on x86, will the virtio devices behind it
>    respect the IOMMU (do they use the right memory access primitives?).
>
> 2) Are we really going to be able to exclude virtio devices from using
>    the x86 IOMMU in a portable way which will always work? If it's
>    per-bus granularity, will qemu really put them on their own PCI bus
>    and get this right? Or will it sometimes get it wrong and users will
>    end up using virtio devices via IOMMU by accident?
>
> If the answers are both "yes", then x86 is going to be able to use
> virtio+IOMMU, so PPC looks like the odd one out. Otherwise it looks
> like we're really going to want to stick with the "ignore IOMMU" rule
> until (handwave future), and we make an exception for Xen.

There's a third option: try to make virtio-mmio work everywhere
(except s390), at least in the long run. This has other benefits: it
makes minimal hypervisors simpler, and I think it'll get rid of the
limits on the number of virtio devices in a system. ARM is already
going in this direction, and I imagine that PPC support would be
straightforward (it's already using devicetree).

Does virtio-mmio have any reasonable way of doing hotplug?

It could also eventually make sense to have a standard for virtio on
virtio.

--Andy
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
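
For anyone skimming the thread, here is a rough, self-contained sketch
of the feature-bit idea quoted above and of why it does not help in the
QEMU-under-Xen case. It is not code from any kernel tree. Only the name
VIRTIO_F_USE_BUS_MAPPING and the bit number 25 come from the proposal;
the struct, the sketch_* helpers, and the fixed-offset toy translation
are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit number as proposed in the thread; not an assigned virtio feature. */
#define VIRTIO_F_USE_BUS_MAPPING 25

/* Hypothetical stand-in for whatever the platform's bus mapping does
 * (an IOMMU mapping, or Xen's guest-physical to machine translation). */
typedef uint64_t (*sketch_bus_map_fn)(uint64_t guest_phys);

/* Hypothetical, heavily simplified view of a virtio device. */
struct sketch_vdev {
    uint64_t features;          /* feature bits offered by the device */
    sketch_bus_map_fn bus_map;  /* NULL if bus address == physical address */
};

static bool sketch_has_feature(const struct sketch_vdev *dev, unsigned int bit)
{
    assert(dev != NULL);
    return ((dev->features >> bit) & 1u) != 0;
}

/* Pick the address the driver would write into a ring descriptor. */
static uint64_t sketch_ring_addr(const struct sketch_vdev *dev,
                                 uint64_t guest_phys)
{
    if (sketch_has_feature(dev, VIRTIO_F_USE_BUS_MAPPING) && dev->bus_map)
        return dev->bus_map(guest_phys);  /* go through the bus mapping */
    return guest_phys;                    /* current rule: ignore the IOMMU */
}

/* Toy translation: pretend bus addresses sit at a fixed offset. */
static uint64_t sketch_toy_map(uint64_t guest_phys)
{
    return guest_phys + 0x100000000ULL;
}
```

A device that sets the bit gets sketch_toy_map() applied to its ring
addresses. A QEMU-provided device passed through to a Xen dom0 leaves
the bit clear, so sketch_ring_addr() hands back the guest physical
address even though a translation is needed, which is the failure
described above.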