On Tue, 2015-11-10 at 13:04 +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2015-11-09 at 16:46 -0800, Andy Lutomirski wrote:
> > The problem here is that in some of the problematic cases the virtio
> > driver may not even be loaded. If someone runs an L1 guest with an
> > IOMMU-bypassing virtio device and assigns it to L2 using vfio, then
> > *boom* L1 crashes. (Same if, say, DPDK gets used, I think.)
> >
> > > The only way out of this while keeping the "platform" stuff would
> > > be to also bump some kind of version in the virtio config (or PCI
> > > header). I have no other way to differentiate between "this is an
> > > old qemu that doesn't do the 'bypass property' yet" and "this is
> > > a virtio device that doesn't bypass".
> > >
> > > Any better idea ?
> >
> > I'd suggest that, in the absence of the new DT binding, we assume
> > that any PCI device with the virtio vendor ID is passthrough on
> > powerpc. I can do this in the virtio driver, but if it's in the
> > platform code then vfio gets it right too (i.e. fails to load).
>
> The problem is there isn't *a* virtio vendor ID. It's the RedHat
> vendor ID which will be used by more than just virtio, so we need to
> specifically list the devices.
>
> Additionally, that still means that once we have a virtio device that
> actually uses the iommu, powerpc will not work since the "workaround"
> above will kick in.
>
> The "in absence of the new DT binding" doesn't make that much sense.
>
> Those platforms use device-trees defined since the dawn of ages by
> actual open firmware implementations, they either have no iommu
> representation in there (Macs, the platform code hooks it all up) or
> have various properties related to the iommu but no concept of
> "bypass" in there.
>
> We can *add* a new property under some circumstances that indicates a
> bypass on a per-device basis, however that doesn't completely solve
> it:
>
>  - As I said above, what does the absence of that property mean ? An
>    old qemu that does bypass on all virtio or a new qemu trying to
>    tell you that the virtio device actually does use the iommu (or
>    some other environment that isn't qemu) ?
>
>  - On things like macs, the device-tree is generated by openbios, it
>    would have to have some added logic to try to figure that out,
>    which means it needs to know *via different means* that some or
>    all virtio devices bypass the iommu.
>
> I thus go back to my original statement, it's a LOT easier to handle
> if the device itself is self describing, indicating whether it is set
> to bypass a host iommu or not. For L1->L2, well, that wouldn't be the
> first time qemu/VFIO plays tricks with the passed through device
> configuration space...
>
> Note that the above can be solved via some kind of compromise: The
> device self describes the ability to honor the iommu, along with the
> property (or ACPI table entry) that indicates whether or not it does.
>
> I.e. we could use the revision or ProgIf field of the config space for
> example. Or something in virtio config. If it's an "old" device, we
> know it always bypasses. If it's a new device, we know it only
> bypasses if the corresponding property is in. I still would have to
> sort out the openbios case for mac among others but it's at least a
> workable direction.
>
> BTW. Don't you have a similar problem on x86 that today qemu claims
> that everything honors the iommu in ACPI ?
>
> Unless somebody can come up with a better idea...

Can something be done by means of PCIe capabilities?
ATS (Address Translation Services) seems like a natural choice?

Knut

> Cheers,
> Ben.
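For illustration only, the compromise Ben describes above (old devices
always bypass, new devices bypass only when the platform property is
present) might be sketched roughly as below. Everything here is
hypothetical: the struct, the function name, and the revision cutoff are
made up for the example and are not existing kernel or qemu interfaces.
It also assumes the PCI revision ID is the field chosen to mark a "new"
device, which is only one of the options mentioned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff: revisions below this are "old" devices that
 * predate any bypass property. The value 1 is an assumption. */
#define HYP_VIRTIO_REV_SELF_DESCRIBING 1

/* Hypothetical summary of what the platform code knows about a device. */
struct hyp_virtio_dev {
	unsigned char pci_revision;  /* revision ID from PCI config space */
	bool has_bypass_property;    /* DT property / ACPI entry present */
};

/* Returns true if the device should be treated as bypassing the iommu. */
static bool hyp_virtio_bypasses_iommu(const struct hyp_virtio_dev *dev)
{
	assert(dev != NULL);

	/* Old device: it cannot describe itself, so it always bypasses. */
	if (dev->pci_revision < HYP_VIRTIO_REV_SELF_DESCRIBING)
		return true;

	/* New device: it bypasses only if the platform says so. */
	return dev->has_bypass_property;
}
```

With this shape, the absence of the property is no longer ambiguous: on
an old device it means nothing, and on a new device it means the iommu
is honored.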
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html