On Wed, Aug 08, 2018 at 08:07:49PM +1000, Benjamin Herrenschmidt wrote:
> Qemu virtio bypasses that iommu when the VIRTIO_F_IOMMU_PLATFORM flag
> is not set (default) but there's nothing in the device-tree to tell the
> guest about this since it's a violation of our pseries architecture, so
> we just rely on Linux virtio "knowing" that it happens. It's a bit
> yucky but that's now history...

That is ugly as hell, but it is how virtio works everywhere, so nothing
special so far.

> Essentially pseries "architecturally" does not have the concept of not
> having an iommu in the way and qemu violates that architecture today.
>
> (Remember it comes from pHyp, our priorietary HV, which we are somewhat
> mimmicing here).

It shouldn't be too hard to have a dt property that communicates this,
should it?

> So if we always set VIRTIO_F_IOMMU_PLATFORM, it *will* force all virtio
> through that iommu and performance will suffer (esp vhost I suspect),
> especially since adding/removing translations in the iommu is a
> hypercall.

Well, we'd need to make sure that for this particular bus we skip the
actual iommu.

> > It would not be the same effect. The problem with that is that you must
> > now assumes that your qemu knows that for example you might be passing
> > a dma offset if the bus otherwise requires it.
>
> I would assume that arch_virtio_wants_dma_ops() only returns true when
> no such offsets are involved, at least in our case that would be what
> happens.

That would work, but we're really piling hacks on top of hacks here.

> > Or in other words:
> > you potentially break the contract between qemu and the guest of always
> > passing down physical addresses. If we explicitly change that contract
> > through using a flag that says you pass bus address everything is fine.
>
> For us a "bus address" is behind the iommu so that's what
> VIRTIO_F_IOMMU_PLATFORM does already. We don't have the concept of a
> bus address that is different. I suppose it's an ARMism to have DMA
> offsets that are separate from iommus ?

No, a lot of platforms support a bus address that has an offset from
the physical address, including a lot of power platforms:

arch/powerpc/kernel/pci-common.c: set_dma_offset(&dev->dev, PCI_DRAM_OFFSET);
arch/powerpc/platforms/cell/iommu.c: set_dma_offset(dev, cell_dma_nommu_offset);
arch/powerpc/platforms/cell/iommu.c: set_dma_offset(dev, addr);
arch/powerpc/platforms/powernv/pci-ioda.c: set_dma_offset(&pdev->dev, pe->tce_bypass_base);
arch/powerpc/platforms/powernv/pci-ioda.c: set_dma_offset(&pdev->dev, (1ULL << 32));
arch/powerpc/platforms/powernv/pci-ioda.c: set_dma_offset(&dev->dev, pe->tce_bypass_base);
arch/powerpc/platforms/pseries/iommu.c: set_dma_offset(dev, dma_offset);
arch/powerpc/sysdev/dart_iommu.c: set_dma_offset(&dev->dev, DART_U4_BYPASS_BASE);
arch/powerpc/sysdev/fsl_pci.c: set_dma_offset(dev, pci64_dma_offset);

To make things worse, some platforms (at least on arm/arm64/mips/x86) can
also require additional banking, where it isn't even a single linear map
but multiple windows.
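
To illustrate the difference between a single DMA offset and banked
windows, here is a rough standalone sketch. It is not kernel code: the
struct dma_window type and the phys_to_bus() helper are made up for this
mail, and the window values in the usage note are only examples. A
single-entry table models the plain "bus address = physical address +
offset" case, and a table with more entries models banking:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical description of one DMA window: a linear mapping from a
 * CPU physical range to the range the device sees on the bus.
 */
struct dma_window {
	uint64_t phys_base;	/* start of the range in CPU physical space */
	uint64_t bus_base;	/* start of the same range as seen on the bus */
	uint64_t size;		/* length of the window in bytes */
};

/*
 * Translate a physical address to a bus address using a table of
 * windows.  Returns false if no window covers the address, in which
 * case the device cannot reach that memory directly.
 */
static bool phys_to_bus(const struct dma_window *win, size_t nr_windows,
			uint64_t phys, uint64_t *bus)
{
	for (size_t i = 0; i < nr_windows; i++) {
		/* phys >= base is checked first, so the subtraction cannot wrap */
		if (phys >= win[i].phys_base &&
		    phys - win[i].phys_base < win[i].size) {
			*bus = win[i].bus_base + (phys - win[i].phys_base);
			return true;
		}
	}
	return false;
}
```

With a single window { 0, 1ULL << 32, 1ULL << 30 }, physical address
0x1000 comes out as bus address (1ULL << 32) + 0x1000. That is exactly
the kind of translation a guest that always passes down physical
addresses would get wrong.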