On 09/11/16 18:59, Don Dutile wrote:
> On 11/09/2016 12:03 PM, Will Deacon wrote:
>> On Tue, Nov 08, 2016 at 09:52:33PM -0500, Don Dutile wrote:
>>> On 11/08/2016 06:35 PM, Alex Williamson wrote:
>>>> On Tue, 8 Nov 2016 21:29:22 +0100
>>>> Christoffer Dall <christoffer.dall@xxxxxxxxxx> wrote:
>>>>> Is my understanding correct, that you need to tell userspace about the
>>>>> location of the doorbell (in the IOVA space) in case (2), because even
>>>>> though the configuration of the device is handled by the (host) kernel
>>>>> through trapping of the BARs, we have to avoid the VFIO user programming
>>>>> the device to create other DMA transactions to this particular address,
>>>>> since that will obviously conflict and either not produce the desired
>>>>> DMA transactions or result in unintended weird interrupts?
>>
>> Yes, that's the crux of the issue.
>>
>>>> Correct, if the MSI doorbell IOVA range overlaps RAM in the VM, then
>>>> it's potentially a DMA target and we'll get bogus data on DMA read from
>>>> the device, and lose data and potentially trigger spurious interrupts on
>>>> DMA write from the device. Thanks,
>>>>
>>> That's b/c the MSI doorbells are not positioned *above* the SMMU, i.e.,
>>> they address match before the SMMU checks are done. if
>>> all DMA addrs had to go through SMMU first, then the DMA access could
>>> be ignored/rejected.
>>
>> That's actually not true :( The SMMU can't generally distinguish between MSI
>> writes and DMA writes, so it would just see a write transaction to the
>> doorbell address, regardless of how it was generated by the endpoint.
>>
>> Will
>>
> So, we have real systems where MSI doorbells are placed at the same IOVA
> that could have memory for a guest, but not at the same IOVA as memory
> on real hw ?

MSI doorbells integral to PCIe root complexes (and thus untranslatable)
typically have a programmable address, so could be anywhere.
In the more general category of "special hardware addresses", QEMU's
default ARM guest memory map puts RAM starting at 0x40000000; on the ARM
Juno platform, that happens to be where PCI config space starts; as
Juno's PCIe doesn't support ACS, peer-to-peer or anything clever, if you
assign the PCI bus to a guest (all of it, given the lack of ACS), the
root complex just sees the guest's attempts to DMA to "memory" as the
device attempting to access config space and aborts them.

> How are memory holes passed to SMMU so it doesn't have this issue for
> bare-metal
> (assign an IOVA that overlaps an MSI doorbell address)?

When we *are* in full control of the IOVA space, we just carve out what
we can find as best we can - see iova_reserve_pci_windows() in
dma-iommu.c, which isn't really all that different to what x86 does
(e.g. init_reserved_iova_ranges() in amd-iommu.c). Note that we don't
actually have any way currently to discover upstream MSI doorbells
(ponder dw_pcie_msi_init() in pcie-designware.c for an example of the
problem) - the specific MSI support we have in DMA ops at the moment
only covers GICv2m or GICv3 ITS downstream of translation, but
fortunately that's the typical relevant use-case on current platforms.

Robin.
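
The collision discussed above (a guest RAM window overlapping a "special"
hardware address range such as an MSI doorbell or PCI config space) boils
down to an interval-overlap check. The following is only a minimal
userspace sketch of that check, not kernel code: struct addr_range,
ranges_overlap() and first_conflict() are hypothetical names made up for
illustration and do not correspond to anything in dma-iommu.c or VFIO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, start + size). Hypothetical type. */
struct addr_range {
    uint64_t start;
    uint64_t size;
};

/* True if the two ranges share at least one address; empty ranges never do. */
static bool ranges_overlap(struct addr_range a, struct addr_range b)
{
    if (a.size == 0 || b.size == 0)
        return false;
    /* Compare offsets rather than end addresses so start + size cannot wrap. */
    if (a.start <= b.start)
        return b.start - a.start < a.size;
    return a.start - b.start < b.size;
}

/*
 * Return the index of the first reserved range (doorbell, config space, ...)
 * that collides with the guest RAM window, or -1 if there is none.
 */
static int first_conflict(struct addr_range ram,
                          const struct addr_range *reserved, size_t n)
{
    assert(reserved != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ranges_overlap(ram, reserved[i]))
            return (int)i;
    }
    return -1;
}
```

With guest RAM based at 0x40000000 and a reserved window starting at the
same address (the Juno config space case above; any sizes plugged in are
assumptions), first_conflict() reports that window, which is the kind of
range that would have to be carved out of, or reported for, the IOVA space.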