Hi Bjorn,

On Mon, Dec 19, 2016 at 03:20:44PM -0600, Bjorn Helgaas wrote:
> I have some questions about dmar_init_reserved_ranges().  On systems
> where CPU physical address space is not identity-mapped to PCI bus
> address space, e.g., where the PCI host bridge windows have _TRA
> offsets, I'm not sure we're doing the right thing.
>
> Assume we have a PCI host bridge with _TRA that maps CPU addresses
> 0x80000000-0x9fffffff to PCI bus addresses 0x00000000-0x1fffffff, with
> two PCI devices below it:
>
>   PCI host bridge domain 0000 [bus 00-3f]
>   PCI host bridge window [mem 0x80000000-0x9fffffff] (bus 0x00000000-0x1fffffff)
>   00:00.0: BAR 0 [mem 0x80000000-0x8fffffff] (0x00000000-0x0fffffff on bus)
>   00:01.0: BAR 0 [mem 0x90000000-0x9fffffff] (0x10000000-0x1fffffff on bus)
>
> The IOMMU init code in dmar_init_reserved_ranges() reserves the PCI
> MMIO space for all devices:
>
>   pci_iommu_init()
>     intel_iommu_init()
>       dmar_init_reserved_ranges()
>         reserve_iova(0x80000000-0x8fffffff)
>         reserve_iova(0x90000000-0x9fffffff)
>
> This looks odd because we're reserving CPU physical addresses, but
> the IOVA space contains *PCI bus* addresses.  On most x86 systems they
> would be the same, but not on all.

Interesting, I wasn't aware of that. It looks like we are not doing the
right thing in dmar_init_reserved_ranges(). How is that handled without
an IOMMU, when the bus addresses overlap with RAM addresses?

> Assume the driver for 00:00.0 maps a page of main memory for DMA.  It
> may receive a dma_addr_t of 0x10000000:
>
>   00:00.0: intel_map_page() returns dma_addr_t 0x10000000
>   00:00.0: issues DMA to 0x10000000
>
> What happens here?  The DMA access should go to main memory.  In
> conventional PCI it would be a peer-to-peer access to device 00:01.0.
> Is there enough PCIe smarts (ACS or something?) to do otherwise?

If there is a bridge doing ACS between the devices, the IOMMU will see
the request and re-map it to its RAM address.
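
To make the address arithmetic in the example concrete, here is a minimal
userspace sketch. It is not kernel code: struct bridge_window, struct range,
cpu_to_bus() and is_reserved() are hypothetical names made up for
illustration, and cpu_to_bus() only loosely models the offset subtraction
that pcibios_resource_to_bus() performs for a resource inside a host bridge
window. It assumes a single window with one fixed offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one host bridge window with a _TRA-style offset,
 * where bus address = CPU address - offset. */
struct bridge_window {
	uint64_t cpu_start;
	uint64_t cpu_end;	/* inclusive */
	uint64_t offset;	/* CPU address minus bus address */
};

/* An inclusive address range, either CPU-side or bus-side. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Translate a CPU range to a bus range if it lies inside the window;
 * ranges outside the window are returned unchanged. */
static struct range cpu_to_bus(const struct bridge_window *win,
			       struct range cpu)
{
	struct range bus = cpu;

	assert(cpu.start <= cpu.end);
	if (cpu.start >= win->cpu_start && cpu.end <= win->cpu_end) {
		bus.start = cpu.start - win->offset;
		bus.end = cpu.end - win->offset;
	}
	return bus;
}

/* Return true if addr falls inside any of the n reserved ranges. */
static bool is_reserved(const struct range *reserved, size_t n,
			uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= reserved[i].start && addr <= reserved[i].end)
			return true;
	return false;
}
```

With the window from the example (CPU 0x80000000-0x9fffffff, offset
0x80000000), reserving the two BARs by their CPU addresses leaves bus
address 0x10000000 unreserved, so it could be handed out as a dma_addr_t.
Reserving the ranges returned by cpu_to_bus() instead covers
0x00000000-0x1fffffff, which includes 0x10000000.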

> The dmar_init_reserved_ranges() comment says "Reserve all PCI MMIO to
> avoid peer-to-peer access."  Without _TRA, CPU addresses and PCI bus
> addresses would be identical, and I think these reserve_iova() calls
> *would* prevent this situation.  So maybe we're just missing a
> pcibios_resource_to_bus() here?

I'll have a look. The AMD IOMMU driver implements this too, so it also
needs to be fixed there. Do you know which x86 systems are configured
like this?


	Joerg