On 21/08/18 05:45 PM, Eric Pilmore wrote:
> Well, the only difference between success and failure is running with the
> call to dma_map_resource for the destination address, which is a PCI BAR
> address. Prior to Kit introducing this call, we never created a mapping
> for the destination PCI BAR address and it worked fine on all systems
> when using PLX DMA. It was only when we went to a Xeon system and
> attempted to use IOAT DMA that we found we needed a mapping for that
> destination PCI BAR address.
>
> The only thing the PLX driver does related to "mappings" is a call to
> dma_descriptor_unmap when the descriptor is freed; however, that is more
> of an administrative step to clean up the unmap-data structure used when
> the mapping was originally established.

Ah, so there's a big difference here in hardware. Without the mapping call
you are essentially doing a P2P transaction, so the TLPs won't even hit
the CPU. With the mapping call, the TLPs will now go through the CPU and
the IOMMU.

CPUs don't always have good support for routing PCI P2P transactions.
However, I would have expected a relatively new i7 CPU to support it well.
It may simply be that this CPU does not have good support, though that
comes as a bit of a surprise to me. Our plan for the P2P patch set was to
have a white-list of CPUs that work.

If you can, it would be worth hooking up a PCI analyzer to see what's
happening to the TLPs in both cases.

Logan
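
P.S. For anyone following along, the mapping call under discussion looks
roughly like this (a sketch only — "dma_dev", "bar_phys", and "len" are
placeholder names, not from the actual driver):

```c
#include <linux/dma-mapping.h>

/*
 * Map the destination PCI BAR for use by the DMA engine. With this
 * mapping in place the destination address goes through the IOMMU, so
 * the TLPs are routed up through the CPU instead of peer-to-peer.
 *
 * dma_dev  - the DMA engine's struct device (hypothetical name)
 * bar_phys - physical address of the destination PCI BAR
 * len      - size of the region to map
 */
static dma_addr_t map_dest_bar(struct device *dma_dev,
			       phys_addr_t bar_phys, size_t len)
{
	dma_addr_t dst;

	dst = dma_map_resource(dma_dev, bar_phys, len,
			       DMA_FROM_DEVICE, 0);
	if (dma_mapping_error(dma_dev, dst))
		return DMA_MAPPING_ERROR;

	/* Skipping this call (and programming the raw BAR address into
	 * the descriptor instead) is what yields the direct P2P path. */
	return dst;
}
```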