On 28/03/18 12:28 PM, Christian König wrote:
> I'm just using amdgpu as blueprint because I'm the co-maintainer of it
> and know it mostly inside out.

Ah, I see.

> The resource addresses are translated using dma_map_resource(). As far
> as I know that should be sufficient to offload all the architecture
> specific stuff to the DMA subsystem.

It's not. The dma_map infrastructure currently has no concept of
peer-to-peer mappings and is designed for system memory only. No
architecture I'm aware of will translate PCI CPU addresses into PCI bus
addresses, which is necessary for any transfer that doesn't go through
the root complex (though on arches like x86 the CPU and bus addresses
happen to be the same). A lot of people would like to see this change,
but it's likely going to be a long road before it does.

Furthermore, one of the reasons our patch set avoids going through the
root complex at all is that IOMMU drivers will need to be made aware
that they are operating on P2P memory and do arch-specific things
accordingly. There will also need to be flags that indicate whether a
given IOMMU driver supports this. None of this work is done or easy.

> Yeah, but not for ours. See if you want to do real peer 2 peer you need
> to keep both the operation as well as the direction into account.

Not sure what you are saying here... I'm pretty sure we are doing
"real" peer 2 peer...

> For example when you can do writes between A and B that doesn't mean
> that writes between B and A work. And reads are generally less likely
> to work than writes. etc...

If both devices are behind a switch, then the PCI spec guarantees that
A can both read and write B, and vice versa. Only once you involve root
complexes do you have this problem. I.e. you have unknown support which
may be no support, or partial support (stores but not loads); or
sometimes bad performance; or a combination of both... and you need
some way to figure out all this mess, and that is hard. Whoever tries
to implement a white list will have to sort all this out.

Logan
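
P.S. For the archives, here's a rough sketch (not code from our patch
set; the function and parameter names are made up) of what mapping a
peer device's BAR through the existing DMA API looks like, and why it
falls short for P2P:

/*
 * Hypothetical example: map a peer device's BAR with the existing
 * DMA API.  'peer_bar_phys' stands in for the CPU physical address
 * of the peer's BAR.
 */
#include <linux/dma-mapping.h>

static dma_addr_t map_peer_bar(struct device *dev,
                               phys_addr_t peer_bar_phys, size_t size)
{
        dma_addr_t addr;

        /*
         * dma_map_resource() sets up a mapping suitable for a
         * transaction routed through the root complex (programming
         * the IOMMU if one is present).  It does NOT translate the
         * CPU physical address into the PCI bus address a switch
         * would route, so on arches where the two differ the
         * returned address is wrong for a true peer-to-peer
         * transfer.
         */
        addr = dma_map_resource(dev, peer_bar_phys, size,
                                DMA_BIDIRECTIONAL, 0);
        if (dma_mapping_error(dev, addr))
                return 0;       /* treated as failure in this sketch */

        return addr;
}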