On Wed, Jun 23, 2021 at 11:57 AM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> On 22.06.21 at 18:05, Jason Gunthorpe wrote:
> > On Tue, Jun 22, 2021 at 05:48:10PM +0200, Christian König wrote:
> >> On 22.06.21 at 17:40, Jason Gunthorpe wrote:
> >>> On Tue, Jun 22, 2021 at 05:29:01PM +0200, Christian König wrote:
> >>>> [SNIP]
> >>>> No, absolutely not. NVidia GPUs work exactly the same way.
> >>>>
> >>>> And you have tons of similar cases in embedded and SoC systems where
> >>>> intermediate memory between devices isn't directly addressable by the CPU.
> >>> None of that is PCI P2P.
> >>>
> >>> It is all some specialty direct transfer.
> >>>
> >>> You can't reasonably call dma_map_resource() on non CPU mapped memory
> >>> for instance, what address would you pass?
> >>>
> >>> Do not confuse "I am doing transfers between two HW blocks" with PCI
> >>> Peer to Peer DMA transfers - the latter is a very narrow subcase.
> >>>
> >>>> No, just using the dma_map_resource() interface.
> >>> Ick, but yes that does "work". Logan's series is better.
> >> No it isn't. It makes devices depend on allocating struct pages for their
> >> BARs which is neither necessary nor desired.
> > Which dramatically reduces the cost of establishing DMA mappings, a
> > loop of dma_map_resource() is very expensive.
>
> Yeah, but that is perfectly ok. Our BAR allocations are either in chunks
> of at least 2MiB or only a single 4KiB page.
>
> Oded might run into more performance problems, but those DMA-buf
> mappings are usually set up only once.
>
> >> How do you prevent direct I/O on those pages for example?
> > GUP fails.
>
> At least that is calming.
>
> >> Allocating struct pages has its use case, for example for exposing VRAM
> >> as memory for HMM. But that is something very specific and should not limit
> >> PCIe P2P DMA in general.
> > Sure, but that is an ideal we are far from obtaining, and nobody wants
> > to work on it preferring to do hacky hacky like this.
> >
> > If you believe in this then remove the scatter list from dmabuf, add a
> > new set of dma_map* APIs to work on physical addresses and all the
> > other stuff needed.
>
> Yeah, that's what I totally agree on. And I actually hoped that the new
> P2P work for PCIe would go into that direction, but that didn't
> materialize.
>
> But allocating struct pages for PCIe BARs which are essentially
> registers and not memory is much more hacky than the dma_map_resource()
> approach.
>
> To re-iterate why I think that having struct pages for those BARs is a
> bad idea: Our doorbells on AMD GPUs are write and read pointers for ring
> buffers.
>
> When you write to the BAR you essentially tell the firmware that you
> have either filled the ring buffer or read a bunch of it. This in turn
> then triggers an interrupt in the hardware/firmware which may have been
> asleep.
>
> By using PCIe P2P we want to avoid the round trip to the CPU when one
> device has filled the ring buffer and another device must be woken up to
> process it.
>
> Think of it as MSI-X in reverse. Allocating struct pages for those
> BARs just to work around the shortcomings of the DMA API makes no sense
> at all to me.

We would also like to do that *in the future*.
In Gaudi it will never be supported (due to security limitations) but
I definitely see it happening in future ASICs.

Oded

>
>
> We also do have the VRAM BAR, and for HMM we do allocate struct pages
> for the address range exposed there. But this is a different use case.
>
> Regards,
> Christian.
>
> >
> > Otherwise, we have what we have and drivers don't get to opt out. This
> > is why the stuff in AMDGPU was NAK'd.
> >
> > Jason
>
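
For readers unfamiliar with the doorbell mechanism Christian describes
above, here is a minimal single-threaded userspace sketch in C. It is
only a model under stated assumptions, not AMD GPU or kernel code: the
struct, the ring size and all function names are hypothetical, and the
"doorbell write" is a plain function call standing in for an MMIO write
into a PCIe BAR. In the P2P case discussed in this thread, that write
would be issued by another device rather than by the CPU.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical ring size; must be a power of two */

/* Hypothetical model of a ring buffer shared by a producer and a consumer. */
struct ring {
    uint32_t entries[RING_SIZE];
    uint32_t wptr;    /* write pointer last published through the doorbell */
    uint32_t rptr;    /* read pointer, advanced by the consumer */
    unsigned wakeups; /* how often the consumer was woken */
    uint64_t sum;     /* stand-in for "processing": sum of the payloads */
};

/* Consumer side: models firmware being woken and draining the ring. */
static void consumer_wake(struct ring *r)
{
    r->wakeups++;
    while (r->rptr != r->wptr) {
        r->sum += r->entries[r->rptr % RING_SIZE];
        r->rptr++;
    }
}

/*
 * Models the write to the doorbell register: it publishes the new write
 * pointer and wakes the consumer. On real hardware this is a single MMIO
 * write and the wake-up happens in hardware/firmware.
 */
static void doorbell_write(struct ring *r, uint32_t new_wptr)
{
    /* The producer must never publish more than the ring can hold. */
    assert(new_wptr - r->rptr <= RING_SIZE);
    r->wptr = new_wptr;
    consumer_wake(r);
}

/*
 * Producer side: fill as many entries as fit, then ring the doorbell
 * once for the whole batch. Returns the number of entries queued.
 */
static unsigned producer_submit(struct ring *r, const uint32_t *data,
                                unsigned n)
{
    uint32_t local_wptr = r->wptr;
    unsigned queued = 0;

    while (queued < n && local_wptr - r->rptr < RING_SIZE) {
        r->entries[local_wptr % RING_SIZE] = data[queued];
        local_wptr++;
        queued++;
    }
    if (queued)
        doorbell_write(r, local_wptr);
    return queued;
}
```

Calling producer_submit() on a zero-initialised struct ring with a batch
of three entries queues all three and wakes the consumer exactly once.
The only thing that crosses between the two sides is the pointer value
written to the doorbell, which is why the thread treats such a BAR as a
register rather than as memory.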