On Wed, Jun 23, 2021 at 10:57:35AM +0200, Christian König wrote:

> > > No it isn't. It makes devices depend on allocating struct pages for their
> > > BARs which is not necessary nor desired.
> > Which dramatically reduces the cost of establishing DMA mappings, a
> > loop of dma_map_resource() is very expensive.
>
> Yeah, but that is perfectly ok. Our BAR allocations are either in chunks of
> at least 2MiB or only a single 4KiB page.

And very small apparently.

> > > Allocating a struct pages has their use case, for example for exposing VRAM
> > > as memory for HMM. But that is something very specific and should not limit
> > > PCIe P2P DMA in general.
> > Sure, but that is an ideal we are far from obtaining, and nobody wants
> > to work on it preferring to do hacky hacky like this.
> >
> > If you believe in this then remove the scatter list from dmabuf, add a
> > new set of dma_map* APIs to work on physical addresses and all the
> > other stuff needed.
>
> Yeah, that's what I totally agree on. And I actually hoped that the new P2P
> work for PCIe would go into that direction, but that didn't materialize.

It is a lot of work and the only gain is to save a bit of memory for
struct pages. Not a very big payoff.

> But allocating struct pages for PCIe BARs which are essentially registers
> and not memory is much more hacky than the dma_resource_map() approach.

It doesn't really matter. The pages are in a special zone and are only
being used as handles for the BAR memory.

> By using PCIe P2P we want to avoid the round trip to the CPU when one device
> has filled the ring buffer and another device must be woken up to process
> it.

Sure, we all have these scenarios; what is inside the memory doesn't
really matter. The mechanism is generic and the struct pages don't care
much if they point at something memory-like or at something
register-like.

They are already in big trouble because you can't portably use CPU
instructions to access them anyhow.

Jason