On Mon, Mar 18, 2024, at 05:30, Manivannan Sadhasivam wrote:
> On Fri, Mar 15, 2024 at 06:29:52PM +0100, Arnd Bergmann wrote:
>> On Fri, Mar 15, 2024, at 07:44, Manivannan Sadhasivam wrote:
>> > On Wed, Mar 13, 2024 at 11:58:01AM +0100, Niklas Cassel wrote:
>
> But I'm not sure I got the answer I was looking for. So let me rephrase my
> question a bit.
>
> For BAR memory, PCIe spec states that,
>
> 'A PCI Express Function requesting Memory Space through a BAR must set the BAR's
> Prefetchable bit unless the range contains locations with read side effects or
> locations in which the Function does not tolerate write merging'
>
> So here, spec refers the backing memory allocated on the endpoint side as the
> 'range' i.e, the BAR memory allocated on the host that gets mapped on the
> endpoint.
>
> Currently on the endpoint side, we use dma_alloc_coherent() to allocate the
> memory for each BAR and map it using iATU.
>
> So I want to know if the memory range allocated in the endpoint through
> dma_alloc_coherent() satisfies the above two conditions in PCIe spec on all
> architectures:
>
> 1. No Read side effects
> 2. Tolerates write merging
>
> I believe the reason why we are allocating the coherent memory on the endpoint
> first up is not all PCIe controllers are DMA coherent as you said above.

As far as I can tell, we never have read side effects for memory
backed BARs, but the write merging is something that depends on
how the memory is used: If you have anything in that memory
that relies on ordering, you probably want to map it as coherent
on the endpoint side, and non-prefetchable on the host controller
side, and then use the normal rmb()/wmb() barriers on both ends
between serialized accesses. An example of this would be having
blocks of data separate from metadata that says whether the
data is valid.
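
To make the data-plus-metadata case concrete, here is a rough user-space
sketch of the ordering pattern. It is only an analogy and not endpoint
driver code: C11 fences stand in for wmb()/rmb(), and the struct layout
and function names (shared_region, producer_publish, consumer_read) are
made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a payload block plus metadata saying it is valid. */
struct shared_region {
	uint8_t data[64];
	atomic_uint valid;	/* nonzero once data[] is complete */
};

/* Producer: write the payload first, then publish the valid flag. */
static void producer_publish(struct shared_region *r,
			     const uint8_t *src, size_t len)
{
	assert(len <= sizeof(r->data));
	memcpy(r->data, src, len);
	/* Stand-in for wmb(): payload stores are ordered before the flag store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->valid, 1, memory_order_relaxed);
}

/* Consumer: check the flag first, then read the payload.
 * Returns 1 if the payload was valid and copied to dst, 0 otherwise. */
static int consumer_read(struct shared_region *r, uint8_t *dst, size_t len)
{
	assert(len <= sizeof(r->data));
	if (!atomic_load_explicit(&r->valid, memory_order_relaxed))
		return 0;
	/* Stand-in for rmb(): the flag load is ordered before the payload loads. */
	atomic_thread_fence(memory_order_acquire);
	memcpy(dst, r->data, len);
	return 1;
}
```

If the payload store and the flag store could be merged or reordered on
the way, the consumer might see valid set before the data has landed,
which is the situation where a non-prefetchable mapping plus barriers
matters.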
If you don't care about ordering on that level, I would use
dma_map_sg() on the endpoint side and prefetchable mapping on
the host side, with the endpoint using dma_sync_*() to pass
buffer ownership between the two sides, as controlled by some
other communication method (non-prefetchable BAR, MSI, ...).

     Arnd