On Tue, Aug 8, 2023, at 09:03, mani wrote:
> On Mon, Aug 07, 2023 at 08:28:30PM +0800, Li Chen wrote:
>>
>> Currently, the EPF's bar is allocated by pci_epf_alloc_space, which
>> internally uses dma_alloc_coherent and the caching behavior of
>> dma_alloc_coherent may vary depending on platforms.
>>
>> The bar space is exported to RC, which means that RC may modify it
>> without EP being aware of it, so EP still read from the cache and get
>> stalled data. To address this issue, the bar space should be treated
>> as iomem instead and forced to use io read/write APIs, which enforces
>> volatile.
>>
>
> We already had a similar discussion on using volatile for BAR space and settled
> with {WRITE/READ}_ONCE macros in EPF Test driver [1].
>
> Since the BAR space allocated in endpoint is not a MMIO, I don't think it should
> be forced as iomem. Rather EPF drivers should use _ONCE macros to access the
> fields to avoid coherency issues as suggested by Arnd earlier.

Using readl/writel is clearly the wrong solution here as I explained
before, but I assume that Li Chen is dealing with a real problem.

If the cache is coherent with the device, then reading from the cache
is clearly the right thing to do, but the mentioned "stall" problem
may be related to the store buffers, where a dma_wmb() after the
WRITE_ONCE() is missing. Similarly, a dma_rmb() might be missing
before a READ_ONCE() to prevent prefetching during out-of-order
execution.

With readl()/writel(), you already get very heavy barriers, so it
may end up working by accident, but these barriers are on the other
side of the access (before writel and after readl) and may be the
wrong type of barrier depending on the CPU.

      Arnd
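
For illustration, the barrier placement described above might look roughly like the sketch below. It is a user-space approximation, not kernel code: WRITE_ONCE(), READ_ONCE(), dma_wmb() and dma_rmb() are redefined here as stand-ins (volatile accesses and compiler/CPU fences) so that the example builds outside the kernel, and struct epf_regs, STATUS_DONE, ep_publish() and ep_poll_command() are hypothetical names that do not come from any existing EPF driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins (assumptions) for the kernel macros, for user-space builds only. */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))
#define dma_wmb()          __atomic_thread_fence(__ATOMIC_RELEASE)
#define dma_rmb()          __atomic_thread_fence(__ATOMIC_ACQUIRE)

#define STATUS_DONE 1u	/* hypothetical status value */

/* Hypothetical layout of the fields the EPF driver keeps in BAR memory. */
struct epf_regs {
	uint32_t command;	/* written by the RC, read by the EP */
	uint32_t status;	/* written by the EP, read by the RC */
	uint32_t data;		/* payload, written by either side */
};

/* EP side: publish a result so that the RC sees the data before the status. */
static void ep_publish(struct epf_regs *regs, uint32_t value)
{
	assert(regs != NULL);

	WRITE_ONCE(regs->data, value);
	dma_wmb();		/* barrier after the WRITE_ONCE() of the payload */
	WRITE_ONCE(regs->status, STATUS_DONE);
}

/*
 * EP side: check for a command from the RC. Returns the command, or 0 if
 * none is pending. The payload is read only after the barrier, so it cannot
 * be fetched speculatively before the command word has been observed.
 */
static uint32_t ep_poll_command(struct epf_regs *regs, uint32_t *arg)
{
	uint32_t cmd = READ_ONCE(regs->command);

	if (cmd == 0)
		return 0;

	dma_rmb();		/* barrier before the READ_ONCE() of the payload */
	*arg = READ_ONCE(regs->data);
	return cmd;
}
```

Compared with readl()/writel(), the barriers here sit between the flag access and the payload access, which is the ordering that matters for memory shared with the RC; whether these are the right barriers for a given platform remains an assumption of this sketch.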