On Thu, Sep 15, 2022, at 4:18 PM, Parav Pandit wrote: >> From: Arnd Bergmann <arnd@xxxxxxxx> >> Sent: Thursday, September 15, 2022 8:38 AM >> >> On Thu, Sep 15, 2022, at 7:01 AM, Parav Pandit wrote: >> > The cited commit [1] describes that when using writel(), explcit wmb() >> > is not needed. However, it should have said that dma_wmb() is not >> > needed. >> >> Are you sure? As I understand it, the dma_wmb() only serializes a set of >> memory accesses, but does not serialized against an MMIO access, which >> depending on the CPU architecture may require a different type of barrier. >> >> E.g. on arm, writel() uses __iowmb(), which like wmb() is defined as "dsb(x); >> arm_heavy_mb();", while dma_wmb() is a "dmb(oshst)". > > You are right, on arm heavy barrier dsb() is needed, while on arm64, > dmb(oshst) is sufficient. > > So more accurate documentation is to say that > 'when using writel() a prior IO barrier is not needed ...' > > How about that? That's probably fine, not sure if it's worth changing. > It started with my cleanup efforts to two drivers [1] and [2] that had > difficulty in using writel() on 32-bit system, and it ended up open > coding writel() as wmb() + mlx5_write64(). > > I am cleaning up the repetitive pattern of, > wmb(); > mlx5_write64() > > Before I fix drivers, I thought to improve the documentation that I can > follow. :) Right, that is definitely a good idea. I see that there is more going on with that function, at least the loop in post_send_nop() probably just wants to use __iowrite64_copy(), but that also has no barrier in it, while changing mlx5_write64() to use iowrite64be() or similar would of course add excessive barriers inside of the loop. Arnd