On Fri, Sep 11, 2020 at 10:39:16AM +1000, Benjamin Herrenschmidt wrote:

> Yes, "write combine" isn't a good name.... The goal is to get WC but it
> comes with the whole package on several archs. We don't even have a
> reasonable definition of the semantics of readl/writel on a WC mapping
> (hint: on powerpc the barriers in them will prevent WC even on a WC
> mapping) nor of what barriers might work and how on such a mapping.

Yes, you can't really use WC properly in the kernel, we don't have the
infrastructure for it. mlx5 is using __raw_writeq() and wmb() to hack
something ugly together in the kernel.

A useful API for the message method, similar to what we use in
userspace, is something like:

	/*
	 * Almost always need a spinlock as multiple CPUs cannot write
	 * concurrently.
	 */
	spin_lock();

	/*
	 * Ensure that all DMA visible writes in program order are visible
	 * to DMA before the WC TLP is sent.
	 */
	barrier_wc_after_lock();

	/* Generate the TLP */
	write_wc_message(wc_iomem, message, len);

	/*
	 * Writes to wc_iomem past this, by any CPU, cannot replace writes
	 * already done in wc_message.
	 */
	barrier_wc_before_unlock();
	spin_unlock();

And another variant without the spinlock for stuff that can be per-CPU
for a range of WC memory.

(oh actually I see most drivers are using ioremap_wc(), and there is a
bunch of them including an Amazon ethernet device...)

Jason