On 2021/1/29 18:33, Russell King - ARM Linux admin wrote: > On Fri, Jan 29, 2021 at 11:26:38AM +0100, Arnd Bergmann wrote: >> Another clarification, as there are actually two independent >> points here: >> >> * if you can completely remove the readl() above and just write a >> hardcoded value into the register, or perhaps read the original >> value once at boot time, that is probably a win because it >> avoids one of the barriers in the beginning. The datasheet should >> tell you if there are any bits in the register that have to be >> preserved >> >> * Regarding the _relaxed() accessors, it's a lot harder to know >> whether that is safe, as you first have to show, in particular in case >> any of the accesses stop being guarded by the spinlock in that >> case, and whether there may be a case where you have to >> serialize the memory access against accesses that are still in the >> store queue or prefetched. >> >> Whether this matters at all depends mostly on the type of devices >> you are driving on your SoC. If you have any high-speed network >> interfaces that are unable to do cache coherent DMA, any extra >> instruction here may impact the number of packets you can transfer, >> but if all your high-speed devices are connected to a coherent >> interconnect, I would just go with the obvious approach and use >> the safe MMIO accessors everywhere. > > For L2 cache code, I would say the opposite, actually, because it is > all too easy to get into a deadlock otherwise. > > If you implement the sync callback, that will be called from every > non-relaxed accessor, which means if you need to take some kind of > lock in the sync callback and elsewhere in the L2 cache code, you will > definitely deadlock. > > It is safer to put explicit barriers where it is necessary. > > Also remember that the barrier in readl() etc is _after_ the read, not > before, and the barrier in writel() is _before_ the write, not after. > The point is to ensure that DMA memory accesses are properly ordered > with the IO-accessing instructions. Yes, I known it. writel() must be used for the write operations that control "start/stop" or "enable/disable" function, to ensure that the data of previous write operations reaches the target. I've met this kind of problem before. > > So, using readl_relaxed() with a read-modify-write is entirely sensible > provided you do not access DMA memory inbetween. Actually, I don't think this register is that complicated. I copied the code back below. All the bits of L3_MAINT_CTRL are not affected by DMA access operations. The software change the "range | op_type" to specify the operation type and scope, the set the bit "L3_MAINT_STATUS_START" to start the operation. Then wait for that bit to change from 1 to 0 by hardware. + reg = readl(l3_ctrl_base + L3_MAINT_CTRL); + reg &= ~(L3_MAINT_RANGE_MASK | L3_MAINT_TYPE_MASK); + reg |= range | op_type; + reg |= L3_MAINT_STATUS_START; + writel(reg, l3_ctrl_base + L3_MAINT_CTRL); + + /* Wait until the hardware maintenance operation is complete. */ + do { + cpu_relax(); + reg = readl(l3_ctrl_base + L3_MAINT_CTRL); + } while ((reg & L3_MAINT_STATUS_MASK) != L3_MAINT_STATUS_END); >