[sorry, thought I replied on this thread already but my wifi is flakey]

On Wed, Aug 22, 2018 at 04:06:09PM -0400, Sinan Kaya wrote:
> On 8/22/2018 3:56 PM, Mikulas Patocka wrote:
> >
> >
> >On Wed, 22 Aug 2018, Sinan Kaya wrote:
> >
> >>On 8/22/2018 1:47 PM, Mikulas Patocka wrote:
> >>>If ARM guarantees that the accesses to a given device are not reordered -
> >>>then the barriers in readl and writel are superfluous.
> >>
> >>It is not. ARM only guarantees ordering of read/write transactions targeting
> >>a device, not memory.
> >>
> >>example:
> >>
> >>write memory
> >>raw write to device
> >>
> >>or
> >>
> >>raw read from device
> >>read memory
> >>
> >>these can bypass each other on ARM unless a barrier is placed in the right
> >>place, either via readl()/writel() or explicitly.
> >
> >Yes - but - why does Linux insert the barriers into readl() and writel()
> >instead of inserting them between accesses to registers and memory?
> >
> >A lot of drivers have long sequences of accesses to memory-mapped
> >registers with no interleaving accesses to coherent memory and these
> >implicit barriers slow them down with no gain at all.
>
> It is an abstraction issue. The majority of drivers are developed against
> x86 and the developers have no idea about the implications of weakly
> ordered architectures.

Right, and Torvalds was very clear that readX/writeX must follow the x86
semantics here.

> Now, Will Deacon added new primitives to address your concern. There are
> new APIs such as readl_relaxed() and writel_relaxed(), as opposed to
> readl() and writel().
>
> Relaxed versions still guarantee ordering of register accesses with respect
> to each other, but give no guarantee with respect to memory. Relaxed
> versions could be used in performance-critical paths.

Yes, and the heavy ordering requirements of plain readX/writeX were exactly
what motivated the addition of the _relaxed forms.

Will
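
To make the cost argument above concrete, here is a minimal userspace sketch, not kernel code. The mock_writel() and mock_writel_relaxed() helpers, the fake register and the descriptor array are all hypothetical stand-ins, and atomic_thread_fence() only models where a barrier would sit (assumed here to be before the store). It counts how many barriers are issued when a run of register writes uses the ordered accessor versus the relaxed one, with a single ordered write at the end to publish the memory descriptor.

```c
/* Userspace model only. The mock_* names are hypothetical stand-ins for
 * the kernel's writel()/writel_relaxed(); nothing here touches hardware. */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static volatile uint32_t fake_doorbell; /* pretend device register */
static uint32_t dma_desc[4];            /* pretend coherent memory */
static unsigned fence_count;            /* barriers issued so far */

/* Relaxed form: plain store, ordered only against the same device. */
static void mock_writel_relaxed(uint32_t val, volatile uint32_t *reg)
{
    *reg = val;
}

/* Ordered form: barrier first (assumption), then the relaxed store. */
static void mock_writel(uint32_t val, volatile uint32_t *reg)
{
    atomic_thread_fence(memory_order_seq_cst); /* stand-in for the barrier */
    fence_count++;
    mock_writel_relaxed(val, reg);
}

/* Fill a descriptor in memory, program n registers, ring the doorbell.
 * Returns the number of barriers the sequence issued. */
static unsigned submit(volatile uint32_t *regs, unsigned n, int use_relaxed)
{
    assert(regs != 0);
    fence_count = 0;
    dma_desc[0] = 0xabcd;                 /* "write memory" */
    for (unsigned i = 0; i < n; i++) {
        if (use_relaxed)
            mock_writel_relaxed(i, &regs[i]);
        else
            mock_writel(i, &regs[i]);
    }
    /* The final write must be ordered after the dma_desc store, so it
     * uses the ordered accessor in both variants. */
    mock_writel(1, &fake_doorbell);
    return fence_count;
}
```

Calling submit(regs, 8, 0) on an eight-entry register array issues one barrier per write, while submit(regs, 8, 1) issues only the one that orders the doorbell against the descriptor store. That is the trade-off in the thread: the relaxed forms are safe for the register-only run, and the ordered form is kept where memory and device accesses meet.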