On Sun, 24 May 2020, Maciej W. Rozycki wrote: > Hi Mikulas, > > > This patch makes barriers confiorm to the specification. > > > > 1. We add mb() before readX_relaxed and writeX_relaxed - > > memory-barriers.txt claims that these functions must be ordered w.r.t. > > each other. Alpha doesn't order them, so we need an explicit barrier. > > 2. We add mb() before reads from the I/O space - so that if there's a > > write followed by a read, there should be a barrier between them. > > > > Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx> > > Fixes: cd0e00c10672 ("alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering") > > Fixes: 92d7223a7423 ("alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering #2") > > Cc: stable@xxxxxxxxxxxxxxx # v4.17+ > > Acked-by: Ivan Kokshaysky <ink@xxxxxxxxxxxxxxxxxxxx> > > Thank you for your effort to address this regression. I have looked > through your code and the context it is to be applied to. Overall it > looks good to me, however I still have one concern as detailed below > (please accept my apologies if you find it tedious to address all the > points raised in the course of this review). > > > Index: linux-stable/arch/alpha/include/asm/io.h > > =================================================================== > > --- linux-stable.orig/arch/alpha/include/asm/io.h 2020-05-23 10:01:22.000000000 +0200 > > +++ linux-stable/arch/alpha/include/asm/io.h 2020-05-23 17:29:22.000000000 +0200 > [...] > > @@ -487,16 +501,59 @@ extern inline void writeq(u64 b, volatil > > #define outb_p outb > > #define outw_p outw > > #define outl_p outl > > -#define readb_relaxed(addr) __raw_readb(addr) > > -#define readw_relaxed(addr) __raw_readw(addr) > > -#define readl_relaxed(addr) __raw_readl(addr) > > -#define readq_relaxed(addr) __raw_readq(addr) > > -#define writeb_relaxed(b, addr) __raw_writeb(b, addr) > > -#define writew_relaxed(b, addr) __raw_writew(b, addr) > > -#define writel_relaxed(b, addr) __raw_writel(b, addr) > > -#define writeq_relaxed(b, addr) __raw_writeq(b, addr) > > > > /* > > + * The _relaxed functions must be ordered w.r.t. each other, but they don't > > + * have to be ordered w.r.t. other memory accesses. > > + */ > > +static inline u8 readb_relaxed(const volatile void __iomem *addr) > > +{ > > + mb(); > > + return __raw_readb(addr); > > +} > [etc.] > > Please observe that changing the `*_relaxed' entry points from merely > aliasing the corresponding `__raw_*' handlers to more elaborate code > sequences (though indeed slightly only, but still) makes the situation > analogous to one we have with the ordinary MMIO accessor entry points. > Those regular entry points have been made `extern inline' and wrapped > into: > > #if IO_CONCAT(__IO_PREFIX,trivial_rw_bw) == 1 > > and: > > #if IO_CONCAT(__IO_PREFIX,trivial_rw_lq) == 1 > > respectively, with corresponding out-of-line entry points available, so > that there is no extra inline code produced where the call to the relevant > MMIO accessor is going to end up with an actual function call, as this > would not help performance in any way and would expand code unnecessarily > at all call sites. > > Therefore I suggest that your new `static inline' functions follow the > pattern, perhaps by grouping them with the corresponding ordinary accessor > functions in arch/alpha/include/asm/io.h within the relevant existing > #ifdef, and then by making them `extern inline' and providing out-of-line > implementations in arch/alpha/kernel/io.c, with the individual symbols > exported. Within arch/alpha/kernel/io.c the compiler will still inline > code as it sees fit as it already does, e.g. `__raw_readq' might get > inlined in `readq' if it turns out cheaper than arranging for an actual > call, including all the stack frame preparation for `ra' preservation; > it's less likely with say `writeq' which probably always ends with a tail > call to `__raw_writeq' as no stack frame is required in that case. > > That for the read accessors. I think that making the read*_relaxed functions extern inline just causes source code bloat with no practical gain - if we make them extern inline, we would need two implementations (one in the include file, the other in the C file) - and it is not good practice to duplicate code. The functions __raw_read* are already extern inline, so the compiler will inline/noinline them depending on the macros trivial_io_bw and trivial_io_lq - so we can just call them from read*_relaxed without repeating the extern inline pattern. > > +static inline void writeb_relaxed(u8 b, volatile void __iomem *addr) > > +{ > > + mb(); > > + __raw_writeb(b, addr); > > +} > [etc.] > > Conversely for the write accessors, keeping in mind what I have noted > above, I suggest that you just redirect the existing aliases to the > ordinary accessors, as there will be no difference anymore between the > respective ordinary and relaxed accessors. That is: > > #define writeb_relaxed(b, addr) writeb(b, addr) Yes - that's a good point. > etc. > > Let me know if you have any further questions or comments. > > Maciej Mikulas