On Sat, Aug 18, 2018 at 12:05 AM Maciej W. Rozycki <macro@xxxxxxxxxxxxxx> wrote: > > On Fri, 17 Aug 2018, Arnd Bergmann wrote: > > > - For outb()/outw()/outl(), we ought to provide stronger barriers than for > > writeb()/writew()/writel(), as PCI drivers should expect the store to have > > arrived at the device by the time the function returns. > > > > Could you try the patch below that would address the second issue but not > > the first? > > > > Arnd > > > > diff --git a/arch/alpha/include/asm/io.h b/arch/alpha/include/asm/io.h > > index 4c533fc94d62..0a479ac27cab 100644 > > --- a/arch/alpha/include/asm/io.h > > +++ b/arch/alpha/include/asm/io.h > > @@ -364,11 +364,13 @@ extern inline u16 inw(unsigned long port) > > extern inline void outb(u8 b, unsigned long port) > > { > > iowrite8(b, ioport_map(port, 1)); > > + mb(); > > But is `mb' the right kind of high-level barrier for MMIO rather than > regular memory access? > > I believe it is not and so I was told in the past when an ordering > barrier was required for MMIO in a network driver for a weakly ordered > system. I think different kind of barrier is required here, such as > `mmiowb', if appropriate. Of course at the low level it may indeed expand > to an MB hardware instruction, but that's another matter, we need to > abstract things properly. What's important here is that for this experiment, adding mb() puts the barrier back that used to be there before the regression in commit 92d7223a74235. Once we know whether that helps, we can debate what the problem is and how it should be addressed correctly. > A while ago I proposed a set of different MMIO barriers, that some > systems may require, corresponding to the respective regular memory > barriers, but in the I/O context. I never got to implementing that > proposal, but I still think it's the right thing to do and will see if I > can find some time to try doing that. Right now we have quite a mess with > that. :( My understanding for mmiowb() is that it should not be implied by writel() itself but that it would be added between a writel() and a spin_unlock(). Putting it into writel() itself would make it completely pointless as an abstraction. On some architectures, we have iowmb()/iormb() as a subset of wmb()/rmb(), and those are the ones that are implied by readl()/writel() so serialize them against DMA (but not against spinlocks). Arnd