On Tue, 21 Aug 2018, Arnd Bergmann wrote:

> On Tue, Aug 21, 2018 at 3:40 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> > On Tue, 21 Aug 2018, Arnd Bergmann wrote:
> > > On Mon, Aug 20, 2018 at 11:42 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> > > > On Mon, 20 Aug 2018, Arnd Bergmann wrote:
> > > > > On Mon, Aug 20, 2018 at 4:17 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> > > > > > On Sun, 19 Aug 2018, okaya@xxxxxxxxxxxxxx wrote:
> > >
> > > Ok, this does strongly suggest that it is the outb() operation that I
> > > suspected after all, I just sent you a wrong patch to test, failing
> > > to realize that alpha has two implementations of outb, and that the
> > > extern one is the one that gets used in a defconfig build.
> > >
> > > Could you try again with this patch added in? (Sorry for the whitespace
> > > damage, you'll have to apply it by hand). Presumably a wmb()
> > > is sufficient here, but I'm trying to play safe here by restoring the
> > > barrier that was part of outb() before it broke.
> > >
> > > Arnd
> >
> > This patch fixes both hangs.
>
> Ok, thanks for confirming. Now the question is whether this is only needed
> for I/O space writes to ensure that an outb() etc completes before we start
> the next instruction as in my first theory above, or if the readl()
> getting moved ahead of a prior writel() as Maciej suggested could also
> happen for memory space.
>
> My guess would be that even on something as weakly ordered as Alpha,
> the PCI semantics are kept that a load from a non-prefetchable PCI MMIO
> space can not get moved ahead of a preceding store to the same PCI
> device from the same CPU, but I don't really know enough about alpha.
>
> Arnd

It's hard to tell. The Alpha manual says that only overlapping accesses
are ordered. I did some tests on the framebuffer and found that
"read+read+write+write" is faster than "read+write+read+write", which
may suggest that the reads flush the write queue.
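For illustration only, the two access patterns could look roughly like the sketch below. This is not the actual test: the function names copy_rrww() and copy_rwrw() are hypothetical, the loops run on ordinary memory rather than a mapped framebuffer, and the assumption that reads and writes go to separate buffers is a guess. The volatile qualifiers only stop the compiler from merging or reordering the accesses; they say nothing about what the CPU or the bus does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Pattern "read+read+write+write": both loads of a pair are issued
 * before either store. Hypothetical helper, not taken from the thread.
 */
void copy_rrww(volatile uint32_t *dst, const volatile uint32_t *src, size_t n)
{
    size_t i;

    for (i = 0; i + 1 < n; i += 2) {
        uint32_t a = src[i];      /* read  */
        uint32_t b = src[i + 1];  /* read  */
        dst[i] = a;               /* write */
        dst[i + 1] = b;           /* write */
    }
    if (n & 1)                    /* odd length: copy the last word */
        dst[n - 1] = src[n - 1];
}

/*
 * Pattern "read+write+read+write": every load is immediately followed
 * by its store. Hypothetical helper, not taken from the thread.
 */
void copy_rwrw(volatile uint32_t *dst, const volatile uint32_t *src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = src[i];          /* read, then write */
}
```

Both functions copy the same data, so any timing difference between them on real hardware would come from the order of the accesses alone. On ordinary cached memory no difference should be expected; the effect described above would only show up on device memory.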
arm and arm64 also use a barrier after a read and before a write.

Mikulas