On Mon, 20 Aug 2018, Mikulas Patocka wrote:

> > > I observed that not every kernel with the patch
> > > 92d7223a74235054f2aa7227d207d9c57f84dca0 fails, some of them get stuck
> > > only at boot, some get stuck only at shutdown, some not at all. Although
> > > all the kernels with this patch reverted work.
> > >
> > > So the patch may have uncovered some timing problem somewhere.
> > >
> > > x86 has the function io_delay that injects delays between I/O accesses for
> > > hardware that needs it - does alpha have something like this?
> >
> > The I/O delay would be very low on my list of possible root causes
> > for this, hardly any hardware at all relies on it, and all uses I see
> > are related to outb(), which you've already shown not to be the problem
> > with my test patch.
>
> The lockup happens somewhere in the function autoconfig in
> drivers/tty/serial/8250/8250_port.c, but I don't know where exactly
> because serial console doesn't work while the port is being probed.

 I've had a look at commit 92d7223a7423 ("alpha: io: reorder barriers to
guarantee writeX() and iowriteX() ordering #2") and I note this:

    memory-barriers.txt has been updated with the following requirement.

    "When using writel(), a prior wmb() is not needed to guarantee that the
    cache coherent memory writes have completed before writing to the MMIO
    region."

which clearly describes an ordering barrier implication between a memory
write and a subsequent MMIO write issued with `writeX', e.g. a scenario
corresponding to updating one or more DMA descriptors in memory followed
by a write to a device's MMIO register to fire the DMA engine in order to
interpret the DMA descriptor ring changes.

 It does not mention anything about the barriers between MMIO accesses
that we used to have after `writeX' operations though.  Therefore I bet
there's a barrier missing from the 8250 serial driver that these barriers
in `writeX' accessors covered.
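
 To make the difference in barrier placement concrete, here is a small
user-space model, not kernel code.  It only records the order in which a
barrier, an MMIO store and an MMIO load would be issued.  All the names
are hypothetical, and the read is modelled with no barrier of its own,
which is an assumption made to keep the sketch short.

```c
/*
 * User-space model only: records the order of operations as a string.
 * 'B' = full barrier (mb), 'W' = raw MMIO store, 'R' = raw MMIO load.
 * Nothing here touches hardware; all names are hypothetical.
 */
#include <assert.h>
#include <string.h>

#define TRACE_MAX 32

static char trace[TRACE_MAX];

static void emit(char op)
{
	size_t n = strlen(trace);

	assert(n + 1 < TRACE_MAX);	/* keep room for the terminating NUL */
	trace[n] = op;
	trace[n + 1] = '\0';
}

static void model_mb(void)        { emit('B'); }
static void model_raw_write(void) { emit('W'); }
/* Assumption: the read is modelled with no barrier of its own. */
static void model_raw_read(void)  { emit('R'); }

/* Trailing barrier only, as `writeX' had it before the commit. */
static void writex_trailing(void) { model_raw_write(); model_mb(); }

/* Leading barrier only, as `writeX' has it after the commit. */
static void writex_leading(void)  { model_mb(); model_raw_write(); }

/* Leading barrier kept and trailing barrier restored. */
static void writex_both(void)     { model_mb(); model_raw_write(); model_mb(); }

/* One register write followed by one register read, as a probe might do. */
static const char *write_then_read(void (*writex)(void))
{
	trace[0] = '\0';
	writex();
	model_raw_read();
	return trace;
}
```

 With `writex_trailing' the recorded order is "WBR", so a barrier sits
between the write and the read that follows it.  With `writex_leading'
it is "BWR", and nothing separates the two MMIO accesses any more, which
is the kind of gap a driver could have been silently relying on.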

 As a quick fix we can restore the trailing barriers (although as I noted
in the other e-mail they are superfluous according to io_ordering.txt)
while keeping the leading barriers as well, but I think the problem with
the 8250 serial driver should be narrowed down too (to make the driver
only expect the semantics guaranteed by io_ordering.txt and not more),
probably by careful eyeballing.

 This may be debuggable too: having the trailing barrier issued
conditionally, depending on `addr' and `__builtin_return_address', along
with a `printk', may help narrow this down further.

 Also, according to the update to memory-barriers.txt quoted above, the
leading barriers in `writeX' ought to be the weaker `wmb' operations
rather than `mb', as we are not supposed to care about r/w ordering here.

  Maciej
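
 For reference, the conditional-barrier debugging idea could look roughly
like the user-space sketch below.  It is an illustration under stated
assumptions, not a patch: `struct barrier_filter', `debug_writeb' and
`want_trailing_barrier' are hypothetical names, `__sync_synchronize'
stands in for `mb', and `fprintf' stands in for `printk'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical filter: which MMIO addresses and callers get the barrier. */
struct barrier_filter {
	uintptr_t addr_lo, addr_hi;	/* [addr_lo, addr_hi): device window */
	uintptr_t ra_lo, ra_hi;		/* [ra_lo, ra_hi): caller text range */
};

static bool in_range(uintptr_t v, uintptr_t lo, uintptr_t hi)
{
	return v >= lo && v < hi;
}

static bool want_trailing_barrier(const struct barrier_filter *f,
				  uintptr_t addr, uintptr_t ra)
{
	return in_range(addr, f->addr_lo, f->addr_hi) &&
	       in_range(ra, f->ra_lo, f->ra_hi);
}

/* Counts how many times the conditional trailing barrier was issued. */
static unsigned int barriers_issued;

/* Stand-in for a `writeX'-style accessor with a conditional trailing barrier. */
static __attribute__((noinline)) void
debug_writeb(const struct barrier_filter *f, uint8_t val,
	     volatile uint8_t *addr)
{
	uintptr_t ra = (uintptr_t)__builtin_return_address(0);

	assert(f && addr);

	__sync_synchronize();		/* leading barrier, always issued */
	*addr = val;

	if (want_trailing_barrier(f, (uintptr_t)addr, ra)) {
		__sync_synchronize();	/* trailing barrier under test */
		barriers_issued++;
		/* printk stand-in: log which access got the barrier */
		fprintf(stderr, "trailing barrier: addr=%p caller=%p\n",
			(void *)(uintptr_t)addr, (void *)ra);
	}
}
```

 The intended use would be to start with a filter that matches every
caller, confirm the lockup goes away, and then keep halving the caller
range until the one call site that needs the barrier is left.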