On Wed, May 13, 2020 at 03:41:28PM +0100, Ivan Kokshaysky wrote:
> On Mon, May 11, 2020 at 03:58:24PM +0100, Maciej W. Rozycki wrote:
> > Individual PCI port locations correspond to different MMIO locations, so
> > yes, accesses to these can be reordered (merging won't happen due to the
> > use of the sparse address space).
>
> Correct, it's how Alpha write buffers work. According to 21064 hardware
> reference manual, these buffers are flushed when one of the following
> conditions is met:
>
> 1) The write buffer contains at least two valid entries.
> 2) The write buffer contains one valid entry and at least 256 CPU cycles
>    have elapsed since the execution of the last write buffer-directed
>    instruction.
> 3) The write buffer contains an MB, STQ_C or STL_C instruction.
> 4) A load miss is pending to an address currently valid in the write
>    buffer that requires the write buffer to be flushed.
>
> I'm certain that in these rtc/serial cases we've got readX arriving
> to device *before* preceding writeX because of 2). That's why small
> delay (300-1400 ns, apparently depends on CPU frequency) seemingly
> "fixes" the problem. The 4) is not met because loads and stores are
> to different ports, and 3) has been broken by commit 92d7223a74.
>
> So I believe that correct fix would be to revert 92d7223a74 and
> add wmb() before [io]writeX macros to meet memory-barriers.txt
> requirement. The "wmb" instruction is cheap enough and won't hurt
> IO performance too much.

I agree, that sounds easier. Then work with the authors of
memory-barriers.txt in order to straighten things out.

greg k-h
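
For reference, the four flush conditions quoted above can be written down
as a small predicate. This is only a toy model for reasoning about the
rtc/serial case, not kernel code and not a description of the real
hardware logic: the struct, its field names, the enum and wb_why_flush()
are all made up for illustration, and the order in which the conditions
are checked is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the write buffer; field names are illustrative. */
struct wb_state {
	unsigned int valid_entries;      /* valid entries currently buffered */
	unsigned int cycles_since_write; /* CPU cycles since the last write
					  * buffer-directed instruction */
	bool has_barrier;                /* buffer holds an MB, STQ_C or STL_C */
	bool load_miss_hits_entry;       /* load miss pending to an address
					  * that is valid in the buffer */
};

/* Values mirror the numbering of the conditions in the quoted list. */
enum wb_flush_reason {
	WB_NO_FLUSH    = 0,
	WB_TWO_ENTRIES = 1,
	WB_TIMEOUT     = 2,
	WB_BARRIER     = 3,
	WB_LOAD_MISS   = 4,
};

/* Return the first matching flush condition, or WB_NO_FLUSH if none holds. */
static enum wb_flush_reason wb_why_flush(const struct wb_state *wb)
{
	assert(wb != NULL);

	if (wb->valid_entries >= 2)
		return WB_TWO_ENTRIES;
	if (wb->valid_entries == 1 && wb->cycles_since_write >= 256)
		return WB_TIMEOUT;
	if (wb->has_barrier)
		return WB_BARRIER;
	if (wb->load_miss_hits_entry)
		return WB_LOAD_MISS;
	return WB_NO_FLUSH;
}
```

In this model, a single buffered writeX followed shortly by a readX to a
different port gives { 1, <256, false, false }, so nothing forces a flush
and the read can reach the device first. Setting has_barrier, which is
what a wmb() before the write would correspond to here, yields WB_BARRIER.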