Re: [PATCH 1/2 v3] alpha: add a delay to inb_p, inb_w and inb_l

Mikulas Patocka <mpatocka@xxxxxxxxxx> · Fri, 22 May 2020 16:00:41 -0400 (EDT)

On Fri, 22 May 2020, Mikulas Patocka wrote:

> On Wed, 13 May 2020, Ivan Kokshaysky wrote:
> 
> > On Mon, May 11, 2020 at 03:58:24PM +0100, Maciej W. Rozycki wrote:
> > >  Individual PCI port locations correspond to different MMIO locations, so 
> > > yes, accesses to these can be reordered (merging won't happen due to the 
> > > use of the sparse address space).
> > 
> > Correct, it's how Alpha write buffers work. According to 21064 hardware
> > reference manual, these buffers are flushed when one of the following
> > conditions is met:
> > 
> > 1) The write buffer contains at least two valid entries.
> > 2) The write buffer contains one valid entry and at least 256 CPU cycles
> >    have elapsed since the execution of the last write buffer-directed
> >    instruction.
> > 3) The write buffer contains an MB, STQ_C or STL_C instruction.
> > 4) A load miss is pending to an address currently valid in the write
> >    buffer that requires the write buffer to be flushed.
> > 
> > I'm certain that in these rtc/serial cases we've got readX arriving
> > to device *before* preceeding writeX because of 2). That's why small
> > delay (300-1400 ns, apparently depends on CPU frequency) seemingly
> > "fixes" the problem. The 4) is not met because loads and stores are
> > to different ports, and 3) has been broken by commit 92d7223a74.
> > 
> > So I believe that correct fix would be to revert 92d7223a74 and
> > add wmb() before [io]writeX macros to meet memory-barriers.txt
> > requirement. The "wmb" instruction is cheap enough and won't hurt
> > IO performance too much.
> > 
> > Ivan.
> 
> I agree ... and what about readX_relaxed and writeX_relaxed? According to 
> the memory-barriers specification, the _relaxed functions must be ordered 
> w.r.t. each other. If Alpha can't keep them ordered, they should have 
> barriers between them too.
> 
> Mikulas

I have found the chapter about I/O in the Alpha reference manual, in the 
section "5.6.4.7 Implications for Memory Mapped I/O", it says:

To reliably communicate shared data from a processor to an I/O device:
1. Write the shared data to a memory-like physical memory region on the 
processor.
2. Execute an MB instruction.
3. Write a flag (equivalently, send an interrupt or write a register 
location implemented in the I/O device).

The receiving I/O device must:
1. Read the flag (equivalently, detect the interrupt or detect the write 
to the register location implemented in the I/O device).
2. Execute the equivalent of an MB.
3. Read the shared data.

So, we must use MB, not WMB when ordering I/O accesses.

The WMB instruction specification says that it orders writes to 
memory-like locations with other writes to memory-like locations. And 
writes to non-memory-like locations with other writes to non-memory-like 
locations. But it doesn't order memory-like writes with I/O-like writes.

The section 5.6.3 claims that there are no implied barriers.

So, if we want to support the specifications:

read*_relaxed and write*_relaxed need at least one barrier before or after 
(memory-barriers.txt says that they must be ordered with each other, the 
alpha specification doesn't specify ordering).

read and write - read must have a barrier after it, write before it. There 
must be one more barrier before a read or after a write, to make sure that 
there is barrier between write+read.

Mikulas