On Thu, May 7, 2020 at 4:09 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> On Thu, 7 May 2020, Arnd Bergmann wrote:
> > On Thu, May 7, 2020 at 10:06 AM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> >
> > Are you sure that it is in fact the timing that is important here and not
> > a barrier? I see that inb() is written in terms of readb(), but the
> > barrier requirements for I/O space are a bit different from those
> > on PCI memory space.
>
> The "in" and "out" instructions are serializing on x86. But alpha doesn't
> have dedicated instructions for accessing ports.
>
> Do you think that all the "in[bwl]" and "out[bwl]" macros on alpha should
> be protected by two memory barriers, to emulate the x86 behavior?

That's what we do on some other architectures to emulate the non-posted
behavior of out[bwl], as required by PCI. I can't think of any reasons to
have a barrier before in[bwl], or after write[bwl], but we generally want
one after out[bwl].

> > In the example you gave first, there is a an outb_p() followed by inb_p().
> > These are normally serialized by the bus, but I/O space also has the
> > requirement that an outb() completes before we get to the next
> > instruction (non-posted write), while writeb() is generally posted and
> > only needs a barrier before the write rather than both before and after
> > like outb.
>
> I think that the fact that "writeb" is posted is exactly the problem - it
> gets posted, the processor goes on, sends "readb" and they arrive
> back-to-back to the ISA bus. The ISA bus device doesn't like back-to-back
> accesses and locks up.
>
> Anyway - you can change the "ndelay()" function in this patch to "mb()" -
> "mb()" will provide long enough delay that it fixes this bug.

My preference would be to have whatever makes most sense in theory
and also fixes the problem. If there is some documentation that
says you need a certain amount of time between accesses regardless
of the barriers, then that is fine.
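
To make the barrier placement discussed above concrete (one barrier
before writeb(), one on each side of outb(), none before inb()), here is
a rough userspace sketch. It is only an illustration: the sketch_* names
and the fake_io array are made up, and atomic_thread_fence() stands in
for the kernel's mb(). It is not the alpha implementation and it says
nothing about what the hardware does with a posted write.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the I/O window; real code would go through
 * the architecture's port accessors instead of a plain array. */
static volatile uint8_t fake_io[0x10000];

/* Assumption: a sequentially consistent fence plays the role of mb(). */
static inline void sketch_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* MMIO-style write: barrier before the store only, so the store itself
 * may still be posted. */
static inline void sketch_writeb(uint8_t val, uint16_t off)
{
	sketch_mb();
	fake_io[off] = val;
}

/* Port-style write: barrier before and after the store, modelling the
 * requirement that the write is ordered before the next access. */
static inline void sketch_outb(uint8_t val, uint16_t port)
{
	sketch_mb();
	fake_io[port] = val;
	sketch_mb();
}

/* Port-style read: no barrier in front of the load. */
static inline uint8_t sketch_inb(uint16_t port)
{
	return fake_io[port];
}
```

With this shape, a sketch_outb() followed by sketch_inb() always has a
barrier between the store and the load, while sketch_writeb() followed
by a read has none, which is the back-to-back case described in the
quoted text.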

I do wonder if there is anything enforcing the "rpcc" in _delay() to
come after the store if there is no barrier between the two, otherwise
the delay method still seems unreliable.

The barrier after the store at least makes sense to me based on
the theory, both with and without a delay in outb_p().

      Arnd