On 01/15/2016 01:57 AM, Will Deacon wrote:
> Paul, I think you figured this out while I was sleeping, but just to
> confirm:
>
>  1. The MIPS64 ISA doc [1] talks about SYNC in a way that applies only
>     to memory accesses appearing in *program-order* before the SYNC
>
>  2. We need WRC+sync+addr to work, which means that the SYNC in P1 must
>     also capture the store in P0 as being "before" the barrier. Leonid
>     reckons it works, but his explanation [2] focussed on the address
>     dependency in P2 as to why this works. If that is the case (i.e.
>     address dependency provides global transitivity), then WRC+addr+addr
>     should also work (even though it's not required).
No, that is not correct. There is one old design that lets a thread's
loads read from the core's write buffers (shared by thread0 and thread1)
before the stored data is visible to other cores. This means that
WRC+sync+addr passes because of the SYNC in the writing thread and the
register dependency in the other thread, but WRC+addr+addr may fail
because another core may get stale data.
>  3. It seems that WRC+addr+addr doesn't work, so I'm still suspicious
>     about WRC+sync+addr, because neither the architecture document nor
>     Leonid's explanation tells me that it should be forbidden.
>
> Will
>
> [1] https://imgtec.com/?do-download=4302
> [2] http://lkml.kernel.org/r/569565DA.2010903@xxxxxxxxxx (scroll to the end)
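
For anyone following along, the WRC+sync+addr shape can be sketched as a
small user-space test. This is only an illustration, not kernel or MIPS
code, and it rests on several assumptions: C11 atomics stand in for the
hardware accesses, a seq_cst fence stands in for SYNC, and an acquire load
stands in for the address dependency in P2 (which is stronger than a real
dependency). The function name wrc_forbidden_count() and the thread
functions p0/p1/p2 are made up for this sketch. The outcome in question is
r1 == 1 && r2 == 1 && r3 == 0.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, y;   /* shared locations, reset every round */
static int r1, r2, r3;    /* each written by one thread, read after join */

/* P0: the lone store to x. */
static void *p0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    return NULL;
}

/* P1: read x, full fence (assumed stand-in for SYNC), then store y. */
static void *p1(void *arg)
{
    (void)arg;
    r1 = atomic_load_explicit(&x, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    return NULL;
}

/* P2: read y, then read x. Acquire is an assumed stand-in for the
 * address dependency between the two loads. */
static void *p2(void *arg)
{
    (void)arg;
    r2 = atomic_load_explicit(&y, memory_order_acquire);
    r3 = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Run the test `rounds` times. Returns how many rounds showed
 * r1 == 1 && r2 == 1 && r3 == 0, or -1 if a thread could not be created. */
int wrc_forbidden_count(int rounds)
{
    void *(*fn[3])(void *) = { p0, p1, p2 };
    int forbidden = 0;

    for (int i = 0; i < rounds; i++) {
        pthread_t t[3];
        int started = 0;

        atomic_store(&x, 0);
        atomic_store(&y, 0);
        r1 = r2 = r3 = 0;

        while (started < 3 &&
               pthread_create(&t[started], NULL, fn[started], NULL) == 0)
            started++;
        for (int j = 0; j < started; j++)
            pthread_join(t[j], NULL);
        if (started < 3)
            return -1;

        if (r1 == 1 && r2 == 1 && r3 == 0)
            forbidden++;
    }
    return forbidden;
}
```

Calling wrc_forbidden_count() from a main() and building with -pthread
gives a count that the C11 model requires to be zero for this fenced
variant. That is a statement about the C11 model only; it says nothing
about what SYNC guarantees on a given MIPS core, which is the open question
in this thread. Spawning threads per round also gives very little overlap,
so this is a shape illustration rather than a stress test.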