Re: [PATCH 0/3] MIPS: SMP memory barriers: lightweight sync, acquire-release

Joshua Kinard <kumba@xxxxxxxxxx> · Tue, 02 Jun 2015 14:59:38 -0400

On 06/02/2015 05:59, Ralf Baechle wrote:
> On Tue, Jun 02, 2015 at 04:41:21AM -0400, Joshua Kinard wrote:
> 
>> On 06/01/2015 20:09, Leonid Yegoshin wrote:
>>> The following series implements lightweight SYNC memory barriers for SMP Linux
>>> and a correct use of SYNCs around atomics, futexes, spinlocks etc LL-SC loops -
>>> the basic building blocks of any atomics in MIPS.
>>>
>>> Historically, a generic MIPS doesn't use memory barriers around LL-SC loops in
>>> atomics, spinlocks etc. However, Architecture documents never specify that LL-SC
>>> loop creates a memory barrier. Some non-generic MIPS vendors already feel
>>> the pain and enforces it. With introduction in a recent out-of-order superscalar
>>> MIPS processors an aggressive speculative memory read it is a problem now.
>>>
>>> The generic MIPS memory barrier instruction SYNC (aka SYNC 0) is something
>>> very heavvy because it was designed for propogating barrier down to memory.
>>> MIPS R2 introduced lightweight SYNC instructions which correspond to smp_*()
>>> set of SMP barriers. The description was very HW-specific and it was never
>>> used, however, it is much less trouble for processor pipelines and can be used
>>> in smp_mb()/smp_rmb()/smp_wmb() as is as in acquire/release barrier semantics.
>>> After prolonged discussions with HW team it became clear that lightweight SYNCs
>>> were designed specifically with smp_*() in mind but description is in timeline
>>> ordering space.
>>>
>>> So, the problem was spotted recently in engineering tests and it was confirmed
>>> with tests that without memory barrier load and store may pass LL/SC
>>> instructions in both directions, even in old MIPS R2 processors.
>>> Aggressive speculation in MIPS R6 and MIPS I5600 processors adds more fire to
>>> this issue.
>>>
>>> 3 patches introduces a configurable control for lightweight SYNCs around LL/SC
>>> loops and for MIPS32 R2 it was allowed to choose an enforcing SYNCs or not
>>> (keep as is) because some old MIPS32 R2 may be happy without that SYNCs.
>>> In MIPS R6 I chose to have SYNC around LL/SC mandatory because all of that
>>> processors have an agressive speculation and delayed write buffers. In that
>>> processors series it is still possible the use of SYNC 0 instead of
>>> lightweight SYNCs in configuration - just in case of some trouble in
>>> implementation in specific CPU. However, it is considered safe do not implement
>>> some or any lightweight SYNC in specific core because Architecture requires
>>> HW map of unimplemented SYNCs to SYNC 0.
>>
>> How useful might this be for older hardware, such as the R10k CPUs?  Just
>> fallbacks to the old sync insn?
> 
> The R10000 family is strongly ordered so there is no SYNC instruction
> required in the entire kernel even though some Origin hardware documentation
> incorrectly claims otherwise.

So no benefits even in the speculative execution case on noncoherent hardware
like IP28 and IP32?

--J