Re: [PATCH stable] MIPS: Loongson: Introduce and use loongson_llsc_mb()

Jiaxun Yang <jiaxun.yang@xxxxxxxxxxx> · Sat, 01 Aug 2020 19:48:48 +0800

On August 1, 2020 at 6:26:46 PM GMT+08:00, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>On Sat, Aug 01, 2020 at 02:34:43PM +0800, Jiaxun Yang wrote:
>> From: Huacai Chen <chenhc@xxxxxxxxxx>
>> 
>> commit e02e07e3127d8aec1f4bcdfb2fc52a2d99b4859e upstream.
>> 
>> On the Loongson-2G/2H/3A/3B there is a hardware flaw: ll/sc and
>> lld/scd are very weakly ordered. We should add sync instructions "before
>> each ll/lld" and "at the branch-target between ll/sc" to work around it.
>> Otherwise, this flaw will occasionally cause deadlocks (e.g. when doing
>> heavy load tests with LTP).
>> 
>> Below is the explanation from the CPU designer:
>> 
>> "For the Loongson 3 family, when a memory access instruction (load,
>> store, or prefetch) executes between the execution of LL and SC, the
>> success or failure of SC is not predictable. Although the programmer
>> would not insert memory access instructions between LL and SC, the
>> memory instructions before LL in program order may be dynamically
>> executed between the execution of LL/SC, so a memory fence (SYNC) is
>> needed before LL/LLD to avoid this situation.
>> 
>> Since Loongson-3A R2 (3A2000), we have improved our hardware design to
>> handle this case. But we later deduced a rare circumstance in which
>> some memory instructions executed speculatively between LL/SC due to
>> branch misprediction still fall into the above case, so a memory fence
>> (SYNC) at the branch-target (if its target is not between LL/SC) is
>> needed for Loongson 3A1000, 3B1500, 3A2000 and 3A3000.
>> 
>> Our processor is continually evolving and we aim to remove all these
>> workaround-SYNCs around LL/SC for new processors."
>> 
>> Here is an example:
>> 
>> Both cpu1 and cpu2 simultaneously run atomic_add by 1 on the same atomic
>> var. This bug causes the 'sc' run by both cpus (in atomic_add) to succeed
>> at the same time ('sc' returns 1), and the variable is sometimes only
>> *incremented by 1*, which is wrong and unacceptable (it should be
>> incremented by 2).
>> 
>> Why disable fix-loongson3-llsc in the compiler?
>> Because the compiler fix will cause problems in the kernel's __ex_table
>> section.
>> 
>> This patch fixes all the cases in the kernel, but:
>> 
>> +. the fix at the end of futex_atomic_cmpxchg_inatomic is for the
>> branch-target of 'bne'; there are other cases where
>> smp_mb__before_llsc() and smp_llsc_mb() fix the ll and the branch-target
>> coincidentally, such as atomic_sub_if_positive/cmpxchg/xchg, just like
>> this one.
>> 
>> +. Loongson 3 does not support CONFIG_EDAC_ATOMIC_SCRUB, so no need to
>> touch edac.h
>> 
>> +. local_ops and cmpxchg_local should not be affected by this bug since
>> only the owner can write.
>> 
>> +. mips_atomic_set for syscall.c is deprecated and rarely used, so just
>> leave it alone
>> 
>> Signed-off-by: Huacai Chen <chenhc@xxxxxxxxxx>
>> Signed-off-by: Huang Pei <huangpei@xxxxxxxxxxx>
>> [paul.burton@xxxxxxxx:
>>   - Simplify the addition of -mno-fix-loongson3-llsc to cflags, and add
>>     a comment describing why it's there.
>>   - Make loongson_llsc_mb() a no-op when
>>     CONFIG_CPU_LOONGSON3_WORKAROUNDS=n, rather than a compiler memory
>>     barrier.
>>   - Add a comment describing the bug & how loongson_llsc_mb() helps
>>     in asm/barrier.h.]
>> Signed-off-by: Paul Burton <paul.burton@xxxxxxxx>
>> Signed-off-by: Jiaxun Yang <jiaxun.yang@xxxxxxxxxxx>
>> Cc: Ralf Baechle <ralf@xxxxxxxxxxxxxx>
>> Cc: ambrosehua@xxxxxxxxx
>> Cc: Steven J . Hill <Steven.Hill@xxxxxxxxxx>
>> Cc: linux-mips@xxxxxxxxxxxxxx
>> Cc: Fuxin Zhang <zhangfx@xxxxxxxxxx>
>> Cc: Zhangjin Wu <wuzhangjin@xxxxxxxxx>
>> Cc: Li Xuefeng <lixuefeng@xxxxxxxxxxx>
>> Cc: Xu Chenghua <xuchenghua@xxxxxxxxxxx>
>> Cc: stable@xxxxxxxxxxxxxxx # 4.19
>> 
>> ---
>> Backport to stable according to request from Debian downstream.
>
>What do you mean by "request"?

The Debian guys asked us to backport this, if possible, to ensure system stability on the "buster" release.

>
>This feels like a new feature, why can't people just use the 5.4 kernel
>or newer?  Given that this issue has been fixed upstream for 1 1/2
>years, why does it need to go to the 4.19.y stable kernel now?

It is a workaround for a hardware bug, not a new feature...
We are only sending it because downstream asked us why 4.19 can't run flawlessly on certain systems...
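
For context, here is a rough userspace sketch of where the barrier sits relative to an ll/sc retry loop. It is not the kernel code: sketch_llsc_mb(), sketch_atomic_add() and SKETCH_LOONGSON3_WORKAROUNDS are hypothetical names made up for illustration, a portable C11 compare-and-swap loop stands in for the ll/sc pair, and a C11 fence stands in for the MIPS "sync" that the real workaround emits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for loongson_llsc_mb(). As described in the
 * patch notes, it is a no-op when the workaround is not configured;
 * here a C11 fence approximates the "sync" used when it is.
 */
#ifdef SKETCH_LOONGSON3_WORKAROUNDS
#define sketch_llsc_mb() atomic_thread_fence(memory_order_seq_cst)
#else
#define sketch_llsc_mb() do { } while (0)
#endif

/* Add i to *v and return the new value (illustration only). */
static int sketch_atomic_add(atomic_int *v, int i)
{
	int old, new;

	assert(v != NULL);

	/* The barrier goes before the "ll" (modelled by the load below). */
	sketch_llsc_mb();
	old = atomic_load_explicit(v, memory_order_relaxed);

	do {
		new = old + i;
		/*
		 * Models "sc": on failure, old is refreshed with the
		 * current value and the loop retries.
		 */
	} while (!atomic_compare_exchange_weak_explicit(v, &old, new,
							memory_order_relaxed,
							memory_order_relaxed));

	return new;
}
```

Without the workaround on affected hardware, the failure described in the commit message is that both CPUs see their "sc" succeed, so two such adds of 1 can leave the variable incremented only once. The sketch only shows the barrier placement; it cannot reproduce the hardware bug.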

Thanks.

>
>thanks,
>
>greg k-h

-- 
Jiaxun Yang