On Wed, Jun 20, 2018 at 09:17:16AM +0100, Will Deacon wrote:
> On Wed, Jun 20, 2018 at 11:31:55AM +0800, 陈华才 wrote:
> > Loongson-3's Store Fill Buffer is nearly the same as your "Store Buffer",
> > and it increases the memory ordering weakness. So, smp_cond_load_acquire()
> > only need a __smp_mb() before the loop, not after every READ_ONCE(). In
> > other word, the following code is just OK:
> >
> > #define smp_cond_load_acquire(ptr, cond_expr)			\
> > ({								\
> > 	typeof(ptr) __PTR = (ptr);				\
> > 	typeof(*ptr) VAL;					\
> > 	__smp_mb();						\
> > 	for (;;) {						\
> > 		VAL = READ_ONCE(*__PTR);			\
> > 		if (cond_expr)					\
> > 			break;					\
> > 		cpu_relax();					\
> > 	}							\
> > 	__smp_mb();						\
> > 	VAL;							\
> > })
> >
> > the __smp_mb() before loop is used to avoid "reads prioritised over
> > writes", which is caused by SFB's weak ordering and similar to ARM11MPCore
> > (mentioned by Will Deacon).
>
> Sure, but smp_cond_load_acquire() isn't the only place you'll see this sort
> of pattern in the kernel. In other places, the only existing arch hook is
> cpu_relax(), so unless you want to audit all loops and add a special
> MIPS-specific smp_mb() to those that are affected, I think your only option
> is to stick it in cpu_relax().
>
> I assume you don't have a control register that can disable this
> prioritisation in the SFB?

Right, and I think we also want to make clear that this 'feature' is not
supported by the Linux kernel in general, or by the LKMM in particular. It
really is a CPU bug, and the cpu_relax() change is a best-effort workaround.