On Wed, Jul 11, 2018 at 06:05:51PM +0800, Jiaxun Yang wrote:
> On 2018-7-10 Tue at 20:17:27, Peter Zijlstra wrote:
>
> Hi Peter,
> Since Huacai is unable to send email via his client, I'm going to reply for him.
>
> > Sure.. we all got that far. And no, this isn't the _real_ problem. This
> > is a manifestation of the problem.
> >
> > The problem is that your SFB is broken (per the Linux requirements). We
> > require that stores will become visible. That is, they must not
> > indefinitely (for whatever reason) stay in the store buffer.
> >
> > > I don't think this is a hardware bug; by design, the SFB will be flushed to
> > > L1 cache in three cases:
> > >
> > > 1, data in SFB is full (be a complete cache line);
> > > 2, there is a subsequent read access in the same cache line;
> > > 3, a 'sync' instruction is executed.
> >
> > And I think this _is_ a hardware bug. You just designed the bug instead
> > of it being by accident.
>
> Yes, we understand that this hardware feature is not supported by LKML,
> so it should be a hardware bug for LKML.
>
> > It doesn't happen on _any_ other architecture except that dodgy
> > ARM11MPCore part. Linux hard relies on stores to become available
> > _eventually_.
> >
> > Still, even with the rules above, the best work-around is still the very
> > same cpu_relax() hack.
>
> As you say, SFB makes Loongson not fully SMP-coherent.
> Modifying cpu_relax() can solve the current problem, but it is not so
> straightforward. On the other hand, providing a Loongson-specific
> WRITE_ONCE looks more reasonable, because it eliminates the "non-coherency".
> So we can solve the bug at the root.

Curious, but why is it not straightforward to hack cpu_relax()? If you try
to hack WRITE_ONCE, you also need to hack atomic_set, atomic64_set and all
the places that should be using WRITE_ONCE but aren't ;~)

Will
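
For illustration only, here is a minimal user-space sketch of the pattern under discussion; it is not kernel code and not anything posted in this thread. It assumes GCC or Clang builtins, and the names flag_a, flag_b and run_handshake are hypothetical. WRITE_ONCE, READ_ONCE and cpu_relax here are user-space stand-ins, not the kernel definitions. The full barrier inside cpu_relax() stands in for a Loongson 'sync': a CPU that has just stored and then spins would drain its own store buffer on each iteration, so one change covers every spin loop regardless of how the store was issued.

```c
#include <pthread.h>

/* User-space stand-ins for the kernel macros (assumption: GCC/Clang builtins). */
#define WRITE_ONCE(x, val) __atomic_store_n(&(x), (val), __ATOMIC_RELAXED)
#define READ_ONCE(x)       __atomic_load_n(&(x), __ATOMIC_RELAXED)

/*
 * Hypothetical cpu_relax(): a full barrier, standing in for a 'sync'
 * that would flush the spinning CPU's own pending stores.
 */
static inline void cpu_relax(void)
{
	__sync_synchronize();
}

static int flag_a, flag_b;

/* Stores flag_a, then spins on a different location (flag_b). */
static void *thread_a(void *arg)
{
	(void)arg;
	WRITE_ONCE(flag_a, 1);
	while (!READ_ONCE(flag_b))
		cpu_relax();
	return NULL;
}

/* Waits for thread_a's store to become visible, then releases it. */
static void *thread_b(void *arg)
{
	(void)arg;
	while (!READ_ONCE(flag_a))
		cpu_relax();
	WRITE_ONCE(flag_b, 1);
	return NULL;
}

/* Runs the handshake once; returns 2 on success, -1 if thread setup fails. */
int run_handshake(void)
{
	pthread_t a, b;

	WRITE_ONCE(flag_a, 0);
	WRITE_ONCE(flag_b, 0);
	if (pthread_create(&a, NULL, thread_a, NULL))
		return -1;
	if (pthread_create(&b, NULL, thread_b, NULL)) {
		WRITE_ONCE(flag_b, 1);	/* unblock thread_a before bailing out */
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return READ_ONCE(flag_a) + READ_ONCE(flag_b);
}
```

On hardware where stores always become visible eventually, the handshake completes with or without the barrier. The sketch only shows where the barrier would sit: in the one place every waiter already calls, rather than in each store primitive (WRITE_ONCE, atomic_set, atomic64_set, and plain stores that should have been WRITE_ONCE).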