From: Paul E. McKenney
> Sent: 21 October 2015 00:35
...
> There is also the question of whether the barrier forces ordering
> of unrelated stores, everything initially zero and all accesses
> READ_ONCE() or WRITE_ONCE():
>
>     P0          P1          P2                  P3
>     X = 1;      Y = 1;      r1 = X;             r3 = Y;
>                             some_barrier();     some_barrier();
>                             r2 = Y;             r4 = X;
>
> P2's and P3's ordering could be globally visible without requiring
> P0's and P1's independent stores to be ordered, for example, if you
> used smp_rmb() for some_barrier().  In contrast, if we used smp_mb()
> for barrier, everyone would agree on the order of P0's and P1's stores.
>
> There are actually a fair number of different combinations of
> aspects of memory ordering.  We will need to choose wisely.  ;-)

My thoughts on this are that most code probably isn't performance
critical enough to be using anything other than normal locks for
inter-cpu synchronisation.
Certainly most people are likely to get it wrong somewhere.
So you want a big red sticker saying 'Don't try to be too clever'.

Also, without examples of why things go wrong (e.g. member_consumer()
and alpha) it is difficult to understand the differences between
all the barriers (etc).

OTOH device driver code may need something slightly stronger than
barrier() (which I think is asm volatile("" ::: "memory")) to sequence
accesses to hardware devices (and to memory the hardware reads), but
without having a strong barrier in every ioread/write() access.

	David
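
As a rough illustration of the four-CPU example quoted above, the sketch below enumerates every interleaving of the six accesses that respects program order on P2 and P3, and reports whether a given set of register values can result. It assumes a sequentially consistent machine, which is a simplification standing in for the smp_mb() case; it does not model smp_rmb() or any real architecture, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Operations, numbered for the search (hypothetical encoding):
 *   0: P0  X = 1       1: P1  Y = 1
 *   2: P2  r1 = X      3: P2  r2 = Y
 *   4: P3  r3 = Y      5: P3  r4 = X
 */
enum { NOPS = 6 };

/* Execute one interleaving and compare the registers with e1..e4. */
static bool run_order(const int order[NOPS], int e1, int e2, int e3, int e4)
{
    int X = 0, Y = 0;
    int r1 = -1, r2 = -1, r3 = -1, r4 = -1;

    for (int i = 0; i < NOPS; i++) {
        switch (order[i]) {
        case 0: X = 1; break;
        case 1: Y = 1; break;
        case 2: r1 = X; break;
        case 3: r2 = Y; break;
        case 4: r3 = Y; break;
        case 5: r4 = X; break;
        default: assert(0 && "unknown operation"); break;
        }
    }
    return r1 == e1 && r2 == e2 && r3 == e3 && r4 == e4;
}

/* Depth-first search over all interleavings that keep program order. */
static bool search(int order[NOPS], int depth, unsigned used,
                   int e1, int e2, int e3, int e4)
{
    if (depth == NOPS)
        return run_order(order, e1, e2, e3, e4);

    for (int op = 0; op < NOPS; op++) {
        if (used & (1u << op))
            continue;
        /* P2: r2 = Y must follow r1 = X.  P3: r4 = X must follow r3 = Y. */
        if (op == 3 && !(used & (1u << 2)))
            continue;
        if (op == 5 && !(used & (1u << 4)))
            continue;
        order[depth] = op;
        if (search(order, depth + 1, used | (1u << op), e1, e2, e3, e4))
            return true;
    }
    return false;
}

/* True if some sequentially consistent interleaving yields these values. */
bool iriw_reachable_sc(int r1, int r2, int r3, int r4)
{
    int order[NOPS];

    return search(order, 0, 0u, r1, r2, r3, r4);
}
```

Under this model, iriw_reachable_sc(1, 0, 1, 0) is false: P2 seeing X before Y while P3 sees Y before X would need the two stores to be ordered both ways. With only a read barrier on P2 and P3, a weakly ordered machine is not bound by this enumeration, which is the distinction the quoted text draws.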