On Tue, Jul 27, 2021 at 10:29:59AM +0800, Boqun Feng wrote:

> > "How to implement xchg_tail" shouldn't force with _Q_PENDING_BITS, but
> > the arch could choose.
>
> I actually agree with this part, but this patchset failed to provide
> enough evidence on why we should choose xchg_tail() implementation
> based on whether hardware has xchg16, more precisely, for an architecture
> which doesn't have a hardware xchg16, why cmpxchg emulated xchg16() is
> worse than a "load+cmpxchg" implemented xchg_tail()? If it's a
> performance reason, please show some numbers.

Right. Their problem is their broken xchg16() implementation.
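
For reference, the two shapes being compared can be sketched in userspace C11 roughly as below. This is only an illustration, not the kernel code or any architecture's actual implementation: the names xchg_tail_cmpxchg(), emulated_xchg16() and xchg_tail_xchg16(), and the TAIL_OFFSET/TAIL_MASK layout (tail in the upper 16 bits of a 32-bit word), are assumptions made for the example. It says nothing about which variant performs better; that still needs numbers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout for illustration: tail lives in the upper 16 bits. */
#define TAIL_OFFSET 16
#define TAIL_MASK   0xffff0000u

/*
 * Variant A (hypothetical name): "load+cmpxchg" xchg_tail().
 * Loop on a 32-bit compare-and-swap, replacing only the tail bits
 * and preserving the rest of the word. Returns the old tail bits.
 */
static uint32_t xchg_tail_cmpxchg(_Atomic uint32_t *lock, uint32_t tail)
{
    uint32_t old = atomic_load_explicit(lock, memory_order_relaxed);
    uint32_t new;

    do {
        new = (old & ~TAIL_MASK) | tail;
        /* On failure, 'old' is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak_explicit(lock, &old, new,
                 memory_order_relaxed, memory_order_relaxed));

    return old & TAIL_MASK;
}

/*
 * Hypothetical xchg16() emulated with a 32-bit cmpxchg on the
 * containing word, for hardware without a native 16-bit exchange.
 */
static uint16_t emulated_xchg16(_Atomic uint32_t *word, unsigned shift,
                                uint16_t val)
{
    uint32_t mask = 0xffffu << shift;
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t new;

    do {
        new = (old & ~mask) | ((uint32_t)val << shift);
    } while (!atomic_compare_exchange_weak_explicit(word, &old, new,
                 memory_order_relaxed, memory_order_relaxed));

    return (uint16_t)((old & mask) >> shift);
}

/* Variant B (hypothetical name): xchg_tail() built on the emulated xchg16(). */
static uint32_t xchg_tail_xchg16(_Atomic uint32_t *lock, uint32_t tail)
{
    uint16_t old = emulated_xchg16(lock, TAIL_OFFSET,
                                   (uint16_t)(tail >> TAIL_OFFSET));

    return (uint32_t)old << TAIL_OFFSET;
}
```

Written this way, both variants end up as the same kind of 32-bit cmpxchg retry loop, which is why a claim that one is worse than the other needs to be backed by measurements.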