Nicholas Piggin <npiggin@xxxxxxxxx> writes: > Will Deacon's on March 2, 2019 12:03 am: >> In preparation for removing all explicit mmiowb() calls from driver >> code, implement a tracking system in asm-generic based loosely on the >> PowerPC implementation. This allows architectures with a non-empty >> mmiowb() definition to have the barrier automatically inserted in >> spin_unlock() following a critical section containing an I/O write. > > Is there a reason to call this "mmiowb"? We already have wmb that > orders cacheable stores vs mmio stores don't we? > > Yes ia64 "sn2" is broken in that case, but that can be fixed (if > anyone really cares about the platform any more). Maybe that's > orthogonal to what you're doing here, I just don't like seeing > "mmiowb" spread. > > This series works for spin locks, but you would want a driver to > be able to use wmb() to order locks vs mmio when using a bit lock > or a mutex or whatever else. Calling your wmb-if-io-is-pending > version io_mb_before_unlock() would kind of match with existing > patterns. > >> +static inline void mmiowb_set_pending(void) >> +{ >> + struct mmiowb_state *ms = __mmiowb_state(); >> + ms->mmiowb_pending = ms->nesting_count; >> +} >> + >> +static inline void mmiowb_spin_lock(void) >> +{ >> + struct mmiowb_state *ms = __mmiowb_state(); >> + ms->nesting_count++; >> +} >> + >> +static inline void mmiowb_spin_unlock(void) >> +{ >> + struct mmiowb_state *ms = __mmiowb_state(); >> + >> + if (unlikely(ms->mmiowb_pending)) { >> + ms->mmiowb_pending = 0; >> + mmiowb(); >> + } >> + >> + ms->nesting_count--; >> +} > > Humour me for a minute and tell me what this algorithm is doing, or > what was broken about the powerpc one, which is basically: > > static inline void mmiowb_set_pending(void) > { > struct mmiowb_state *ms = __mmiowb_state(); > ms->mmiowb_pending = 1; > } > > static inline void mmiowb_spin_lock(void) > { > } The current powerpc code clears io_sync in spin_lock(). ie, it would be equivalent to: static inline void mmiowb_spin_lock(void) { ms->mmiowb_pending = 0; } Which means that: spin_lock(a); writel(x, y); spin_lock(b); ... spin_unlock(b); spin_unlock(a); Does no barrier. cheers