Will Deacon's on March 2, 2019 12:03 am:
> In preparation for removing all explicit mmiowb() calls from driver
> code, implement a tracking system in asm-generic based loosely on the
> PowerPC implementation. This allows architectures with a non-empty
> mmiowb() definition to have the barrier automatically inserted in
> spin_unlock() following a critical section containing an I/O write.

Is there a reason to call this "mmiowb"? We already have wmb that
orders cacheable stores vs mmio stores don't we?

Yes ia64 "sn2" is broken in that case, but that can be fixed (if
anyone really cares about the platform any more). Maybe that's
orthogonal to what you're doing here, I just don't like seeing
"mmiowb" spread.

This series works for spin locks, but you would want a driver to
be able to use wmb() to order locks vs mmio when using a bit lock
or a mutex or whatever else. Calling your wmb-if-io-is-pending
version io_mb_before_unlock() would kind of match with existing
patterns.

> +static inline void mmiowb_set_pending(void)
> +{
> +	struct mmiowb_state *ms = __mmiowb_state();
> +	ms->mmiowb_pending = ms->nesting_count;
> +}
> +
> +static inline void mmiowb_spin_lock(void)
> +{
> +	struct mmiowb_state *ms = __mmiowb_state();
> +	ms->nesting_count++;
> +}
> +
> +static inline void mmiowb_spin_unlock(void)
> +{
> +	struct mmiowb_state *ms = __mmiowb_state();
> +
> +	if (unlikely(ms->mmiowb_pending)) {
> +		ms->mmiowb_pending = 0;
> +		mmiowb();
> +	}
> +
> +	ms->nesting_count--;
> +}

Humour me for a minute and tell me what this algorithm is doing, or
what was broken about the powerpc one, which is basically:

static inline void mmiowb_set_pending(void)
{
	struct mmiowb_state *ms = __mmiowb_state();
	ms->mmiowb_pending = 1;
}

static inline void mmiowb_spin_lock(void)
{
}

static inline void mmiowb_spin_unlock(void)
{
	struct mmiowb_state *ms = __mmiowb_state();

	if (unlikely(ms->mmiowb_pending)) {
		ms->mmiowb_pending = 0;
		mmiowb();
	}
}

> diff --git a/include/asm-generic/mmiowb_types.h b/include/asm-generic/mmiowb_types.h
> new file mode 100644
> index 000000000000..8eb0095655e7
> --- /dev/null
> +++ b/include/asm-generic/mmiowb_types.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_GENERIC_MMIOWB_TYPES_H
> +#define __ASM_GENERIC_MMIOWB_TYPES_H
> +
> +#include <linux/types.h>
> +
> +struct mmiowb_state {
> +	u16 nesting_count;
> +	u16 mmiowb_pending;
> +};

Really need more than 255 nested spin locks? I had the idea that 16
bit operations were a bit more costly than 8 bit on some CPUs... may
not be true, but at least the smaller size packs a bit better on
powerpc.

Thanks,
Nick
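
For anyone comparing the two versions quoted above, here is a rough
userspace model of both. It is only a sketch: struct model, the
model_*() helpers and the barriers counter are made up for illustration
and are not kernel code. It models a single CPU with no real locks or
MMIO, and a counter stands in for mmiowb(). Going by the quoted code
alone, the one difference it shows is that an I/O write issued outside
any critical section leaves mmiowb_pending at 0 in the nesting-count
version, while the flag version sets it, so the next unrelated unlock
pays for a barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same layout as the quoted struct, with uint16_t standing in for u16. */
struct mmiowb_state {
	uint16_t nesting_count;
	uint16_t mmiowb_pending;
};

/* Hypothetical single-CPU model; not part of the patch. */
struct model {
	struct mmiowb_state ms;
	bool track_nesting;	/* true: quoted patch, false: powerpc-style flag */
	unsigned int barriers;	/* stands in for calls to mmiowb() */
};

/* Models mmiowb_set_pending(), called after an I/O write. */
static void model_io_write(struct model *m)
{
	m->ms.mmiowb_pending = m->track_nesting ? m->ms.nesting_count : 1;
}

/* Models mmiowb_spin_lock(). */
static void model_spin_lock(struct model *m)
{
	if (m->track_nesting)
		m->ms.nesting_count++;
}

/* Models mmiowb_spin_unlock(). */
static void model_spin_unlock(struct model *m)
{
	if (m->ms.mmiowb_pending) {
		m->ms.mmiowb_pending = 0;
		m->barriers++;
	}

	if (m->track_nesting) {
		assert(m->ms.nesting_count > 0);
		m->ms.nesting_count--;
	}
}

/* I/O write outside any lock, then an unrelated critical section. */
static unsigned int barriers_after_unlocked_write(bool track_nesting)
{
	struct model m = { .track_nesting = track_nesting };

	model_io_write(&m);
	model_spin_lock(&m);
	model_spin_unlock(&m);
	return m.barriers;
}

/* I/O write inside a critical section. */
static unsigned int barriers_after_locked_write(bool track_nesting)
{
	struct model m = { .track_nesting = track_nesting };

	model_spin_lock(&m);
	model_io_write(&m);
	model_spin_unlock(&m);
	return m.barriers;
}
```

In this model, both versions issue one barrier when the write happens
under the lock. Only barriers_after_unlocked_write() separates them. It
says nothing about the cost of maintaining the count on real hardware.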