On Tue 30-07-19 13:24:56, Thomas Gleixner wrote:
> Bit spinlocks are problematic if PREEMPT_RT is enabled. They disable
> preemption, which is undesired for latency reasons and breaks when regular
> spinlocks are taken within the bit_spinlock locked region because regular
> spinlocks are converted to 'sleeping spinlocks' on RT.
>
> Substitute the BH_State and BH_JournalHead bit spinlocks with regular
> spinlock for PREEMPT_RT enabled kernels.

Is there a real need for substitution for the BH_JournalHead bit spinlock?
The critical sections are pretty tiny, all located within fs/jbd2/journal.c.
Maybe only the one around __journal_remove_journal_head() would need a bit
of refactoring so that journal_free_journal_head() doesn't get called under
the bit spinlock.

The BH_State lock is definitely worth it. In fact, if you placed the
spinlock inside struct journal_head (which is the structure whose members
it in fact protects), I'd even be fine with always using the spinlock
instead of the bit spinlock. journal_head is pretty big anyway (and there's
even a 4-byte hole in it on 64-bit archs), and these structures are pretty
rare (only for actively changed metadata buffers).

								Honza

> Bit spinlocks are also not covered by lock debugging, e.g. lockdep. With
> the spinlock substitution in place, they can be exposed via
> CONFIG_DEBUG_BIT_SPINLOCKS.
>
> Originally-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: linux-ext4@xxxxxxxxxxxxxxx
> Cc: "Theodore Ts'o" <tytso@xxxxxxx>
> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxxx>
> --
>  include/linux/buffer_head.h |    8 ++++++++
>  include/linux/jbd2.h        |   36 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 44 insertions(+)
>
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -79,6 +79,10 @@ struct buffer_head {
>
>  #if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_DEBUG_BIT_SPINLOCKS)
>  	spinlock_t		b_uptodate_lock;
> +# if IS_ENABLED(CONFIG_JBD2)
> +	spinlock_t		b_state_lock;
> +	spinlock_t		b_journal_head_lock;
> +# endif
>  #endif
>  };
>
> @@ -101,6 +105,10 @@ bh_uptodate_unlock_irqrestore(struct buf
>  static inline void buffer_head_init_locks(struct buffer_head *bh)
>  {
>  	spin_lock_init(&bh->b_uptodate_lock);
> +#if IS_ENABLED(CONFIG_JBD2)
> +	spin_lock_init(&bh->b_state_lock);
> +	spin_lock_init(&bh->b_journal_head_lock);
> +#endif
>  }
>
>  #else /* PREEMPT_RT || DEBUG_BIT_SPINLOCKS */
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -342,6 +342,40 @@ static inline struct journal_head *bh2jh
>  	return bh->b_private;
>  }
>
> +#if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_DEBUG_BIT_SPINLOCKS)
> +
> +static inline void jbd_lock_bh_state(struct buffer_head *bh)
> +{
> +	spin_lock(&bh->b_state_lock);
> +}
> +
> +static inline int jbd_trylock_bh_state(struct buffer_head *bh)
> +{
> +	return spin_trylock(&bh->b_state_lock);
> +}
> +
> +static inline int jbd_is_locked_bh_state(struct buffer_head *bh)
> +{
> +	return spin_is_locked(&bh->b_state_lock);
> +}
> +
> +static inline void jbd_unlock_bh_state(struct buffer_head *bh)
> +{
> +	spin_unlock(&bh->b_state_lock);
> +}
> +
> +static inline void jbd_lock_bh_journal_head(struct buffer_head *bh)
> +{
> +	spin_lock(&bh->b_journal_head_lock);
> +}
> +
> +static inline void jbd_unlock_bh_journal_head(struct buffer_head *bh)
> +{
> +	spin_unlock(&bh->b_journal_head_lock);
> +}
> +
> +#else /* PREEMPT_RT || DEBUG_BIT_SPINLOCKS */
> +
>  static inline void jbd_lock_bh_state(struct buffer_head *bh)
>  {
>  	bit_spin_lock(BH_State, &bh->b_state);
> @@ -372,6 +406,8 @@ static inline void jbd_unlock_bh_journal
>  	bit_spin_unlock(BH_JournalHead, &bh->b_state);
>  }
>
> +#endif /* !PREEMPT_RT && !DEBUG_BIT_SPINLOCKS */
> +
>  #define J_ASSERT(assert)	BUG_ON(!(assert))
>
>  #define J_ASSERT_BH(bh, expr)	J_ASSERT(expr)
>

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR