On Wed, Dec 06, 2023 at 11:16:50PM -0500, Kent Overstreet wrote: > On Wed, Dec 06, 2023 at 05:05:39PM +1100, Dave Chinner wrote: > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > hash-bl nests spinlocks inside the bit locks. This causes problems > > for CONFIG_PREEMPT_RT which converts spin locks to sleeping locks, > > and we're not allowed to sleep while holding a spinning lock. > > > > Further, lockdep does not support bit locks, so we lose lockdep > > coverage of the inode hash table with the hash-bl conversion. > > > > To enable these configs to work, add an external per-chain spinlock > > to the hlist_bl_head() and add helpers to use this instead of the > > bit spinlock when preempt_rt or lockdep are enabled. > > > > This converts all users of hlist-bl to use the external spinlock in > > these situations, so we also gain lockdep coverage of things like > > the dentry cache hash table with this change. > > > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > > Sleepable bit locks can be done with wait_on_bit(), is that worth > considering for PREEMPT_RT? Or are the other features of real locks > important there? I think wait_on_bit() is not scalable. It hashes down to one of 256 shared struct wait_queue_heads which have thundering herd behaviours, and it requires the locker to always run prepare_to_wait() and finish_wait(). This means there is at least one spinlock_irqsave()/unlock pair needed, sometimes two, just to get an uncontended sleeping bit lock. So as a fast path operation that requires lock scalability, it's going to be better to use a straight spinlock that doesn't require irq safety as it's far less expensive than a sleeping bit lock. Whether CONFIG_PREEMPT_RT changes that equation at all is not at all clear to me, and so I'll leave that consideration to RT people if they see a need to address it. In the mean time, we need to use an external spinlock for lockdep validation so it really doesn't make any sense at all to add a third locking variant with completely different semantics just for PREEMPT_RT... -Dave. -- Dave Chinner david@xxxxxxxxxxxxx