On Thu, Sep 07, 2023 at 07:47:31PM -0400, Waiman Long wrote:
>
> On 9/7/23 17:06, Waiman Long wrote:
> >
> > On 9/7/23 15:33, Matthew Wilcox wrote:
> > > On Thu, Sep 07, 2023 at 02:05:54PM -0400, Waiman Long wrote:
> > > > On 9/7/23 13:47, Matthew Wilcox (Oracle) wrote:
> > > > > +static inline int rwsem_is_write_locked(struct rw_semaphore *sem)
> > > > > +{
> > > > > +	return atomic_long_read(&sem->count) & 1 /* RWSEM_WRITER_LOCKED */;
> > > > > +}
> > > > I would prefer you move the various RWSEM_* count bit macros from
> > > > kernel/locking/rwsem.c to under the !PREEMPT_RT block and directly use
> > > > RWSEM_WRITER_LOCKED instead of hardcoding a value of 1.
> > > Just to be clear, you want the ~50 lines from:
> > >
> > > /*
> > >  * On 64-bit architectures, the bit definitions of the count are:
> > > ...
> > > #define RWSEM_READ_FAILED_MASK (RWSEM_WRITER_MASK|RWSEM_FLAG_WAITERS|\
> > >                                 RWSEM_FLAG_HANDOFF|RWSEM_FLAG_READFAIL)
> > >
> > > moved from rwsem.c to rwsem.h?
> > >
> > > Or just these four lines:
> > >
> > > #define RWSEM_WRITER_LOCKED (1UL << 0)
> > > #define RWSEM_FLAG_WAITERS (1UL << 1)
> > > #define RWSEM_FLAG_HANDOFF (1UL << 2)
> > > #define RWSEM_FLAG_READFAIL (1UL << (BITS_PER_LONG - 1))
> >
> > I think just the first 3 lines will be enough. Maybe a bit of comment
> > about these bit flags in the count atomic_long value.
>
> Actually, the old rwsem implementation won't allow you to reliably determine
> if a rwsem is write locked because the xadd instruction is used for write
> locking and the code had to back out the WRITER_BIAS if the attempt failed.
> Maybe that is why XFS has its own code to check if a rwsem is write locked
> which is needed with the old rwsem implementation.

mrlocks pre-date rwsems entirely on Linux. mrlocks were introduced
to XFS as part of the port from Irix back in 2000. This originally
had a 'ismrlocked()' function for checking lock state.
In 2003, this was expanded to allow explicit lock type checks via
'mrislocked_access()' and 'mrislocked_update()' wrappers that checked
internal counters to determine how it was locked.

In 2004, the mrlock was converted to use the generic kernel rwsem
implementation, and because that couldn't be used to track writers,
the mrlock included a mr_writer boolean field to indicate it was
write locked for the purpose of implementing the existing debug
checks.

Hence the mrlock debug code has always had reliable differentiation
of read vs write state, whereas we couldn't do that natively with
rwsems for a real long time.

The mrlocks have essentially remained unchanged since 2004 - this
long predates lockdep, and it lives on because it gives us something
lockdep doesn't: zero overhead locking validation.

> The new implementation makes this check reliable. Still it is not easy to
> check if a rwsem is read locked as the check will be rather complicated and
> probably racy.

You can't look at these locking checks in isolation. These checks
are done in code paths where we expect the caller to have already
locked the rwsem in the manner required. Hence there should be no
races with rwsem state changes at all.

If we see a locking assert fire, it means we either screwed up the
XFS locking completely (no races necessary), or there's a bug in the
rwsem implementation. The latter case has occurred several times;
the rwsem locking checks in XFS have uncovered more than one rwsem
implementation bug in the past...

IOWs, the explicit lock state checks and asserts provide a lock
implementation validation mechanism that lockdep doesn't....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
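
For anyone following the thread without the kernel source open, here is a rough userspace model of the check being discussed. It is only a sketch, not the kernel code: the `sketch_*` names and `struct rwsem_sketch` are hypothetical, the three flag bits copy the values quoted above, and the reader-count position is an assumption made for illustration. The writer takes the lock with a compare-and-exchange, so the writer bit is never set speculatively and then backed out, which is what lets a holder test it reliably.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace model only. The three flag bits mirror the values quoted in
 * the thread; the reader-count position (bit 8 and up) is an assumption
 * made for this sketch.
 */
#define SKETCH_WRITER_LOCKED	(1UL << 0)
#define SKETCH_FLAG_WAITERS	(1UL << 1)
#define SKETCH_FLAG_HANDOFF	(1UL << 2)
#define SKETCH_READER_BIAS	(1UL << 8)

struct rwsem_sketch {
	atomic_ulong count;
};

static inline void sketch_init(struct rwsem_sketch *sem)
{
	atomic_init(&sem->count, 0);
}

/* Writer succeeds only if nobody holds the lock and no flags are set. */
static inline bool sketch_write_trylock(struct rwsem_sketch *sem)
{
	unsigned long expected = 0;

	return atomic_compare_exchange_strong(&sem->count, &expected,
					      SKETCH_WRITER_LOCKED);
}

static inline void sketch_write_unlock(struct rwsem_sketch *sem)
{
	atomic_fetch_and(&sem->count, ~SKETCH_WRITER_LOCKED);
}

/* Reader adds its bias only while the writer bit is clear. */
static inline bool sketch_read_trylock(struct rwsem_sketch *sem)
{
	unsigned long old = atomic_load(&sem->count);

	do {
		if (old & SKETCH_WRITER_LOCKED)
			return false;
	} while (!atomic_compare_exchange_weak(&sem->count, &old,
					       old + SKETCH_READER_BIAS));
	return true;
}

static inline void sketch_read_unlock(struct rwsem_sketch *sem)
{
	atomic_fetch_sub(&sem->count, SKETCH_READER_BIAS);
}

/* Meaningful only when the caller already holds the lock in some mode. */
static inline bool sketch_is_write_locked(struct rwsem_sketch *sem)
{
	return atomic_load(&sem->count) & SKETCH_WRITER_LOCKED;
}

/* Debug-style check in the spirit of the XFS lock state asserts. */
static inline void sketch_assert_write_locked(struct rwsem_sketch *sem)
{
	assert(sketch_is_write_locked(sem));
}
```

A caller that expects to hold the lock exclusively would call `sketch_assert_write_locked()` at the top of the function. Because the caller is supposed to hold the lock already, the state cannot change underneath the check, so the assert firing points at a locking bug rather than a race.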