On 13 Oct 2022 21:15:05 +0800 Hillf Danton <hdanton@xxxxxxxx>
> On 13 Oct 2022 12:02:09 +0200 Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > On Wed, Oct 12, 2022 at 09:33:32AM -0400, Waiman Long wrote:
> > > It is especially problematic if the non-first waiter is an RT task and
> > > it is running on the same CPU as the first waiter as this can lead to
> > > live lock.
> >
> > So why not do a better handoff? Specifically, have the owner set owner
> > to first-waiter instead of NULL ? (like the normal mutex code)
>
> Given a simple coding of "better handoff", with care to avoid change added
> to fast path, I see no bonus except for preventing non-first waiters from
> spinning.

Here is a spin with a line in the fast path cut.

Prevent non-first waiters from spinning too much by 1) not clearing the
lock owner before releasing the rwsem, because the new acquirer will set
it, and 2) setting the owner to the first waiter while waking lock waiters
up, in a bid to force non-first spinners to take a break.

Note, the first waiter spinning longer than expected still makes trouble,
but that is not addressed here because a cure is known.

Hillf

+++ b/kernel/locking/rwsem.c
@@ -429,6 +429,7 @@ static void rwsem_mark_wake(struct rw_se

 		if (waiter->type == RWSEM_WAITING_FOR_WRITE) {
 			if (wake_type == RWSEM_WAKE_ANY) {
+				atomic_long_set(&sem->owner, (long)waiter->task);
 				/*
 				 * Mark writer at the front of the queue for wakeup.
 				 * Until the task is actually later awoken later by
@@ -752,7 +753,7 @@ rwsem_owner_state(struct task_struct *ow
 static noinline enum owner_state
 rwsem_spin_on_owner(struct rw_semaphore *sem)
 {
-	struct task_struct *new, *owner;
+	struct task_struct *new, *owner, *me = current;
 	unsigned long flags, new_flags;
 	enum owner_state state;

@@ -762,6 +763,8 @@ rwsem_spin_on_owner(struct rw_semaphore
 	state = rwsem_owner_state(owner, flags);
 	if (state != OWNER_WRITER)
 		return state;
+	if (owner == me)
+		return OWNER_NULL;

 	for (;;) {
 		/*
@@ -772,7 +775,10 @@ rwsem_spin_on_owner(struct rw_semaphore
 		 */
 		new = rwsem_owner_flags(sem, &new_flags);
 		if ((new != owner) || (new_flags != flags)) {
-			state = rwsem_owner_state(new, new_flags);
+			if (new == me)
+				state = OWNER_NULL;
+			else
+				state = rwsem_owner_state(new, new_flags);
 			break;
 		}

@@ -1360,10 +1366,7 @@ static inline void __up_write(struct rw_
 	DEBUG_RWSEMS_WARN_ON((rwsem_owner(sem) != current) &&
 			    !rwsem_test_oflags(sem, RWSEM_NONSPINNABLE), sem);
-	preempt_disable();
-	rwsem_clear_owner(sem);
 	tmp = atomic_long_fetch_add_release(-RWSEM_WRITER_LOCKED, &sem->count);
-	preempt_enable();
 	if (unlikely(tmp & RWSEM_FLAG_WAITERS))
 		rwsem_wake(sem);
 }