On 10/11/22 09:16, Mukesh Ojha wrote:
Hi @Hilf,
Thanks for looking into this issue.
On 10/11/2022 4:16 PM, Hillf Danton wrote:
On 10/10/22 06:24 Mukesh Ojha <quic_mojha@xxxxxxxxxxx>
Hi Waiman,
On 9/29/2022 11:36 PM, Waiman Long wrote:
On 9/29/22 14:04, Waiman Long wrote:
A non-first waiter can potentially spin in the for loop of
rwsem_down_write_slowpath() without sleeping but fail to acquire the
lock even if the rwsem is free if the following sequence happens:
Non-first waiter First waiter Lock holder
---------------- ------------ -----------
Acquire wait_lock
rwsem_try_write_lock():
Set handoff bit if RT or
wait too long
Set waiter->handoff_set
Release wait_lock
Acquire wait_lock
Inherit waiter->handoff_set
Release wait_lock
Clear owner
Release lock
if (waiter.handoff_set) {
rwsem_spin_on_owner(();
if (OWNER_NULL)
goto trylock_again;
}
trylock_again:
Acquire wait_lock
rwsem_try_write_lock():
if (first->handoff_set && (waiter != first))
return false;
Release wait_lock
It is especially problematic if the non-first waiter is an RT task
and
it is running on the same CPU as the first waiter as this can lead to
live lock.
Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
consistent")
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
---
kernel/locking/rwsem.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
Mukesh, can you test if this patch can fix the RT task lockup problem?
Looks like, There is still a window for a race.
There is a chance when a reader who came first added it's BIAS and
goes to slowpath and before it gets added to wait list it got
preempted by RT task which goes to slowpath as well and being the
first waiter gets its hand-off bit set and not able to get the lock
due to following condition in rwsem_try_write_lock()
[]
630 if (count & RWSEM_LOCK_MASK) { ==> reader has
sets its bias
..
...
634
635 new |= RWSEM_FLAG_HANDOFF;
636 } else {
637 new |= RWSEM_WRITER_LOCKED;
---------------------->----------------------->-------------------------
First reader (1) writer(2) RT task Lock holder(3)
It sets
RWSEM_READER_BIAS.
while it is going to
slowpath(as the lock
was held by (3)) and
before it got added
to the waiters list
it got preempted
by (2).
RT task also takes
the slowpath and add release the
itself into waiting list rwsem lock
and since it is the first clear the
it is the next one to get owner.
the lock but it can not
get the lock as (count &
RWSEM_LOCK_MASK) is set
as (1) has added it but
not able to remove its
adjustment.
[]
Hey Mukesh,
Can you test the diff if it makes sense to you?
It simply prevents the first waiter from spinning any longer after
detecting
it barely makes any progress to spin without lock owner.
Hillf
--- mainline/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -611,26 +611,15 @@ static inline bool rwsem_try_write_lock(
long count, new;
lockdep_assert_held(&sem->wait_lock);
+ waiter->handoff_set = false;
count = atomic_long_read(&sem->count);
do {
bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
if (has_handoff) {
- /*
- * Honor handoff bit and yield only when the first
- * waiter is the one that set it. Otherwisee, we
- * still try to acquire the rwsem.
- */
- if (first->handoff_set && (waiter != first))
+ if (waiter != first)
return false;
you mean, you want to check and change waiter->handoff_set on every
run rwsem_try_write_lock().
But does it break optimistic spinning ? @waiman ?
As I said in my previous mail, this is equivalent to allow only one
optimistic spinning attempt after setting the handoff bit. That will
likely reduce the performance benefit provided by this feature.
Cheers,
Longman