Re: [PATCH] locking/rwlocks: do not starve writers

On Fri, 17 Jun 2022 14:57:55 -0400 Waiman Long wrote:
>On 6/17/22 13:45, Eric Dumazet wrote:
>> On Fri, Jun 17, 2022 at 7:42 PM Waiman Long <longman@xxxxxxxxxx> wrote:
>>> On 6/17/22 11:24, Eric Dumazet wrote:
>>>> On Fri, Jun 17, 2022 at 5:00 PM Waiman Long <longman@xxxxxxxxxx> wrote:
>>>>> On 6/17/22 10:57, Shakeel Butt wrote:
>>>>>> On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@xxxxxxxxxx> wrote:
>>>>>>> On 6/17/22 08:07, Peter Zijlstra wrote:
>>>>>>>> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
>>>>>>>>> --- a/kernel/locking/qrwlock.c
>>>>>>>>> +++ b/kernel/locking/qrwlock.c
>>>>>>>>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
>>>>>>>>>         /*
>>>>>>>>>          * Readers come here when they cannot get the lock without waiting
>>>>>>>>>          */
>>>>>>>>> -    if (unlikely(in_interrupt())) {
>>>>>>>>> -            /*
>>>>>>>>> -             * Readers in interrupt context will get the lock immediately
>>>>>>>>> -             * if the writer is just waiting (not holding the lock yet),
>>>>>>>>> -             * so spin with ACQUIRE semantics until the lock is available
>>>>>>>>> -             * without waiting in the queue.
>>>>>>>>> -             */
>>>>>>>>> -            atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
>>>>>>>>> -            return;
>>>>>>>>> -    }
>>>>>>>>>         atomic_sub(_QR_BIAS, &lock->cnts);
>>>>>>>>>
>>>>>>>>>         trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
>>>>>>>> This is known to break tasklist_lock.
>>>>>>>>
>>>>>>> We certainly can't break the current usage of tasklist_lock.
>>>>>>>
>>>>>>> I am aware of this problem with the networking code and am thinking
>>>>>>> about either relaxing the check to exclude softirq or providing a
>>>>>>> read_lock_unfair() variant for networking use.
>>>>>> read_lock_unfair() for networking use or tasklist_lock use?
>>>>> I meant to say read_lock_fair(), but it could also be the other way
>>>>> around. Thanks for spotting that.
>>>>>
>>>> If only tasklist_lock is problematic and needs the unfair variant,
>>>> then changing the few read_lock() calls on tasklist_lock will be less
>>>> invasive than the ~1000 read_lock() calls elsewhere....
>>> On second thought, I think the right way is to introduce a fair
>>> variant, if needed. If an arch isn't using qrwlock, the native rwlock
>>> implementation will be unfair. In that sense, unfair rwlock is the
>>> default. We will only need to change the relevant networking read_lock()
>>> calls to use the fair variant, which will still be unfair if qrwlock
>>> isn't used. We are not going to touch other read_lock() calls that don't
>>> care about fairness.
>>>
>> Hmm... backporting this kind of invasive change to stable kernels will
>> be a daunting task.
>>
>> Were rwlocks always unfair, and have we just been lucky?
>>
> Yes, rwlocks were always unfair, and they always had this kind of soft
> lockup problem, as well as a scalability problem because of cacheline
> bouncing. That was the reason for creating qrwlock, which can at least
> provide a fair rwlock in task context. Now we have systems with more and
> more CPUs, and that is why you are seeing it all over again in the
> networking code.
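
For illustration only, the fair read_lock() variant Waiman suggests above
could boil down to a reader slow path that never takes the
interrupt-context fast path, so readers always queue behind a pending
writer. The sketch below is hypothetical - queued_read_lock_fair_slowpath()
is not an existing kernel function - and it simply mirrors the ordinary
queued_read_lock_slowpath() with the in_interrupt() branch dropped:

void queued_read_lock_fair_slowpath(struct qrwlock *lock)
{
	/* Undo the speculative reader count taken in the fast path */
	atomic_sub(_QR_BIAS, &lock->cnts);

	/* Join the queue of waiting lockers */
	arch_spin_lock(&lock->wait_lock);
	atomic_add(_QR_BIAS, &lock->cnts);

	/*
	 * Wait until no writer holds the lock; the read lock is then
	 * ours with ACQUIRE semantics.
	 */
	atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));

	/* Signal the next one in queue to become queue head */
	arch_spin_unlock(&lock->wait_lock);
}

On architectures that do not use qrwlock, such a variant would simply map
to the native (unfair) read_lock().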

No fair play without paying the price.

If the writer wants to play a fair game with the readers, it has to wait
in the queue for at least a tick before it forces readers to take the
slow path by setting _QW_WAITING.

Only for thoughts at this point.

Hillf

+++ b/kernel/locking/qrwlock.c
@@ -22,16 +22,20 @@ void queued_read_lock_slowpath(struct qr
 	/*
 	 * Readers come here when they cannot get the lock without waiting
 	 */
+	if (_QW_WAITING & atomic_read(&lock->cnts))
+		goto queue;
+
 	if (unlikely(in_interrupt())) {
 		/*
 		 * Readers in interrupt context will get the lock immediately
-		 * if the writer is just waiting (not holding the lock yet),
+		 * if the writer has not yet waited long enough,
 		 * so spin with ACQUIRE semantics until the lock is available
 		 * without waiting in the queue.
 		 */
 		atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
 		return;
 	}
+queue:
 	atomic_sub(_QR_BIAS, &lock->cnts);
 
 	/*
@@ -60,6 +64,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath)
  */
 void queued_write_lock_slowpath(struct qrwlock *lock)
 {
+	unsigned long start;
+	int qw;
 	int cnts;
 
 	/* Put the writer into the wait queue */
@@ -70,12 +76,20 @@ void queued_write_lock_slowpath(struct q
 	    atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED))
 		goto unlock;
 
-	/* Set the waiting flag to notify readers that a writer is pending */
-	atomic_or(_QW_WAITING, &lock->cnts);
-
+	start = jiffies;
+	qw = 0;
 	/* When no more readers or writers, set the locked flag */
 	do {
-		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+		if (qw == 0 && start + 2 < jiffies) {
+			qw = _QW_WAITING;
+			/*
+			 * Set the waiting flag to notify readers that a writer is
+			 * pending only after waiting long enough - that is the
+			 * price the writer pays for fairness.
+			 */
+			atomic_or(_QW_WAITING, &lock->cnts);
+		}
+		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == qw);
 	} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
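
One nit with the sketch above: the open-coded "start + 2 < jiffies" test
is not wraparound-safe. The usual idiom is time_after() from
<linux/jiffies.h>, so the timeout check could instead read something like
this (sketch only, the rest of the hunk unchanged):

		/*
		 * Wraparound-safe: the writer has been queued for more
		 * than two ticks without announcing itself to readers.
		 */
		if (qw == 0 && time_after(jiffies, start + 2)) {
			qw = _QW_WAITING;
			atomic_or(_QW_WAITING, &lock->cnts);
		}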



