Re: [PATCH] blk-wbt: get back the missed wakeup from __wbt_done

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/23/18 8:37 AM, Jens Axboe wrote:
> On 8/23/18 7:08 AM, Jianchao Wang wrote:
>> 2887e41 (blk-wbt: Avoid lock contention and thundering herd
>> issue in wbt_wait) introduces two cases that could miss wakeup:
>>  - __wbt_done only wakes up one waiter one time. There could be
>>    multiple waiters and (limit - inflight) > 1 at the moment.
>>
>>  - When the waiter is waked up, it is still on wait queue and set
>>    to TASK_UNINTERRUPTIBLE immediately, so this waiter could be
>>    waked up one more time. If a __wbt_done comes and wakes up
>>    again, the prevous waiter may waste a wakeup.
>>
>> To fix them and avoid to introduce too much lock contention, we
>> introduce our own wake up func wbt_wake_function in __wbt_wait and
>> use wake_up_all in __wbt_done. wbt_wake_function will try to get
>> wbt budget firstly, if sucesses, wake up the process, otherwise,
>> return -1 to interrupt the wake up loop.
> I really like this approach, since it'll naturally wake up as many
> as we can instead of either being single wakeups, or wake all.
>
> BTW, you added Anchal and Frank to the CC in the patch, but they 
> are not actually CC'ed. Doing that now - can you guys give this
> a whirl? Should still solve the thundering herd issue, but be
> closer to the original logic in terms of wakeup. Actually better,
> since we remain on list and remain ordered.
>
> Leaving the patch below for you guys.
>
>> Signed-off-by: Jianchao Wang <jianchao.w.wang@xxxxxxxxxx>
>> Fixes: 2887e41 (blk-wbt: Avoid lock contention and thundering herd issue in wbt_wait)
>> Cc: Anchal Agarwal <anchalag@xxxxxxxxxx>
>> Cc: Frank van der Linden <fllinden@xxxxxxxxxx>
>> ---
>>  block/blk-wbt.c | 78 +++++++++++++++++++++++++++++++++++++++------------------
>>  1 file changed, 54 insertions(+), 24 deletions(-)
>>
>> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
>> index c9358f1..2667590 100644
>> --- a/block/blk-wbt.c
>> +++ b/block/blk-wbt.c
>> @@ -166,7 +166,7 @@ static void __wbt_done(struct rq_qos *rqos, enum wbt_flags wb_acct)
>>  		int diff = limit - inflight;
>>  
>>  		if (!inflight || diff >= rwb->wb_background / 2)
>> -			wake_up(&rqw->wait);
>> +			wake_up_all(&rqw->wait);
>>  	}
>>  }
>>  
>> @@ -481,6 +481,40 @@ static inline unsigned int get_limit(struct rq_wb *rwb, unsigned long rw)
>>  	return limit;
>>  }
>>  
>> +struct wbt_wait_data {
>> +	struct task_struct *curr;
>> +	struct rq_wb *rwb;
>> +	struct rq_wait *rqw;
>> +	unsigned long rw;
>> +};
>> +
>> +static int wbt_wake_function(wait_queue_entry_t *curr, unsigned int mode,
>> +		int wake_flags, void *key)
>> +{
>> +	struct wbt_wait_data *data = curr->private;
>> +
>> +	/*
>> +	 * If fail to get budget, return -1 to interrupt the wake up
>> +	 * loop in __wake_up_common.
>> +	 */
>> +	if (!rq_wait_inc_below(data->rqw, get_limit(data->rwb, data->rw)))
>> +		return -1;
>> +
>> +	wake_up_process(data->curr);
>> +
>> +	list_del_init(&curr->entry);
>> +	return 1;
>> +}
>> +
>> +static inline void wbt_init_wait(struct wait_queue_entry *wait,
>> +		struct wbt_wait_data *data)
>> +{
>> +	INIT_LIST_HEAD(&wait->entry);
>> +	wait->flags = 0;
>> +	wait->func = wbt_wake_function;
>> +	wait->private = data;
>> +}
>> +
>>  /*
>>   * Block if we will exceed our limit, or if we are currently waiting for
>>   * the timer to kick off queuing again.
>> @@ -491,31 +525,27 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
>>  	__acquires(lock)
>>  {
>>  	struct rq_wait *rqw = get_rq_wait(rwb, wb_acct);
>> -	DECLARE_WAITQUEUE(wait, current);
>> -	bool has_sleeper;
>> -
>> -	has_sleeper = wq_has_sleeper(&rqw->wait);
>> -	if (!has_sleeper && rq_wait_inc_below(rqw, get_limit(rwb, rw)))
>> +	struct wait_queue_entry wait;
>> +	struct wbt_wait_data data = {
>> +		.curr = current,
>> +		.rwb = rwb,
>> +		.rqw = rqw,
>> +		.rw = rw,
>> +	};
>> +
>> +	if (!wq_has_sleeper(&rqw->wait) &&
>> +			rq_wait_inc_below(rqw, get_limit(rwb, rw)))
>>  		return;
>>  
>> -	add_wait_queue_exclusive(&rqw->wait, &wait);
>> -	do {
>> -		set_current_state(TASK_UNINTERRUPTIBLE);
>> -
>> -		if (!has_sleeper && rq_wait_inc_below(rqw, get_limit(rwb, rw)))
>> -			break;
>> -
>> -		if (lock) {
>> -			spin_unlock_irq(lock);
>> -			io_schedule();
>> -			spin_lock_irq(lock);
>> -		} else
>> -			io_schedule();
>> -		has_sleeper = false;
>> -	} while (1);
>> -
>> -	__set_current_state(TASK_RUNNING);
>> -	remove_wait_queue(&rqw->wait, &wait);
>> +	wbt_init_wait(&wait, &data);
>> +	prepare_to_wait_exclusive(&rqw->wait, &wait,
>> +			TASK_UNINTERRUPTIBLE);
>> +	if (lock) {
>> +		spin_unlock_irq(lock);
>> +		io_schedule();
>> +		spin_lock_irq(lock);
>> +	} else
>> +		io_schedule();
>>  }
>>  
>>  static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)
>>
>
I like it, this combines some of the ideas outlined in our previous
thread in to a good solution.

One comment, though: I think we need to set the task state back to
TASK_RUNNING after being woken up, right?

Frank





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux