Re: [PATCH v2] bcache: fix set_at_max_writeback_rate() for multiple attached devices

Coly Li <colyli@xxxxxxx> · Mon, 19 Sep 2022 12:38:26 +0800

> 2022年9月19日 11:29，mingzhe <mingzhe.zou@xxxxxxxxxxxx> 写道：
> 
> 
> 
> 在 2022/9/18 20:16, Coly Li 写道:
>> Inside set_at_max_writeback_rate() the calculation in following if()
>> check is wrong,
>> 	if (atomic_inc_return(&c->idle_counter) <
>> 	    atomic_read(&c->attached_dev_nr) * 6)
>> Because each attached backing device has its own writeback thread
>> running and increasing c->idle_counter, the counter increates much
>> faster than expected. The correct calculation should be,
>> 	(counter / dev_nr) < dev_nr * 6
>> which equals to,
>> 	counter < dev_nr * dev_nr * 6
>> This patch fixes the above mistake with correct calculation, and helper
>> routine idle_counter_exceeded() is added to make code be more clear.
>> Reported-by: Mingzhe Zou <mingzhe.zou@xxxxxxxxxxxx>
>> Signed-off-by: Coly Li <colyli@xxxxxxx>
>> ---
>> Changelog:
>> v2: Add the missing "!atomic_read(&c->at_max_writeback_rate)" part
>>     back.
>> v1: Original verison.
>>  drivers/md/bcache/writeback.c | 73 +++++++++++++++++++++++++----------
>>  1 file changed, 52 insertions(+), 21 deletions(-)
>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>> index 647661005176..c186bf55fe61 100644
>> --- a/drivers/md/bcache/writeback.c
>> +++ b/drivers/md/bcache/writeback.c
>> @@ -157,6 +157,53 @@ static void __update_writeback_rate(struct cached_dev *dc)
>>  	dc->writeback_rate_target = target;
>>  }
>>  +static bool idle_counter_exceeded(struct cache_set *c)
>> +{
>> +	int counter, dev_nr;
>> +
>> +	/*
>> +	 * If c->idle_counter is overflow (idel for really long time),
>> +	 * reset as 0 and not set maximum rate this time for code
>> +	 * simplicity.
>> +	 */
>> +	counter = atomic_inc_return(&c->idle_counter);
>> +	if (counter <= 0) {
>> +		atomic_set(&c->idle_counter, 0);
>> +		return false;
>> +	}
>> +
>> +	dev_nr = atomic_read(&c->attached_dev_nr);
>> +	if (dev_nr == 0)
>> +		return false;
>> +
>> +	/*
>> +	 * c->idle_counter is increased by writeback thread of all
>> +	 * attached backing devices, in order to represent a rough
>> +	 * time period, counter should be divided by dev_nr.
>> +	 * Otherwise the idle time cannot be larger with more backing
>> +	 * device attached.
>> +	 * The following calculation equals to checking
>> +	 *	(counter / dev_nr) < (dev_nr * 6)
>> +	 */
>> +	if (counter < (dev_nr * dev_nr * 6))
>> +		return false;
> Hi, Coly
> 
> Look good to me. However, do we need to specify a maximum value for idle_counter. If cache_set has 100 backing devices, there are dev_nr*6 rounds of update_writeback_rate() on each backing device, which takes dc->writeback_rate_update_seconds(default is 5 seconds)*600 seconds.

Yes, this is exactly what I intend to do. The maximum writeback rate is only set when the whole cahce set is really really idle.

> 
> In fact, any io request from the backing device will clear the idle_counter of cache_set, and at_max_writeback_rate will exit soon. Therefore, if cache_set waits for 5 minutes or 10 minutes without any io request to start at_max_writeback_rate, it will not have any effect on the possible front-end io.
> 
> In addition, cache_set only waits 6 rounds of update_writeback_rate() should not have much performance impact.

The maximum writeback rate is a better than nothing effort, normally the P.I controller decides the writeback rate, and the maximum writeback rate trick should not step in.

BTW, if the patch looks fine to you, could you please to response a Reviewed-by or Acked-by for it?

Thanks.

Coly Li

> 
> mingzhe
>> +
>> +	return true;
>> +}
>> +
>> +/*
>> + * Idle_counter is increased everytime when update_writeback_rate() is
>> + * called. If all backing devices attached to the same cache set have
>> + * identical dc->writeback_rate_update_seconds values, it is about 6
>> + * rounds of update_writeback_rate() on each backing device before
>> + * c->at_max_writeback_rate is set to 1, and then max wrteback rate set
>> + * to each dc->writeback_rate.rate.
>> + * In order to avoid extra locking cost for counting exact dirty cached
>> + * devices number, c->attached_dev_nr is used to calculate the idle
>> + * throushold. It might be bigger if not all cached device are in write-
>> + * back mode, but it still works well with limited extra rounds of
>> + * update_writeback_rate().
>> + */
>>  static bool set_at_max_writeback_rate(struct cache_set *c,
>>  				       struct cached_dev *dc)
>>  {
>> @@ -167,21 +214,8 @@ static bool set_at_max_writeback_rate(struct cache_set *c,
>>  	/* Don't set max writeback rate if gc is running */
>>  	if (!c->gc_mark_valid)
>>  		return false;
>> -	/*
>> -	 * Idle_counter is increased everytime when update_writeback_rate() is
>> -	 * called. If all backing devices attached to the same cache set have
>> -	 * identical dc->writeback_rate_update_seconds values, it is about 6
>> -	 * rounds of update_writeback_rate() on each backing device before
>> -	 * c->at_max_writeback_rate is set to 1, and then max wrteback rate set
>> -	 * to each dc->writeback_rate.rate.
>> -	 * In order to avoid extra locking cost for counting exact dirty cached
>> -	 * devices number, c->attached_dev_nr is used to calculate the idle
>> -	 * throushold. It might be bigger if not all cached device are in write-
>> -	 * back mode, but it still works well with limited extra rounds of
>> -	 * update_writeback_rate().
>> -	 */
>> -	if (atomic_inc_return(&c->idle_counter) <
>> -	    atomic_read(&c->attached_dev_nr) * 6)
>> +
>> +	if (!idle_counter_exceeded(c))
>>  		return false;
>>    	if (atomic_read(&c->at_max_writeback_rate) != 1)
>> @@ -195,13 +229,10 @@ static bool set_at_max_writeback_rate(struct cache_set *c,
>>  	dc->writeback_rate_change = 0;
>>    	/*
>> -	 * Check c->idle_counter and c->at_max_writeback_rate agagain in case
>> -	 * new I/O arrives during before set_at_max_writeback_rate() returns.
>> -	 * Then the writeback rate is set to 1, and its new value should be
>> -	 * decided via __update_writeback_rate().
>> +	 * In case new I/O arrives during before
>> +	 * set_at_max_writeback_rate() returns.
>>  	 */
>> -	if ((atomic_read(&c->idle_counter) <
>> -	     atomic_read(&c->attached_dev_nr) * 6) ||
>> +	if (!idle_counter_exceeded(c) ||
>>  	    !atomic_read(&c->at_max_writeback_rate))
>>  		return false;
>>