Re: [PATCH] md/bcache: Fix a deadlock while calculating writeback rate

Eric Wheeler <bcache@xxxxxxxxxxxxxxxxxx> · Tue, 9 May 2017 18:53:16 +0000 (UTC)

On Sun, 7 May 2017, Phillipp Röll wrote:
> A bcache foreground write holds a read lock on the writeback_lock
> rwsem. In order to complete and release the lock after the actual
> write, it needs to update the index, which runs out of a workqueue.
> 
> If the bch_writeback_thread is starting to write back dirty data while
> the foreground write still holds the lock, it tries to get a write
> lock on the writeback lock, thus blocking until the foreground write
> completes.
> 
> If at this moment a workqueue worker starts an update of the write-
> back rate, while foreground write completions are queued behind it,
> the update thread will try to get a read lock, which blocks, because
> the writeback thread is trying to get a write lock. This blocks the
> foreground writes behind it and locks up all writeback and foreground
> writes.
> 
> Removing the read lock from the writeback rate update thread is the
> simplest solution: the worst thing that will happen when the scan
> for dirty data is inconsistent is a wrong P-factor in the PID controller.

Hi Phillipp,

Thank you for tracking this down!  Did you have an actual deadlock case 
that this fixed?

I wonder: in the event of a race, can the updated P-factor be invalid (ie, 
partially populated)? I don't believe 64-bit assignments are atomic on all 
architectures (see __update_writeback_rate) so you might get the upper or 
lower 32 bits of the P-factor or of other values assigned therein.  Could 
the rate be calculated in such a race by the writeback thread such that no 
progress is made (eg, effectively rate==0)?

I certainly want to see deadlocks fixed and thought this could be a 
consideration.  Worst case we could put a spinlock around the use of the 
dc->writeback_rate* fields.

What do you think?

--
Eric Wheeler

> 
> Signed-off-by: Phillipp Röll <phillipp.roell@xxxxxxxxxxxxxx>
> ---
>  drivers/md/bcache/writeback.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> index 6ac2e48..90da3bd 100644
> --- a/drivers/md/bcache/writeback.c
> +++ b/drivers/md/bcache/writeback.c
> @@ -75,14 +75,10 @@ static void update_writeback_rate(struct work_struct *work)
>  					     struct cached_dev,
>  					     writeback_rate_update);
>  
> -	down_read(&dc->writeback_lock);
> -
>  	if (atomic_read(&dc->has_dirty) &&
>  	    dc->writeback_percent)
>  		__update_writeback_rate(dc);
>  
> -	up_read(&dc->writeback_lock);
> -
>  	schedule_delayed_work(&dc->writeback_rate_update,
>  			      dc->writeback_rate_update_seconds * HZ);
>  }
> -- 
> 2.10.0
> 
> 
>