Re: [PATCH v5 03/10] bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set

Michael Lyle <mlyle@xxxxxxxx> · Wed, 7 Feb 2018 09:35:19 -0800



Reviewed-by: Michael Lyle <mlyle@xxxxxxxx>

On 02/05/2018 08:20 PM, Coly Li wrote:
> In patch "bcache: fix cached_dev->count usage for bch_cache_set_error()",
> cached_dev_get() is called when creating dc->writeback_thread, and
> cached_dev_put() is called when exiting dc->writeback_thread. This
> modification works well unless people detach the bcache device manually by
>     'echo 1 > /sys/block/bcache<N>/bcache/detach'
> Because this sysfs interface only calls bch_cached_dev_detach() which wakes
> up dc->writeback_thread but does not stop it. The reason is, before patch
> "bcache: fix cached_dev->count usage for bch_cache_set_error()", inside
> bch_writeback_thread(), if cache is not dirty after writeback,
> cached_dev_put() will be called here. And in cached_dev_make_request() when
> a new write request makes cache from clean to dirty, cached_dev_get() will
> be called there. Since we don't operate dc->count in these locations,
> refcount d->count cannot be dropped after cache becomes clean, and
> cached_dev_detach_finish() won't be called to detach bcache device.
> 
> This patch fixes the issue by checking whether BCACHE_DEV_DETACHING is
> set inside bch_writeback_thread(). If this bit is set and cache is clean
> (no existing writeback_keys), break the while-loop, call cached_dev_put()
> and quit the writeback thread.
> 
> Please note if cache is still dirty, even BCACHE_DEV_DETACHING is set the
> writeback thread should continue to perform writeback, this is the original
> design of manually detach.
> 
> It is safe to do the following check without locking, let me explain why,
> +	if (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
> +	    (!atomic_read(&dc->has_dirty) || !dc->writeback_running)) {
> 
> If the kenrel thread does not sleep and continue to run due to conditions
> are not updated in time on the running CPU core, it just consumes more CPU
> cycles and has no hurt. This should-sleep-but-run is safe here. We just
> focus on the should-run-but-sleep condition, which means the writeback
> thread goes to sleep in mistake while it should continue to run.
> 1, First of all, no matter the writeback thread is hung or not, kthread_stop() from
>    cached_dev_detach_finish() will wake up it and terminate by making
>    kthread_should_stop() return true. And in normal run time, bit on index
>    BCACHE_DEV_DETACHING is always cleared, the condition
> 	!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags)
>    is always true and can be ignored as constant value.
> 2, If one of the following conditions is true, the writeback thread should
>    go to sleep,
>    "!atomic_read(&dc->has_dirty)" or "!dc->writeback_running)"
>    each of them independently controls the writeback thread should sleep or
>    not, let's analyse them one by one.
> 2.1 condition "!atomic_read(&dc->has_dirty)"
>    If dc->has_dirty is set from 0 to 1 on another CPU core, bcache will
>    call bch_writeback_queue() immediately or call bch_writeback_add() which
>    indirectly calls bch_writeback_queue() too. In bch_writeback_queue(),
>    wake_up_process(dc->writeback_thread) is called. It sets writeback
>    thread's task state to TASK_RUNNING and following an implicit memory
>    barrier, then tries to wake up the writeback thread.
>    In writeback thread, its task state is set to TASK_INTERRUPTIBLE before
>    doing the condition check. If other CPU core sets the TASK_RUNNING state
>    after writeback thread setting TASK_INTERRUPTIBLE, the writeback thread
>    will be scheduled to run very soon because its state is not
>    TASK_INTERRUPTIBLE. If other CPU core sets the TASK_RUNNING state before
>    writeback thread setting TASK_INTERRUPTIBLE, the implict memory barrier
>    of wake_up_process() will make sure modification of dc->has_dirty on
>    other CPU core is updated and observed on the CPU core of writeback
>    thread. Therefore the condition check will correctly be false, and
>    continue writeback code without sleeping.
> 2.2 condition "!dc->writeback_running)"
>    dc->writeback_running can be changed via sysfs file, every time it is
>    modified, a following bch_writeback_queue() is alwasy called. So the
>    change is always observed on the CPU core of writeback thread. If
>    dc->writeback_running is changed from 0 to 1 on other CPU core, this
>    condition check will observe the modification and allow writeback
>    thread to continue to run without sleeping.
> Now we can see, even without a locking protection, multiple conditions
> check is safe here, no deadlock or process hang up will happen.
> 
> I compose a separte patch because that patch "bcache: fix cached_dev->count
> usage for bch_cache_set_error()" already gets a "Reviewed-by:" from Hannes
> Reinecke. Also this fix is not trivial and good for a separate patch.
> 
> Signed-off-by: Coly Li <colyli@xxxxxxx>
> Cc: Michael Lyle <mlyle@xxxxxxxx>
> Cc: Hannes Reinecke <hare@xxxxxxxx>
> Cc: Huijun Tang <tang.junhui@xxxxxxxxxx>
> ---
>  drivers/md/bcache/writeback.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> index b280c134dd4d..4dbeaaa575bf 100644
> --- a/drivers/md/bcache/writeback.c
> +++ b/drivers/md/bcache/writeback.c
> @@ -565,9 +565,15 @@ static int bch_writeback_thread(void *arg)
>  	while (!kthread_should_stop()) {
>  		down_write(&dc->writeback_lock);
>  		set_current_state(TASK_INTERRUPTIBLE);
> -		if (!atomic_read(&dc->has_dirty) ||
> -		    (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
> -		     !dc->writeback_running)) {
> +		/*
> +		 * If the bache device is detaching, skip here and continue
> +		 * to perform writeback. Otherwise, if no dirty data on cache,
> +		 * or there is dirty data on cache but writeback is disabled,
> +		 * the writeback thread should sleep here and wait for others
> +		 * to wake up it.
> +		 */
> +		if (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
> +		    (!atomic_read(&dc->has_dirty) || !dc->writeback_running)) {
>  			up_write(&dc->writeback_lock);
>  
>  			if (kthread_should_stop()) {
> @@ -587,6 +593,14 @@ static int bch_writeback_thread(void *arg)
>  			atomic_set(&dc->has_dirty, 0);
>  			SET_BDEV_STATE(&dc->sb, BDEV_STATE_CLEAN);
>  			bch_write_bdev_super(dc, NULL);
> +			/*
> +			 * If bcache device is detaching via sysfs interface,
> +			 * writeback thread should stop after there is no dirty
> +			 * data on cache. BCACHE_DEV_DETACHING flag is set in
> +			 * bch_cached_dev_detach().
> +			 */
> +			if (test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags))
> +				break;
>  		}
>  
>  		up_write(&dc->writeback_lock);
>