Re: [PATCH] drbd: fix potential silent data corruption

On Mon, Apr 26, 2021 at 06:17:08PM +0200, Christoph Böhmwalder wrote:
> From: Lars Ellenberg <lars.ellenberg@xxxxxxxxxx>
> 
> Scenario:
> ---------
> 
> A bio chain is generated by blk_queue_split().
> Some split bio fails and propagates its error status to the "parent" bio.
> But then the (last part of the) parent bio itself completes without error.
> 
> Without this fix, complete_master_bio() would clobber the already recorded
> error status with BLK_STS_OK, causing silent data corruption.
> 
> Reproducer:
> -----------
> 
> How to trigger this in the real world within seconds:
> 
> DRBD on top of degraded parity raid,
> small stripe_cache_size, large read_ahead setting.
> Drop the page cache (by any of: sysctl vm.drop_caches=1, fadvise "DONTNEED",
> umount and mount again, or "reboot").
> 
> Cause significant read ahead.
> 
> Large read ahead request is split by blk_queue_split().
> Parts of the read ahead that are already in the stripe cache,
> or find an available stripe cache to use, can be serviced.
> Parts of the read ahead that would need "too much work",
> that is, would need to wait for a "stripe_head" to become available,
> are rejected immediately.
> 
> For larger read ahead requests that are split in many pieces, it is very
> likely that some "splits" will be serviced, but then the stripe cache is
> exhausted/busy, and the remaining ones will be rejected.
> 
> Signed-off-by: Lars Ellenberg <lars.ellenberg@xxxxxxxxxx>
> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@xxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # 4.13.x
> ---
>  drivers/block/drbd/drbd_req.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
> index 9398c2c2cb2d..a384a58de1fd 100644
> --- a/drivers/block/drbd/drbd_req.c
> +++ b/drivers/block/drbd/drbd_req.c
> @@ -180,7 +180,8 @@ void start_new_tl_epoch(struct drbd_connection *connection)
>  void complete_master_bio(struct drbd_device *device,
>  		struct bio_and_error *m)
>  {
> -	m->bio->bi_status = errno_to_blk_status(m->error);
> +	if (unlikely(m->error))
> +		m->bio->bi_status = errno_to_blk_status(m->error);
>  	bio_endio(m->bio);
>  	dec_ap_bio(device);
>  }
> -- 
> 2.26.3
> 
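To make the quoted scenario concrete, here is a minimal userspace sketch of
the clobbering and of the effect of the one-line fix. It is a model only, not
kernel code: every model_* name and MODEL_STS_* value is hypothetical, and the
way the failed split records its error in the parent is an assumption made for
illustration, not a copy of the block layer's chaining logic.

```c
#include <assert.h>

/* Userspace model only: these mimic, but are not, the kernel's types. */
typedef int model_status_t;
#define MODEL_STS_OK    0
#define MODEL_STS_IOERR 10   /* arbitrary non-zero value for this sketch */

struct model_bio {
	model_status_t bi_status;
};

/* Assumed behaviour: a failed split records its error in the parent
 * and never clears an error that is already recorded there. */
static void model_split_endio(struct model_bio *parent, model_status_t status)
{
	if (status != MODEL_STS_OK && parent->bi_status == MODEL_STS_OK)
		parent->bi_status = status;
}

/* Models the pre-patch completion: status is always overwritten. */
static void model_complete_old(struct model_bio *bio, int error)
{
	bio->bi_status = error ? MODEL_STS_IOERR : MODEL_STS_OK;
}

/* Models the patched completion: status is written only on error. */
static void model_complete_new(struct model_bio *bio, int error)
{
	if (error)
		bio->bi_status = MODEL_STS_IOERR;
}

/* One split fails, then the parent's own last part completes cleanly.
 * Returns the status the submitter would finally observe. */
static model_status_t model_run(int patched)
{
	struct model_bio parent = { MODEL_STS_OK };

	model_split_endio(&parent, MODEL_STS_IOERR);
	if (patched)
		model_complete_new(&parent, 0);
	else
		model_complete_old(&parent, 0);
	return parent.bi_status;
}
```

In this model, model_run(0) returns MODEL_STS_OK (the recorded error is lost),
while model_run(1) returns MODEL_STS_IOERR (the error survives completion).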

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>
