On Mon, Apr 26, 2021 at 06:17:08PM +0200, Christoph Böhmwalder wrote:
> From: Lars Ellenberg <lars.ellenberg@xxxxxxxxxx>
>
> Scenario:
> ---------
>
> bio chain generated by blk_queue_split().
> Some split bio fails and propagates its error status to the "parent" bio.
> But then the (last part of the) parent bio itself completes without error.
>
> We would clobber the already recorded error status with BLK_STS_OK,
> causing silent data corruption.
>
> Reproducer:
> -----------
>
> How to trigger this in the real world within seconds:
>
> DRBD on top of degraded parity raid,
> small stripe_cache_size, large read_ahead setting.
> Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED",
> umount and mount again, "reboot").
>
> Cause significant read ahead.
>
> Large read ahead request is split by blk_queue_split().
> Parts of the read ahead that are already in the stripe cache,
> or find an available stripe cache to use, can be serviced.
> Parts of the read ahead that would need "too much work",
> would need to wait for a "stripe_head" to become available,
> are rejected immediately.
>
> For larger read ahead requests that are split in many pieces, it is very
> likely that some "splits" will be serviced, but then the stripe cache is
> exhausted/busy, and the remaining ones will be rejected.
>
> Signed-off-by: Lars Ellenberg <lars.ellenberg@xxxxxxxxxx>
> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@xxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # 4.13.x
> ---
>  drivers/block/drbd/drbd_req.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
> index 9398c2c2cb2d..a384a58de1fd 100644
> --- a/drivers/block/drbd/drbd_req.c
> +++ b/drivers/block/drbd/drbd_req.c
> @@ -180,7 +180,8 @@ void start_new_tl_epoch(struct drbd_connection *connection)
>  void complete_master_bio(struct drbd_device *device,
>  		struct bio_and_error *m)
>  {
> -	m->bio->bi_status = errno_to_blk_status(m->error);
> +	if (unlikely(m->error))
> +		m->bio->bi_status = errno_to_blk_status(m->error);
>  	bio_endio(m->bio);
>  	dec_ap_bio(device);
>  }
> --
> 2.26.3
>

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>
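
For readers following the scenario in the quoted patch, here is a minimal
userspace sketch of the clobbering it describes. It is not kernel code: every
toy_* name and TOY_* constant below is hypothetical, the status values are
arbitrary, and the errno mapping is reduced to "zero or non-zero". It only
models the ordering from the commit message (a split records an error in the
parent, then the parent's own part completes with error 0) and compares the
unconditional store with the conditional one from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; values are arbitrary (assumption). */
typedef unsigned char toy_status_t;
#define TOY_STS_OK    ((toy_status_t)0)
#define TOY_STS_IOERR ((toy_status_t)1)

struct toy_bio {
	toy_status_t bi_status;
};

/* A failed split records its status in the parent, keeping the first error. */
static void toy_split_failed(struct toy_bio *parent, toy_status_t status)
{
	assert(parent != NULL);
	if (status && !parent->bi_status)
		parent->bi_status = status;
}

/* Simplified errno mapping: 0 means success, anything else an I/O error. */
static toy_status_t toy_errno_to_status(int error)
{
	return error ? TOY_STS_IOERR : TOY_STS_OK;
}

/* Behaviour before the patch: the status is always overwritten. */
static void toy_complete_old(struct toy_bio *bio, int error)
{
	bio->bi_status = toy_errno_to_status(error);
}

/* Behaviour after the patch: the status is only written on error. */
static void toy_complete_new(struct toy_bio *bio, int error)
{
	if (error)
		bio->bi_status = toy_errno_to_status(error);
}

/*
 * Replays the scenario: one split fails, then the parent's own part
 * completes with error 0. Returns the parent's final status.
 */
static toy_status_t toy_scenario(int use_fix)
{
	struct toy_bio parent = { .bi_status = TOY_STS_OK };

	toy_split_failed(&parent, TOY_STS_IOERR);
	if (use_fix)
		toy_complete_new(&parent, 0);
	else
		toy_complete_old(&parent, 0);
	return parent.bi_status;
}
```

In this model, toy_scenario(0) ends with TOY_STS_OK (the recorded error is
lost), while toy_scenario(1) ends with TOY_STS_IOERR (the error survives).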