On Wed, Mar 31, 2021 at 07:53:59AM -0400, Yufen Yu wrote: > For multiple split bios, if one of the bio is fail, the whole > should return error to application. But we found there is a race > between bio_integrity_verify_fn and bio complete, which return > io success to application after one of the bio fail. The race as > following: > > split bio(READ) kworker > > nvme_complete_rq > blk_update_request //split error=0 > bio_endio > bio_integrity_endio > queue_work(kintegrityd_wq, &bip->bip_work); > > bio_integrity_verify_fn > bio_endio //split bio > __bio_chain_endio > if (!parent->bi_status) > > <interrupt entry> > nvme_irq > blk_update_request //parent error=7 > req_bio_endio > bio->bi_status = 7 //parent bio > <interrupt exit> > > parent->bi_status = 0 > parent->bi_end_io() // return bi_status=0 > > The bio has been split as two: split and parent. When split > bio completed, it depends on kworker to do endio, while > bio_integrity_verify_fn have been interrupted by parent bio > complete irq handler. Then, parent bio->bi_status which have > been set in irq handler will overwrite by kworker. > > In fact, even without the above race, we also need to conside > the concurrency beteen mulitple split bio complete and update > the same parent bi_status. Normally, multiple split bios will > be issued to the same hctx and complete from the same irq > vector. But if we have updated queue map between multiple split > bios, these bios may complete on different hw queue and different > irq vector. Then the concurrency update parent bi_status may > cause the final status error. > > Suggested-by: Keith Busch <kbusch@xxxxxxxxxx> > Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx> > --- > block/bio.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/bio.c b/block/bio.c > index 26b7f721cda8..f49713ff8ad3 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -277,7 +277,7 @@ static struct bio *__bio_chain_endio(struct bio *bio) > { > struct bio *parent = bio->bi_private; > > - if (!parent->bi_status) > + if (bio->bi_status && !parent->bi_status) > parent->bi_status = bio->bi_status; > bio_put(bio); > return parent; > -- > 2.25.4 > Looks fine: Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx> -- Ming