On Sun, Oct 30, 2022 at 08:50:32AM +0100, Christoph Hellwig wrote:
> On Sat, Oct 29, 2022 at 05:45:45PM +0100, Al Viro wrote:
> > completion (in NULL_Q_BIO case, at least).  What happens if request
> > gets split and split-off part finishes first with an error?  AFAICS,
> > its ->bi_status will be copied to parent (original bio, the one that
> > covers the tail).  Now the IO on the original bio is over as well
> > and we hit drivers/block/null_blk/main.c:end_cmd().  Suppose this
> > part succeeds; won't we end up overwriting ->bi_status with zero
> > and assuming that the entire thing had succeeded, despite the
> > (now lost) error on the split-off part?
>
> As a rule of thumb drivers should never set bi_status to 0, so null_blk
> here has a bug.  What is missing everywhere is proper memory barriers,
> though.

Something like

static inline void set_bio_status(struct bio *bio, blk_status_t status)
{
	if (unlikely(status))
		cmpxchg(&bio->bi_status, 0, status);
}

with e.g.

	if (bio->bi_status && !dio->bio.bi_status)
		dio->bio.bi_status = bio->bi_status;

in blkdev_bio_end_io() replaced with

	set_bio_status(&dio->bio, bio->bi_status);

perhaps?

That would probably do for almost all users, but...  what about e.g.
drivers/md/raid1.c:fix_sync_read_error()?  Looks like it really intends
a non-zero -> zero change; I'm not familiar enough with the guts of that
sucker to tell if it is guaranteed to get no propagation from another
bio...