Submit the bio fragment with the lowest LBA first. This approach prevents
write errors when submitting large bios to host-managed zoned block devices.
This patch only modifies the behavior of drivers that call
bio_split_to_limits() directly. This includes DRBD, pktcdvd, dm, md and the
NVMe multipath code.

Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Ming Lei <ming.lei@xxxxxxxxxx>
Cc: Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx>
Cc: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
---
 block/blk-merge.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index d6f8552ef209..7281f2d91b2f 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -345,8 +345,8 @@ EXPORT_SYMBOL_GPL(bio_split_rw);
  * @nr_segs: returns the number of segments in the returned bio
  *
  * Check if @bio needs splitting based on the queue limits, and if so split off
- * a bio fitting the limits from the beginning of @bio and return it. @bio is
- * shortened to the remainder and re-submitted.
+ * a bio fitting the limits from the beginning of @bio. @bio is shortened to
+ * the remainder.
  *
  * The split bio is allocated from @q->bio_split, which is provided by the
  * block layer.
@@ -379,10 +379,23 @@ struct bio *__bio_split_to_limits(struct bio *bio,
 		split->bi_opf |= REQ_NOMERGE;
 
 		blkcg_bio_issue_init(split);
-		bio_chain(split, bio);
 		trace_block_split(split, bio->bi_iter.bi_sector);
-		submit_bio_noacct(bio);
-		return split;
+		if (current->bio_list) {
+			/*
+			 * The caller will submit the first half ('split')
+			 * before the second half ('bio').
+			 */
+			bio_chain(split, bio);
+			submit_bio_noacct(bio);
+			return split;
+		}
+		/*
+		 * Submit the first half ('split') and let the caller submit the
+		 * second half ('bio').
+		 */
+		*nr_segs = bio_chain_nr_segments(bio, lim);
+		bio_chain(split, bio);
+		submit_bio_noacct(split);
 	}
 	return bio;
 }