On 03/26/2020 07:46 AM, Christoph Hellwig wrote: > On Thu, Mar 26, 2020 at 10:34:42AM -0400, Martin K. Petersen wrote: >>> I just worry about the proliferation of identical merging and >>> splitting code throughout the block stack as we add additional >>> single-range, no payload operations (Verify, etc.). I prefer to >>> enforce the semantics in the LLD and not in the plumbing. But I >>> won't object to a separate REQ_OP_ALLOCATE if you find the >>> resulting code duplication acceptable. > I find it acceptable for now. And I think we should find some way > (e.g. by being table driven) to share code between differnet > opcodes. > With reference to Martin's comment (verify etc) there is a significant advantage when using payloadless bio to offload the functionality to the directly attached device and over the fabrics when dealing with larger disks. How about we create a helper which is independent of the operations can accept req_op and issues the payloadless bios. Something like following totally untested :- diff --git a/block/blk-lib.c b/block/blk-lib.c index cf9e75a730b4..d3fccd3211cc 100644 --- a/block/blk-lib.c +++ b/block/blk-lib.c @@ -209,6 +209,33 @@ int blkdev_issue_write_same(struct block_device *bdev, sector_t sector, } EXPORT_SYMBOL(blkdev_issue_write_same); +static void __blkdev_issue_payloadless(struct block_device *bdev, unsigned op, + sector_t sector, sector_t nr_sects, gfp_t gfp_mask, + struct bio **biop, unsigned bio_opf, unsigned int max_sectors) +{ + struct bio *bio = *biop; + + while (nr_sects) { + bio = blk_next_bio(bio, 0, gfp_mask); + bio->bi_iter.bi_sector = sector; + bio_set_dev(bio, bdev); + bio->bi_opf = op; + bio->bi_opf |= bio_opf; + + if (nr_sects > max_sectors) { + bio->bi_iter.bi_size = max_sectors << 9; + nr_sects -= max_sectors; + sector += max_sectors; + } else { + bio->bi_iter.bi_size = nr_sects << 9; + nr_sects = 0; + } + cond_resched(); + } + + *biop = bio; +} + static int __blkdev_issue_write_zeroes(struct block_device *bdev, sector_t sector, sector_t nr_sects, gfp_t gfp_mask, struct bio **biop, unsigned flags) @@ -216,6 +243,7 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev, struct bio *bio = *biop; unsigned int max_write_zeroes_sectors; struct request_queue *q = bdev_get_queue(bdev); + unsigned int unmap = (flags & BLKDEV_ZERO_NOUNMAP) ? REQ_NOUNMAP : 0; if (!q) return -ENXIO; @@ -229,24 +257,8 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev, if (max_write_zeroes_sectors == 0) return -EOPNOTSUPP; - while (nr_sects) { - bio = blk_next_bio(bio, 0, gfp_mask); - bio->bi_iter.bi_sector = sector; - bio_set_dev(bio, bdev); - bio->bi_opf = REQ_OP_WRITE_ZEROES; - if (flags & BLKDEV_ZERO_NOUNMAP) - bio->bi_opf |= REQ_NOUNMAP; - - if (nr_sects > max_write_zeroes_sectors) { - bio->bi_iter.bi_size = max_write_zeroes_sectors << 9; - nr_sects -= max_write_zeroes_sectors; - sector += max_write_zeroes_sectors; - } else { - bio->bi_iter.bi_size = nr_sects << 9; - nr_sects = 0; - } - cond_resched(); - } + __blkdev_issue_payloadless(bdev, REQ_OP_WRITE_ZEROES, sector, nr_sects, + gfp_mask, biop, unmap, max_write_zeroes_sectors); *biop = bio; return 0; I'll be happy to send out a well tested patch based on the above suggestion or any feedback I get and re-spin this series or OP can re-spin this series whatever works.