On 2020/04/07 8:24, Chaitanya Kulkarni wrote: > One of the common application for the file systems and drivers to map > the buffer to the bio and issue I/Os on the block device. Drivers generally do not do this. At least not lower ones (scsi, nvme, etc). I guess you are referring to DM drivers. If yes, better be clear about this. Also, "the buffer" is a little unclear. Better qualify that. Another thing: "to map the buffer to the bio" is a little strange. The reverse seems more natural: a bio maps a buffer. > > This is a prep patch which adds two helper functions for block layer > which allows file-systems and the drivers to map data buffer on to the > bios and issue bios synchronously using blkdev_issue_rw() or > asynchronously using __blkdev_issue_rw(). The function names are basically the same but do completely different things. Better rename that. May be blkdev_issue_rw_sync() and blkdev_issue_rw_async() ? > > Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@xxxxxxx> > --- > block/blk-lib.c | 105 +++++++++++++++++++++++++++++++++++++++++ > include/linux/blkdev.h | 7 +++ > 2 files changed, 112 insertions(+) > > diff --git a/block/blk-lib.c b/block/blk-lib.c > index 5f2c429d4378..451c367fc0d6 100644 > --- a/block/blk-lib.c > +++ b/block/blk-lib.c > @@ -405,3 +405,108 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, > return ret; > } > EXPORT_SYMBOL(blkdev_issue_zeroout); > + > +/** > + * __blkdev_ssue_rw - issue read/write bios from buffer asynchronously s/__blkdev_ssue_rw/__blkdev_issue_rw > + * @bdev: blockdev to read/write > + * @buf: data buffer > + * @sector: start sector > + * @nr_sects: number of sectors > + * @op: REQ_OP_READ or REQ_OP_WRITE > + * @opf: flags > + * @gfp_mask: memory allocation flags (for bio_alloc) > + * @biop: pointer to anchor bio > + * > + * Description: > + * Generic helper function to map data buffer into bios for read and write ops. > + * Returns pointer to the anchored last bio for caller to submit asynchronously > + * or synchronously. > + */ > +int __blkdev_issue_rw(struct block_device *bdev, char *buf, sector_t sector, > + sector_t nr_sects, unsigned op, unsigned opf, > + gfp_t gfp_mask, struct bio **biop) > +{ > + bool vm = is_vmalloc_addr(buf) ? true : false; You do not need the clunky "? true : false" here. is_vmalloc_addr() returns a bool already. > + struct bio *bio = *biop; > + unsigned int nr_pages; > + struct page *page; > + unsigned int off; > + unsigned int len; > + int bi_size; > + > + if (!bdev_get_queue(bdev)) > + return -ENXIO; > + > + if (bdev_read_only(bdev)) > + return -EPERM; One can read a read-only device. So the check here is not correct. You need a "&& op == REQ_OP_WRITE" in the condition. > + > + if (!(op == REQ_OP_READ || op == REQ_OP_WRITE)) > + return -EINVAL; This probably should be checked before read-only. > + > + while (nr_sects != 0) { > + nr_pages = __blkdev_sectors_to_bio_pages(nr_sects); > + > + bio = blk_next_bio(bio, nr_pages, gfp_mask); > + bio->bi_iter.bi_sector = sector; > + bio_set_dev(bio, bdev); > + bio_set_op_attrs(bio, op, 0); > + > + while (nr_sects != 0) { > + off = offset_in_page(buf); > + page = vm ? vmalloc_to_page(buf) : virt_to_page(buf); > + len = min((sector_t) PAGE_SIZE, nr_sects << 9); > + > + bi_size = bio_add_page(bio, page, len, off); The variable naming is super confusing. bio_add_page() returns 0 if nothing is added and len if the page was added. So bi_size as a var name is not the best of name in my opinion. > + > + nr_sects -= bi_size >> 9; > + sector += bi_size >> 9; > + buf += bi_size; > + > + if (bi_size < len) > + break; You will get either 0 or len from bio_add_page. So the check here is not ideal. I think it is better to move it up under bio_add_page() and make it: len = bioa_add_page(bio, page, len, off); if (!len) break; > + } > + cond_resched(); > + } > + *biop = bio; > + return 0; > +} > +EXPORT_SYMBOL_GPL(__blkdev_issue_rw); > + > +/** > + * blkdev_execute_rw_sync - issue read/write bios from buffer synchronously > + * @bdev: blockdev to read/write > + * @buf: data buffer > + * @sector: start sector > + * @count: number of bytes > + * @op: REQ_OP_READ or REQ_OP_WRITE > + * @opf: flags > + * @gfp_mask: memory allocation flags (for bio_alloc) > + * > + * Description: > + * Generic helper function to map data buffer buffer into bios for read and > + * write requests. > + */ > +int blkdev_issue_rw(struct block_device *b, char *buf, sector_t sector, > + unsigned count, unsigned op, unsigned opf, gfp_t mask) function name differs between description and declaration. blkdev_execute_rw_sync() is better than blkdev_issue_rw() in my opinion. > +{ > + unsigned int is_vmalloc = is_vmalloc_addr(buf); > + sector_t nr_sects = count >> 9; > + struct bio *bio = NULL; > + int error; > + > + if (is_vmalloc && op == REQ_OP_WRITE) > + flush_kernel_vmap_range(buf, count); > + > + opf |= REQ_SYNC; You can add this directly in the call below. > + error = __blkdev_issue_rw(b, buf, sector, nr_sects, op, opf, mask, &bio); > + if (!error && bio) { > + error = submit_bio_wait(bio); > + bio_put(bio); > + } > + > + if (is_vmalloc && op == REQ_OP_READ) > + invalidate_kernel_vmap_range(buf, count); > + > + return error; > +} > +EXPORT_SYMBOL_GPL(blkdev_issue_rw); > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index 32868fbedc9e..cb315b301ad9 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -1248,6 +1248,13 @@ static inline int sb_issue_zeroout(struct super_block *sb, sector_t block, > gfp_mask, 0); > } > > +extern int __blkdev_issue_rw(struct block_device *bdev, char *buf, > + sector_t sector, sector_t nr_sects, unsigned op, > + unsigned opf, gfp_t gfp_mask, struct bio **biop); > +extern int blkdev_issue_rw(struct block_device *b, char *buf, sector_t sector, > + unsigned count, unsigned op, unsigned opf, > + gfp_t mask); No need for the extern I think. > + > extern int blk_verify_command(unsigned char *cmd, fmode_t mode); > > enum blk_default_limits { > -- Damien Le Moal Western Digital Research