Re: [PATCH 1/2] block: add bio based rw helper for data buffer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020/04/07 8:24, Chaitanya Kulkarni wrote:
> One of the common application for the file systems and drivers to map
> the buffer to the bio and issue I/Os on the block device.

Drivers generally do not do this. At least not lower ones (scsi, nvme, etc). I
guess you are referring to DM drivers. If yes, better be clear about this.
Also, "the buffer" is a little unclear. Better qualify that.

Another thing: "to map the buffer to the bio" is a little strange. The reverse
seems more natural: a bio maps a buffer.

> 
> This is a prep patch which adds two helper functions for block layer
> which allows file-systems and the drivers to map data buffer on to the
> bios and issue bios synchronously using blkdev_issue_rw() or
> asynchronously using __blkdev_issue_rw().

The function names are basically the same but do completely different things.
Better rename that. May be blkdev_issue_rw_sync() and blkdev_issue_rw_async() ?

> 
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@xxxxxxx>
> ---
>  block/blk-lib.c        | 105 +++++++++++++++++++++++++++++++++++++++++
>  include/linux/blkdev.h |   7 +++
>  2 files changed, 112 insertions(+)
> 
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 5f2c429d4378..451c367fc0d6 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -405,3 +405,108 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
>  	return ret;
>  }
>  EXPORT_SYMBOL(blkdev_issue_zeroout);
> +
> +/**
> + * __blkdev_ssue_rw - issue read/write bios from buffer asynchronously

s/__blkdev_ssue_rw/__blkdev_issue_rw

> + * @bdev:	blockdev to read/write
> + * @buf:	data buffer
> + * @sector:	start sector
> + * @nr_sects:	number of sectors
> + * @op:	REQ_OP_READ or REQ_OP_WRITE
> + * @opf:	flags
> + * @gfp_mask:	memory allocation flags (for bio_alloc)
> + * @biop:	pointer to anchor bio
> + *
> + * Description:
> + *  Generic helper function to map data buffer into bios for read and write ops.
> + *  Returns pointer to the anchored last bio for caller to submit asynchronously
> + *  or synchronously.
> + */
> +int __blkdev_issue_rw(struct block_device *bdev, char *buf, sector_t sector,
> +		      sector_t nr_sects, unsigned op, unsigned opf,
> +		      gfp_t gfp_mask, struct bio **biop)
> +{
> +	bool vm = is_vmalloc_addr(buf) ? true : false;

You do not need the clunky "? true : false" here. is_vmalloc_addr() returns a
bool already.

> +	struct bio *bio = *biop;
> +	unsigned int nr_pages;
> +	struct page *page;
> +	unsigned int off;
> +	unsigned int len;
> +	int bi_size;
> +
> +	if (!bdev_get_queue(bdev))
> +		return -ENXIO;
> +
> +	if (bdev_read_only(bdev))
> +		return -EPERM;

One can read a read-only device. So the check here is not correct. You need a
"&& op == REQ_OP_WRITE" in the condition.

> +
> +	if (!(op == REQ_OP_READ || op == REQ_OP_WRITE))
> +		return -EINVAL;

This probably should be checked before read-only.

> +
> +	while (nr_sects != 0) {
> +		nr_pages = __blkdev_sectors_to_bio_pages(nr_sects);
> +
> +		bio = blk_next_bio(bio, nr_pages, gfp_mask);
> +		bio->bi_iter.bi_sector = sector;
> +		bio_set_dev(bio, bdev);
> +		bio_set_op_attrs(bio, op, 0);
> +
> +		while (nr_sects != 0) {
> +			off = offset_in_page(buf);
> +			page = vm ? vmalloc_to_page(buf) : virt_to_page(buf);
> +			len = min((sector_t) PAGE_SIZE, nr_sects << 9);
> +
> +			bi_size = bio_add_page(bio, page, len, off);

The variable naming is super confusing. bio_add_page() returns 0 if nothing is
added and len if the page was added. So bi_size as a var name is not the best of
name in my opinion.

> +
> +			nr_sects -= bi_size >> 9;
> +			sector += bi_size >> 9;
> +			buf += bi_size;
> +
> +			if (bi_size < len)
> +				break;

You will get either 0 or len from bio_add_page. So the check here is not ideal.
I think it is better to move it up under bio_add_page() and make it:

			len = bioa_add_page(bio, page, len, off);
			if (!len)
				break;

> +		}
> +		cond_resched();
> +	}
> +	*biop = bio;
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(__blkdev_issue_rw);
> +
> +/**
> + * blkdev_execute_rw_sync - issue read/write bios from buffer synchronously
> + * @bdev:	blockdev to read/write
> + * @buf:	data buffer
> + * @sector:	start sector
> + * @count:	number of bytes
> + * @op:	REQ_OP_READ or REQ_OP_WRITE
> + * @opf:	flags
> + * @gfp_mask:	memory allocation flags (for bio_alloc)
> + *
> + * Description:
> + *  Generic helper function to map data buffer buffer into bios for read and
> + *  write requests.
> + */
> +int blkdev_issue_rw(struct block_device *b, char *buf, sector_t sector,
> +		     unsigned count, unsigned op, unsigned opf, gfp_t mask)

function name differs between description and declaration.
blkdev_execute_rw_sync() is better than blkdev_issue_rw() in my opinion.

> +{
> +	unsigned int is_vmalloc = is_vmalloc_addr(buf);
> +	sector_t nr_sects = count >> 9;
> +	struct bio *bio = NULL;
> +	int error;
> +
> +	if (is_vmalloc && op == REQ_OP_WRITE)
> +		flush_kernel_vmap_range(buf, count);
> +
> +	opf |= REQ_SYNC;

You can add this directly in the call below.

> +	error = __blkdev_issue_rw(b, buf, sector, nr_sects, op, opf, mask, &bio);
> +	if (!error && bio) {
> +		error = submit_bio_wait(bio);
> +		bio_put(bio);
> +	}
> +
> +	if (is_vmalloc && op == REQ_OP_READ)
> +		invalidate_kernel_vmap_range(buf, count);
> +
> +	return error;
> +}
> +EXPORT_SYMBOL_GPL(blkdev_issue_rw);
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 32868fbedc9e..cb315b301ad9 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1248,6 +1248,13 @@ static inline int sb_issue_zeroout(struct super_block *sb, sector_t block,
>  				    gfp_mask, 0);
>  }
>  
> +extern int __blkdev_issue_rw(struct block_device *bdev, char *buf,
> +			     sector_t sector, sector_t nr_sects, unsigned op,
> +			     unsigned opf, gfp_t gfp_mask, struct bio **biop);
> +extern int blkdev_issue_rw(struct block_device *b, char *buf, sector_t sector,
> +			   unsigned count, unsigned op, unsigned opf,
> +			   gfp_t mask);

No need for the extern I think.

> +
>  extern int blk_verify_command(unsigned char *cmd, fmode_t mode);
>  
>  enum blk_default_limits {
> 


-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux