Re: [PATCH 1/2] block: add bio based rw helper for data buffer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/06/2020 05:28 PM, Damien Le Moal wrote:
> On 2020/04/07 8:24, Chaitanya Kulkarni wrote:
>> One of the common application for the file systems and drivers to map
>> the buffer to the bio and issue I/Os on the block device.
>
> Drivers generally do not do this. At least not lower ones (scsi, nvme, etc). I
> guess you are referring to DM drivers. If yes, better be clear about this.
> Also, "the buffer" is a little unclear. Better qualify that.
>
Agree most drivers deals with sg_list. Right now there is only one such
an example I know which is rnbd as driver as it is mentioned in the
cover-letter references.
> Another thing: "to map the buffer to the bio" is a little strange. The reverse
> seems more natural: a bio maps a buffer.
>
Make sense, will do.
>>
>> This is a prep patch which adds two helper functions for block layer
>> which allows file-systems and the drivers to map data buffer on to the
>> bios and issue bios synchronously using blkdev_issue_rw() or
>> asynchronously using __blkdev_issue_rw().
>
> The function names are basically the same but do completely different things.
> Better rename that. May be blkdev_issue_rw_sync() and blkdev_issue_rw_async() ?
>
 >

This is exactly I named functions to start with (see the function
documentation which I missed to update), but the pattern in blk-lib.c
uses no such sync and async postfix, which is used is sync and async
context all over the kernel :-

# grep EXPORT block/blk-lib.c
EXPORT_SYMBOL(__blkdev_issue_discard);
EXPORT_SYMBOL(blkdev_issue_discard);, since as it is nr_sects and sector 
calculation doesn't matter after break.
EXPORT_SYMBOL(blkdev_issue_write_same);
EXPORT_SYMBOL(__blkdev_issue_zeroout);
EXPORT_SYMBOL(blkdev_issue_zeroout);
EXPORT_SYMBOL_GPL(__blkdev_issue_rw);
EXPORT_SYMBOL_GPL(blkdev_issue_rw);

In absence of documentation for naming how about we just follow
existing naming convention ?

Which matches the pattern for new API.

>>
>> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@xxxxxxx>
>> ---
>>   block/blk-lib.c        | 105 +++++++++++++++++++++++++++++++++++++++++
>>   include/linux/blkdev.h |   7 +++
>>   2 files changed, 112 insertions(+)
>>
>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>> index 5f2c429d4378..451c367fc0d6 100644
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -405,3 +405,108 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
>>   	return ret;
>>   }
>>   EXPORT_SYMBOL(blkdev_issue_zeroout);
>> +
>> +/**
>> + * __blkdev_ssue_rw - issue read/write bios from buffer asynchronously
>
> s/__blkdev_ssue_rw/__blkdev_issue_rw
Thanks, will fix.
>
>> + * @bdev:	blockdev to read/write
>> + * @buf:	data buffer
>> + * @sector:	start sector
>> + * @nr_sects:	number of sectors
>> + * @op:	REQ_OP_READ or REQ_OP_WRITE
>> + * @opf:	flags
>> + * @gfp_mask:	memory allocation flags (for bio_alloc)
>> + * @biop:	pointer to anchor bio
>> + *
>> + * Description:
>> + *  Generic helper function to map data buffer into bios for read and write ops.
>> + *  Returns pointer to the anchored last bio for caller to submit asynchronously
>> + *  or synchronously.
>> + */
>> +int __blkdev_issue_rw(struct block_device *bdev, char *buf, sector_t sector,
>> +		      sector_t nr_sects, unsigned op, unsigned opf,
>> +		      gfp_t gfp_mask, struct bio **biop)
>> +{
>> +	bool vm = is_vmalloc_addr(buf) ? true : false;
>
> You do not need the clunky "? true : false" here. is_vmalloc_addr() returns a
> bool already.
>
I will remove it.
>> +	struct bio *bio = *biop;
>> +	unsigned int nr_pages;
>> +	struct page *page;
>> +	unsigned int off;
>> +	unsigned int len;
>> +	int bi_size;
>> +
>> +	if (!bdev_get_queue(bdev))
>> +		return -ENXIO;
>> +
>> +	if (bdev_read_only(bdev))
>> +		return -EPERM;
>
> One can read a read-only device. So the check here is not correct. You need a
> "&& op == REQ_OP_WRITE" in the condition.
>
True, this shouldn't be here, also I'm thinking of lifting these checks 
to caller, we can add it later if needed.
>> +
>> +	if (!(op == REQ_OP_READ || op == REQ_OP_WRITE))
>> +		return -EINVAL;
>
> This probably should be checked before read-only.
>
Yes.
>> +
>> +	while (nr_sects != 0) {
>> +		nr_pages = __blkdev_sectors_to_bio_pages(nr_sects);
>> +
>> +		bio = blk_next_bio(bio, nr_pages, gfp_mask);
>> +		bio->bi_iter.bi_sector = sector;
>> +		bio_set_dev(bio, bdev);
>> +		bio_set_op_attrs(bio, op, 0);
>> +
>> +		while (nr_sects != 0) {
>> +			off = offset_in_page(buf);
>> +			page = vm ? vmalloc_to_page(buf) : virt_to_page(buf);
>> +			len = min((sector_t) PAGE_SIZE, nr_sects << 9);
>> +
>> +			bi_size = bio_add_page(bio, page, len, off);
>
> The variable naming is super confusing. bio_add_page() returns 0 if nothing is
> added and len if the page was added. So bi_size as a var name is not the best of
> name in my opinion.
>
Here bio, page, len, off variables are passed to the bio_add_page() 
function, which has the same names in the function parameter list.
(I'll fix the off to offset)

Regarding the bi_size, given that bio_add_page() returns 
bio->bi_iter.bi_size, isn't bi_size also maps to the what function is 
returning and explains what we are dealing with?

Also, bi_size is used in the blk-lib.c: __blkdev_issue_zero_pages(), if 
that is confusing then we need to change that, what is your preference?

I'll send a cleanup patch __blkdev_issue_zero_pages() also.

>> +
>> +			nr_sects -= bi_size >> 9;
>> +			sector += bi_size >> 9;
>> +			buf += bi_size;
>> +
>> +			if (bi_size < len)
>> +				break;
>
> You will get either 0 or len from bio_add_page. So the check here is not ideal.
> I think it is better to move it up under bio_add_page() and make it:
>
> 			len = bioa_add_page(bio, page, len, off);
> 			if (!len)
> 				break;
I'd still like to keep bi_size, I think I had received a comment about 
using same variable for function call and updating as a return value.

Regarding the check, will move it.
>
>> +		}
>> +		cond_resched();
>> +	}
>> +	*biop = bio;
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(__blkdev_issue_rw);
>> +
>> +/**
>> + * blkdev_execute_rw_sync - issue read/write bios from buffer synchronously
>> + * @bdev:	blockdev to read/write
>> + * @buf:	data buffer
>> + * @sector:	start sector
>> + * @count:	number of bytes
>> + * @op:	REQ_OP_READ or REQ_OP_WRITE
>> + * @opf:	flags
>> + * @gfp_mask:	memory allocation flags (for bio_alloc)
>> + *
>> + * Description:
>> + *  Generic helper function to map data buffer buffer into bios for read and
>> + *  write requests.
>> + */
>> +int blkdev_issue_rw(struct block_device *b, char *buf, sector_t sector,
>> +		     unsigned count, unsigned op, unsigned opf, gfp_t mask)
>
> function name differs between description and declaration.
> blkdev_execute_rw_sync() is better than blkdev_issue_rw() in my opinion.
>
That was remained from initial version, will fix it, thanks for
pointing it out.
>> +{
>> +	unsigned int is_vmalloc = is_vmalloc_addr(buf);
>> +	sector_t nr_sects = count >> 9;
>> +	struct bio *bio = NULL;
>> +	int error;
>> +
>> +	if (is_vmalloc && op == REQ_OP_WRITE)
>> +		flush_kernel_vmap_range(buf, count);
>> +
>> +	opf |= REQ_SYNC;
>
> You can add this directly in the call below.
>
Breaks calling of __blkdev_issue_rw() to new line. This also adds a new
line which is unavoidable but keeps the function call smooth in one
line.
>> +	error = __blkdev_issue_rw(b, buf, sector, nr_sects, op, opf, mask, &bio);
>> +	if (!error && bio) {
>> +		error = submit_bio_wait(bio);
>> +		bio_put(bio);
>> +	}
>> +
>> +	if (is_vmalloc && op == REQ_OP_READ)
>> +		invalidate_kernel_vmap_range(buf, count);
>> +
>> +	return error;
>> +}
>> +EXPORT_SYMBOL_GPL(blkdev_issue_rw);
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index 32868fbedc9e..cb315b301ad9 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -1248,6 +1248,13 @@ static inline int sb_issue_zeroout(struct super_block *sb, sector_t block,
>>   				    gfp_mask, 0);
>>   }
>>
>> +extern int __blkdev_issue_rw(struct block_device *bdev, char *buf,
>> +			     sector_t sector, sector_t nr_sects, unsigned op,
>> +			     unsigned opf, gfp_t gfp_mask, struct bio **biop);
>> +extern int blkdev_issue_rw(struct block_device *b, char *buf, sector_t sector,
>> +			   unsigned count, unsigned op, unsigned opf,
>> +			   gfp_t mask);
>
> No need for the extern I think.
>
Again, following the exiting pattern in the file for the header:-

  # grep blkdev_issue include/linux/blkdev.h
extern int blkdev_issue_flush(struct block_device *, gfp_t, sector_t *);

extern int blkdev_issue_write_same(struct block_device *bdev, sector_t
sector,

extern int blkdev_issue_discard(struct block_device *bdev, sector_t
sector,

extern int __blkdev_issue_discard(struct block_device *bdev, sector_t
sector,

extern int __blkdev_issue_zeroout(struct block_device *bdev, sector_t
sector,

extern int blkdev_issue_zeroout(struct block_device *bdev, sector_t
sector,

extern int __blkdev_issue_rw(struct block_device *bdev, char *buf,

extern int blkdev_issue_rw(struct block_device *b, char *buf, sector_t
sector,
>> +
>>   extern int blk_verify_command(unsigned char *cmd, fmode_t mode);
>>
>>   enum blk_default_limits {
>>
>
>





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux