(+cc Christoph)

On 04/07/2020 04:28 PM, Dave Chinner wrote:
> On Tue, Apr 07, 2020 at 08:06:35PM +0000, Chaitanya Kulkarni wrote:
>> On 04/06/2020 05:32 PM, Damien Le Moal wrote:
>>>> -
>>>>> -	do {
>>>>> -		struct page *page = kmem_to_page(data);
>>>>> -		unsigned int off = offset_in_page(data);
>>>>> -		unsigned int len = min_t(unsigned, left, PAGE_SIZE - off);
>>>>> -
>>>>> -		while (bio_add_page(bio, page, len, off) != len) {
>>>>> -			struct bio *prev = bio;
>>>>> -
>>>>> -			bio = bio_alloc(GFP_KERNEL, bio_max_vecs(left));
>>>>> -			bio_copy_dev(bio, prev);
>>>>> -			bio->bi_iter.bi_sector = bio_end_sector(prev);
>>>>> -			bio->bi_opf = prev->bi_opf;
>>>>> -			bio_chain(prev, bio);
>>>>> -
>>>>> -			submit_bio(prev);
>>>>> -		}
>>>>> -
>>>>> -		data += len;
>>>>> -		left -= len;
>>>>> -	} while (left > 0);
>>> Your helper could use a similar loop structure. This is very easy to read.
>>>
>> If I understand correctly this pattern is used since it is not a part of
>> the block layer.
>
> It's because it was simple and easy to understand, not because of
> the fact it is outside the core block layer.
>
> Us XFS folks are simple people who like simple things that are easy to
> understand because there is so much of XFS that is so horrifically
> complex that we want to implement simple stuff once and just not
> have to care about it again.
>

Agreed, new code should not make things complicated; in fact, the initial
version used the existing code pattern, see [1].

Before answering the questions, let me share the considerations I went
through based on the code and the requirements :-

1. The new helper is needed since XFS and rnbd are doing the same thing,
   i.e. mapping a buffer onto bios. The helper should be a block layer
   library function, since that is where most of the library APIs used by
   file systems and network drivers live, e.g. blkdev_issue_zeroout().

   Summary :- Something like blkdev_issue_zeroout() is a good model, since
   it is used by both file systems and network drivers.

2. Figure out which common code can be moved to the library, and identify
   which APIs are needed to avoid code duplication :-

   A. A synchronous version is needed for XFS (the missing cond_resched()
      will be fixed in the next version).
   B. An asynchronous version is needed for network drivers such as rnbd,
      as mentioned in the cover letter.

   Summary :- Add sync and async versions.

3. When adding a new function to the library we should not invent new
   patterns, but follow the style already present in blk-lib.c, unless the
   new API cannot fit the current pattern; otherwise the block layer
   maintainers may object to the new design.

   Summary :- See whether the new API can be designed identically to
   blkdev_issue_zeroout() and __blkdev_issue_zero_pages().

4. Look at how the existing users of the blk-lib.c APIs call them. XFS
   itself already uses the zeroout library function :-

   fs/xfs/xfs_bmap_util.c:65: blkdev_issue_zeroout(target->bt_bdev

   Summary :- Since the library API (zeroout) is already used by XFS, a
   similar code pattern should be acceptable.

5. Design the new library API based on the existing code, so that it :-

   A. Matches the code pattern already present in the library and used by
      the existing users, and reuses it.
   B. Keeps the API parameters consistent wherever possible (including
      prototype parameter names, etc.).

   Summary :- Add blkdev_issue_rw() modeled on blkdev_issue_zeroout(), and
   __blkdev_issue_rw() modeled on __blkdev_issue_zero_pages().

Regarding the number of lines: the goal is not just to have fewer lines in
XFS than the existing xfs_rw_bdev() (although it does achieve that), but,
since this is a library API, to save lines overall across its users (e.g.
~60 in XFS, ~50 in rnbd, plus future users) in file systems and network
drivers, just like the existing zeroout API.
Please correct me if I'm wrong in any of the above considerations. Given
that I'm not an XFS developer, I'd like to make sure the new API pattern
is acceptable to both the XFS developers and the block layer maintainers.
I'm not stuck on this pattern and will be happy to reuse the xfs_rw_bdev()
pattern if the block layer maintainers are okay with it; I'll wait for
them to reply.

>> The helpers in blk-lib.c are not accessible so this :-
>
> So export the helpers?
>
>> All above breaks the existing pattern and code reuse in blk-lib.c, since
>> blk-lib.c :-
>> 1. Already provides blk_next_bio(), why repeat the bio allocation
>> and bio chaining code ?
>
> So export the helper?
>
>> 2. Instead of adding a new helper bio_max_vecs() why not use existing
>> __blkdev_sectors_to_bio_pages() ?
>
> That's not an improvement. The XFS code is _much_ easier to read
> and understand.
>
>> 3. Why use two bios when it can be done with one bio with the helpers
>> in blk-lib.c ?
>
> That helper is blk_next_bio(), which hides the second bio inside it.
> So you aren't actually getting rid of the need for two bio pointers.
>
>> 4. Allows async version.
>
> Which is not used by XFS, so just adds complexity to this XFS path
> for no good reason.
>
> Seriously, this 20 lines of code in XFS turns into 50-60 lines
> of code in the block layer to do the same thing. How is that an
> improvement?
>
> Cheers,
>
> Dave.
>

[1] Initial draft (for reference only) :-

struct bio *__bio_execute_rw(struct block_device *bd, char *buf,
			     sector_t sect, unsigned count, unsigned op,
			     unsigned opf, gfp_t gfp_mask)
{
	bool vm = is_vmalloc_addr(buf);
	unsigned int left = count;
	struct page *page;
	struct bio *new_bio;

	new_bio = bio_alloc(gfp_mask, bio_max_vecs(left));
	bio_set_dev(new_bio, bd);
	new_bio->bi_iter.bi_sector = sect;
	new_bio->bi_opf = op | opf | REQ_SYNC;

	do {
		unsigned int offset = offset_in_page(buf);
		unsigned int len = min_t(unsigned, left, PAGE_SIZE - offset);

		page = vm ? vmalloc_to_page(buf) : virt_to_page(buf);
		while (bio_add_page(new_bio, page, len, offset) != len) {
			struct bio *curr_bio = new_bio;

			new_bio = bio_alloc(gfp_mask, bio_max_vecs(left));
			bio_copy_dev(new_bio, curr_bio);
			new_bio->bi_iter.bi_sector = bio_end_sector(curr_bio);
			new_bio->bi_opf = curr_bio->bi_opf;
			bio_chain(curr_bio, new_bio);

			submit_bio(curr_bio);
		}
		buf += len;
		left -= len;
	} while (left > 0);

	return new_bio;
}
EXPORT_SYMBOL(__bio_execute_rw);

int bio_execute_rw_sync(struct block_device *bd, char *buf, sector_t sect,
			unsigned count, unsigned op, unsigned opf,
			gfp_t gfp_mask)
{
	bool vm = is_vmalloc_addr(buf);
	struct bio *bio;
	int error;

	if (vm && op == REQ_OP_WRITE)
		flush_kernel_vmap_range(buf, count);

	bio = __bio_execute_rw(bd, buf, sect, count, op, opf, gfp_mask);
	error = submit_bio_wait(bio);
	bio_put(bio);

	if (vm && op == REQ_OP_READ)
		invalidate_kernel_vmap_range(buf, count);

	return error;
}
EXPORT_SYMBOL_GPL(bio_execute_rw_sync);