On Thu, Apr 28, 2022 at 07:04:13AM +0900, Damien Le Moal wrote:
> On 4/28/22 00:15, Nitesh Shetty wrote:
> > On Wed, Apr 27, 2022 at 11:45:26AM +0900, Damien Le Moal wrote:
> >> On 4/26/22 19:12, Nitesh Shetty wrote:
> >>> Introduce blkdev_issue_copy which supports source and destination bdevs,
> >>> and an array of (source, destination and copy length) tuples.
> >>> Introduce REQ_COPY copy offload operation flag. Create a read-write
> >>> bio pair with a token as payload and submitted to the device in order.
> >>> Read request populates token with source specific information which
> >>> is then passed with write request.
> >>> This design is courtesy Mikulas Patocka's token based copy
> >>>
> >>> Larger copy will be divided, based on max_copy_sectors,
> >>> max_copy_range_sector limits.
> >>>
> >>> Signed-off-by: Nitesh Shetty <nj.shetty@xxxxxxxxxxx>
> >>> Signed-off-by: Arnav Dawn <arnav.dawn@xxxxxxxxxxx>
> >>> ---
> >>>  block/blk-lib.c           | 232 ++++++++++++++++++++++++++++++++++++++
> >>>  block/blk.h               |   2 +
> >>>  include/linux/blk_types.h |  21 ++++
> >>>  include/linux/blkdev.h    |   2 +
> >>>  include/uapi/linux/fs.h   |  14 +++
> >>>  5 files changed, 271 insertions(+)
> >>>
> >>> diff --git a/block/blk-lib.c b/block/blk-lib.c
> >>> index 09b7e1200c0f..ba9da2d2f429 100644
> >>> --- a/block/blk-lib.c
> >>> +++ b/block/blk-lib.c
> >>> @@ -117,6 +117,238 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >>>  }
> >>>  EXPORT_SYMBOL(blkdev_issue_discard);
> >>>
> >>> +/*
> >>> + * Wait on and process all in-flight BIOs. This must only be called once
> >>> + * all bios have been issued so that the refcount can only decrease.
> >>> + * This just waits for all bios to make it through bio_copy_end_io. IO
> >>> + * errors are propagated through cio->io_error.
> >>> + */
> >>> +static int cio_await_completion(struct cio *cio)
> >>> +{
> >>> +	int ret = 0;
> >>> +	unsigned long flags;
> >>> +
> >>> +	spin_lock_irqsave(&cio->lock, flags);
> >>> +	if (cio->refcount) {
> >>> +		cio->waiter = current;
> >>> +		__set_current_state(TASK_UNINTERRUPTIBLE);
> >>> +		spin_unlock_irqrestore(&cio->lock, flags);
> >>> +		blk_io_schedule();
> >>> +		/* wake up sets us TASK_RUNNING */
> >>> +		spin_lock_irqsave(&cio->lock, flags);
> >>> +		cio->waiter = NULL;
> >>> +		ret = cio->io_err;
> >>> +	}
> >>> +	spin_unlock_irqrestore(&cio->lock, flags);
> >>> +	kvfree(cio);
> >>
> >> cio is allocated with kzalloc() == kmalloc(). So why the kvfree() here ?
> >>
> > acked.
> >
> >>> +
> >>> +	return ret;
> >>> +}
> >>> +
> >>> +static void bio_copy_end_io(struct bio *bio)
> >>> +{
> >>> +	struct copy_ctx *ctx = bio->bi_private;
> >>> +	struct cio *cio = ctx->cio;
> >>> +	sector_t clen;
> >>> +	int ri = ctx->range_idx;
> >>> +	unsigned long flags;
> >>> +	bool wake = false;
> >>> +
> >>> +	if (bio->bi_status) {
> >>> +		cio->io_err = bio->bi_status;
> >>> +		clen = (bio->bi_iter.bi_sector << SECTOR_SHIFT) - ctx->start_sec;
> >>> +		cio->rlist[ri].comp_len = min_t(sector_t, clen, cio->rlist[ri].comp_len);
> >>
> >> long line.
> >
> > Is it because line is more than 80 character, I thought limit is 100 now, so
> > went with longer lines ?
>
> When it is easy to wrap the lines without readability loss, please do to
> keep things under 80 char per line.
>

acked

> >>> +{
> >>> +	struct request_queue *src_q = bdev_get_queue(src_bdev);
> >>> +	struct request_queue *dest_q = bdev_get_queue(dest_bdev);
> >>> +	int ret = -EINVAL;
> >>> +
> >>> +	if (!src_q || !dest_q)
> >>> +		return -ENXIO;
> >>> +
> >>> +	if (!nr)
> >>> +		return -EINVAL;
> >>> +
> >>> +	if (nr >= MAX_COPY_NR_RANGE)
> >>> +		return -EINVAL;
> >>
> >> Where do you check the number of ranges against what the device can do ?
> >>
> > The present implementation submits only one range at a time. This was done to
> > make copy offload generic, so that other types of copy implementation such as
> > XCOPY should be able to use same infrastructure. Downside at present being
> > NVMe copy offload is not optimal.
>
> If you issue one range at a time without checking the number of ranges,
> what is the point of the nr ranges queue limit ? The user can submit a
> copy ioctl request exceeding it. Please use that limit and enforce it or
> remove it entirely.
>

Sure, will remove this limit in next version.

--
Thank you
Nitesh Shetty