On Thu, Mar 30, 2023 at 11:18 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> On Wed, Mar 29, 2023 at 06:12:36PM +0530, Nitesh Shetty wrote:
> > On Wed, Mar 29, 2023 at 02:14:40PM +0200, Christian Brauner wrote:
> > > On Mon, Mar 27, 2023 at 02:10:52PM +0530, Anuj Gupta wrote:
> > > > From: Nitesh Shetty <nj.shetty@xxxxxxxxxxx>
> > > >
> > > > For direct block device opened with O_DIRECT, use copy_file_range to
> > > > issue device copy offload, and fallback to generic_copy_file_range incase
> > > > device copy offload capability is absent.
> > > > Modify checks to allow bdevs to use copy_file_range.
> > > >
> > > > Suggested-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > > > Signed-off-by: Anuj Gupta <anuj20.g@xxxxxxxxxxx>
> > > > Signed-off-by: Nitesh Shetty <nj.shetty@xxxxxxxxxxx>
> > > > ---
> > > >  block/blk-lib.c        | 22 ++++++++++++++++++++++
> > > >  block/fops.c           | 20 ++++++++++++++++++++
> > > >  fs/read_write.c        | 11 +++++++++--
> > > >  include/linux/blkdev.h |  3 +++
> > > >  4 files changed, 54 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/block/blk-lib.c b/block/blk-lib.c
> > > > index a21819e59b29..c288573c7e77 100644
> > > > --- a/block/blk-lib.c
> > > > +++ b/block/blk-lib.c
> > > > @@ -475,6 +475,28 @@ static inline bool blk_check_copy_offload(struct request_queue *q_in,
> > > >  	return blk_queue_copy(q_in) && blk_queue_copy(q_out);
> > > >  }
> > > >
> > > > +int blkdev_copy_offload(struct block_device *bdev_in, loff_t pos_in,
> > > > +		struct block_device *bdev_out, loff_t pos_out, size_t len,
> > > > +		cio_iodone_t end_io, void *private, gfp_t gfp_mask)
> > > > +{
> > > > +	struct request_queue *in_q = bdev_get_queue(bdev_in);
> > > > +	struct request_queue *out_q = bdev_get_queue(bdev_out);
> > > > +	int ret = -EINVAL;
> > >
> > > Why initialize to -EINVAL if blk_copy_sanity_check() initializes it
> > > right away anyway?
> > >
> >
> > acked.
> >
> > > > +	bool offload = false;
> > >
> > > Same thing with initializing offload.
> > >
> > acked
> >
> > > > +
> > > > +	ret = blk_copy_sanity_check(bdev_in, pos_in, bdev_out, pos_out, len);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	offload = blk_check_copy_offload(in_q, out_q);
> > > > +	if (offload)
> > > > +		ret = __blk_copy_offload(bdev_in, pos_in, bdev_out, pos_out,
> > > > +				len, end_io, private, gfp_mask);
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(blkdev_copy_offload);
> > > > +
> > > >  /*
> > > >   * @bdev_in: source block device
> > > >   * @pos_in: source offset
> > > > diff --git a/block/fops.c b/block/fops.c
> > > > index d2e6be4e3d1c..3b7c05831d5c 100644
> > > > --- a/block/fops.c
> > > > +++ b/block/fops.c
> > > > @@ -611,6 +611,25 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
> > > >  	return ret;
> > > >  }
> > > >
> > > > +static ssize_t blkdev_copy_file_range(struct file *file_in, loff_t pos_in,
> > > > +				struct file *file_out, loff_t pos_out,
> > > > +				size_t len, unsigned int flags)
> > > > +{
> > > > +	struct block_device *in_bdev = I_BDEV(bdev_file_inode(file_in));
> > > > +	struct block_device *out_bdev = I_BDEV(bdev_file_inode(file_out));
> > > > +	int comp_len = 0;
> > > > +
> > > > +	if ((file_in->f_iocb_flags & IOCB_DIRECT) &&
> > > > +		(file_out->f_iocb_flags & IOCB_DIRECT))
> > > > +		comp_len = blkdev_copy_offload(in_bdev, pos_in, out_bdev,
> > > > +				pos_out, len, NULL, NULL, GFP_KERNEL);
> > > > +	if (comp_len != len)
> > > > +		comp_len = generic_copy_file_range(file_in, pos_in + comp_len,
> > > > +			file_out, pos_out + comp_len, len - comp_len, flags);
> > > >
> > > I'm not deeply familiar with this code but this looks odd. It at least
> > > seems possible that comp_len could be -EINVAL and len 20 at which point
> > > you'd be doing len - comp_len aka 20 - 22 = -2 in generic_copy_file_range().
>
> 20 - -22 = 44 ofc
>
> >
> > comp_len should be 0 incase of error. We do agree, some function
>
> I mean, not to hammer on this point too much but just to be clear
> blk_copy_sanity_check(), which is introduced in the second patch, can
> return both -EPERM and -EINVAL and is first called in
> blkdev_copy_offload() so it's definitely possible for comp_len to be
> negative.

Acked. Will be updated in the next version.

Thank you,
Nitesh Shetty
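
For reference, the comp_len concern raised above can be illustrated with a
small user-space sketch. This is not kernel code and not the planned fix: the
helper names remaining_unchecked(), offloaded_bytes(), sw_copy_stub() and
copy_with_fallback() are hypothetical and do not appear in the patch, and
sw_copy_stub() only stands in for generic_copy_file_range() by pretending the
software path always copies everything it is asked to. Assuming EINVAL is 22
(as on Linux), passing a negative offload result straight into the length
arithmetic turns a 20-byte request into 20 - (-22) = 42 bytes, so one possible
approach is to treat a negative result as "0 bytes offloaded" before falling
back.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Length handed to the fallback when the offload result is used as-is,
 * mirroring the "len - comp_len" arithmetic in the posted hunk. */
static long long remaining_unchecked(size_t len, long long comp_len)
{
	return (long long)len - comp_len;
}

/* Hypothetical helper: bytes actually completed by the offload path.
 * A negative errno means nothing was copied; never report more than len. */
static size_t offloaded_bytes(long long comp_len, size_t len)
{
	if (comp_len < 0)
		return 0;
	if ((unsigned long long)comp_len > len)
		return len;
	return (size_t)comp_len;
}

/* Hypothetical stand-in for generic_copy_file_range(): assumes the
 * software path copies the whole requested length. */
static ssize_t sw_copy_stub(size_t len)
{
	return (ssize_t)len;
}

/* Returns the total bytes copied, or a negative errno if nothing was. */
static ssize_t copy_with_fallback(long long offload_ret, size_t len)
{
	size_t done = offloaded_bytes(offload_ret, len);
	ssize_t rest;

	if (done == len)
		return (ssize_t)done;

	/* Fall back only for the part the offload did not complete. */
	rest = sw_copy_stub(len - done);
	if (rest < 0)
		return done ? (ssize_t)done : rest;

	return (ssize_t)done + rest;
}
```

With these helpers, copy_with_fallback(-EINVAL, 20) asks the fallback for
exactly 20 bytes, and a partial offload of 8 bytes asks it for the remaining
12. In the real code the same clamping would also have to apply to the
pos_in + comp_len and pos_out + comp_len offsets.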