Re: [PATCH v2] vfs: prevent copy_file_range to copy across devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Amir Goldstein <amir73il@xxxxxxxxx> writes:

> On Mon, Feb 15, 2021 at 8:57 PM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>>
>> On Mon, 2021-02-15 at 19:24 +0200, Amir Goldstein wrote:
>> > On Mon, Feb 15, 2021 at 6:53 PM Trond Myklebust <
>> > trondmy@xxxxxxxxxxxxxxx> wrote:
>> > >
>> > > On Mon, 2021-02-15 at 18:34 +0200, Amir Goldstein wrote:
>> > > > On Mon, Feb 15, 2021 at 5:42 PM Luis Henriques <
>> > > > lhenriques@xxxxxxx>
>> > > > wrote:
>> > > > >
>> > > > > Nicolas Boichat reported an issue when trying to use the
>> > > > > copy_file_range
>> > > > > syscall on a tracefs file.  It failed silently because the file
>> > > > > content is
>> > > > > generated on-the-fly (reporting a size of zero) and
>> > > > > copy_file_range
>> > > > > needs
>> > > > > to know in advance how much data is present.
>> > > > >
>> > > > > This commit restores the cross-fs restrictions that existed
>> > > > > prior
>> > > > > to
>> > > > > 5dae222a5ff0 ("vfs: allow copy_file_range to copy across
>> > > > > devices")
>> > > > > and
>> > > > > removes generic_copy_file_range() calls from ceph, cifs, fuse,
>> > > > > and
>> > > > > nfs.
>> > > > >
>> > > > > Fixes: 5dae222a5ff0 ("vfs: allow copy_file_range to copy across
>> > > > > devices")
>> > > > > Link:
>> > > > > https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@xxxxxxxxxxxx/
>> > > > > Cc: Nicolas Boichat <drinkcat@xxxxxxxxxxxx>
>> > > > > Signed-off-by: Luis Henriques <lhenriques@xxxxxxx>
>> > > >
>> > > > Code looks ok.
>> > > > You may add:
>> > > >
>> > > > Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx>
>> > > >
>> > > > I agree with Trond that the first paragraph of the commit message
>> > > > could
>> > > > be improved.
>> > > > The purpose of this change is to fix the change of behavior that
>> > > > caused the regression.
>> > > >
>> > > > Before v5.3, behavior was -EXDEV and userspace could fallback to
>> > > > read.
>> > > > After v5.3, behavior is zero size copy.
>> > > >
>> > > > It does not matter so much what makes sense for CFR to do in this
>> > > > case (generic cross-fs copy).  What matters is that nobody asked
>> > > > for
>> > > > this change and that it caused problems.
>> > > >
>> > >
>> > > No. I'm saying that this patch should be NACKed unless there is a
>> > > real
>> > > explanation for why we give crap about this tracefs corner case and
>> > > why
>> > > it can't be fixed.
>> > >
>> > > There are plenty of reasons why copy offload across filesystems
>> > > makes
>> > > sense, and particularly when you're doing NAS. Clone just doesn't
>> > > cut
>> > > it when it comes to disaster recovery (whereas backup to a
>> > > different
>> > > storage unit does). If the client has to do the copy, then you're
>> > > effectively doubling the load on the server, and you're adding
>> > > potentially unnecessary network traffic (or at the very least you
>> > > are
>> > > doubling that traffic).
>> > >
>> >
>> > I don't understand the use case you are describing.
>> >
>> > Which filesystem types are you talking about for source and target
>> > of copy_file_range()?
>> >
>> > To be clear, the original change was done to support NFS/CIFS server-
>> > side
>> > copy and those should not be affected by this change.
>> >
>>
>> That is incorrect:
>>
>> ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file
>> *dst,
>>  u64 dst_pos, u64 count)
>> {
>>
>>  /*
>>  * Limit copy to 4MB to prevent indefinitely blocking an nfsd
>>  * thread and client rpc slot. The choice of 4MB is somewhat
>>  * arbitrary. We might instead base this on r/wsize, or make it
>>  * tunable, or use a time instead of a byte limit, or implement
>>  * asynchronous copy. In theory a client could also recognize a
>>  * limit like this and pipeline multiple COPY requests.
>>  */
>>  count = min_t(u64, count, 1 << 22);
>>  return vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0);
>> }
>>
>> You are now explicitly changing the behaviour of knfsd when the source
>> and destination filesystem differ.
>>
>> For one thing, you are disallowing the NFSv4.2 copy offload use case of
>> copying from a local filesystem to a remote NFS server. However you are
>> also disallowing the copy from, say, an XFS formatted partition to an
>> ext4 partition.
>>
>
> Got it.

Ugh.  And I guess overlayfs may have a similar problem.

> This is easy to solve with a flag COPY_FILE_SPLICE (or something) that
> is internal to kernel users.
>
> FWIW, you may want to look at the loop in ovl_copy_up_data()
> for improvements to nfsd_copy_file_range().
>
> We can move the check out to copy_file_range syscall:
>
>         if (flags != 0)
>                 return -EINVAL;
>
> Leave the fallback from all filesystems and check for the
> COPY_FILE_SPLICE flag inside generic_copy_file_range().

Ok, the diff bellow is just to make sure I understood your suggestion.

The patch will also need to:

 - change nfs and overlayfs calls to vfs_copy_file_range() so that they
   use the new flag.

 - check flags in generic_copy_file_checks() to make sure only valid flags
   are used (COPY_FILE_SPLICE at the moment).

Also, where should this flag be defined?  include/uapi/linux/fs.h?

Cheers,
-- 
Luis

diff --git a/fs/read_write.c b/fs/read_write.c
index 75f764b43418..341d315d2a96 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1383,6 +1383,13 @@ ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
 				struct file *file_out, loff_t pos_out,
 				size_t len, unsigned int flags)
 {
+	if (!(flags & COPY_FILE_SPLICE)) {
+		if (!file_out->f_op->copy_file_range)
+			return -EOPNOTSUPP;
+		else if (file_out->f_op->copy_file_range !=
+			 file_in->f_op->copy_file_range)
+			return -EXDEV;
+	}
 	return do_splice_direct(file_in, &pos_in, file_out, &pos_out,
 				len > MAX_RW_COUNT ? MAX_RW_COUNT : len, 0);
 }
@@ -1474,9 +1481,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
 {
 	ssize_t ret;
 
-	if (flags != 0)
-		return -EINVAL;
-
 	ret = generic_copy_file_checks(file_in, pos_in, file_out, pos_out, &len,
 				       flags);
 	if (unlikely(ret))



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux