On Mon, Feb 15, 2021 at 8:57 PM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: > > On Mon, 2021-02-15 at 19:24 +0200, Amir Goldstein wrote: > > On Mon, Feb 15, 2021 at 6:53 PM Trond Myklebust < > > trondmy@xxxxxxxxxxxxxxx> wrote: > > > > > > On Mon, 2021-02-15 at 18:34 +0200, Amir Goldstein wrote: > > > > On Mon, Feb 15, 2021 at 5:42 PM Luis Henriques < > > > > lhenriques@xxxxxxx> > > > > wrote: > > > > > > > > > > Nicolas Boichat reported an issue when trying to use the > > > > > copy_file_range > > > > > syscall on a tracefs file. It failed silently because the file > > > > > content is > > > > > generated on-the-fly (reporting a size of zero) and > > > > > copy_file_range > > > > > needs > > > > > to know in advance how much data is present. > > > > > > > > > > This commit restores the cross-fs restrictions that existed > > > > > prior > > > > > to > > > > > 5dae222a5ff0 ("vfs: allow copy_file_range to copy across > > > > > devices") > > > > > and > > > > > removes generic_copy_file_range() calls from ceph, cifs, fuse, > > > > > and > > > > > nfs. > > > > > > > > > > Fixes: 5dae222a5ff0 ("vfs: allow copy_file_range to copy across > > > > > devices") > > > > > Link: > > > > > https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@xxxxxxxxxxxx/ > > > > > Cc: Nicolas Boichat <drinkcat@xxxxxxxxxxxx> > > > > > Signed-off-by: Luis Henriques <lhenriques@xxxxxxx> > > > > > > > > Code looks ok. > > > > You may add: > > > > > > > > Reviewed-by: Amir Goldstein <amir73il@xxxxxxxxx> > > > > > > > > I agree with Trond that the first paragraph of the commit message > > > > could > > > > be improved. > > > > The purpose of this change is to fix the change of behavior that > > > > caused the regression. > > > > > > > > Before v5.3, behavior was -EXDEV and userspace could fallback to > > > > read. > > > > After v5.3, behavior is zero size copy. > > > > > > > > It does not matter so much what makes sense for CFR to do in this > > > > case (generic cross-fs copy). What matters is that nobody asked > > > > for > > > > this change and that it caused problems. > > > > > > > > > > No. I'm saying that this patch should be NACKed unless there is a > > > real > > > explanation for why we give crap about this tracefs corner case and > > > why > > > it can't be fixed. > > > > > > There are plenty of reasons why copy offload across filesystems > > > makes > > > sense, and particularly when you're doing NAS. Clone just doesn't > > > cut > > > it when it comes to disaster recovery (whereas backup to a > > > different > > > storage unit does). If the client has to do the copy, then you're > > > effectively doubling the load on the server, and you're adding > > > potentially unnecessary network traffic (or at the very least you > > > are > > > doubling that traffic). > > > > > > > I don't understand the use case you are describing. > > > > Which filesystem types are you talking about for source and target > > of copy_file_range()? > > > > To be clear, the original change was done to support NFS/CIFS server- > > side > > copy and those should not be affected by this change. > > > > That is incorrect: > > ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file > *dst, > u64 dst_pos, u64 count) > { > > /* > * Limit copy to 4MB to prevent indefinitely blocking an nfsd > * thread and client rpc slot. The choice of 4MB is somewhat > * arbitrary. We might instead base this on r/wsize, or make it > * tunable, or use a time instead of a byte limit, or implement > * asynchronous copy. In theory a client could also recognize a > * limit like this and pipeline multiple COPY requests. > */ > count = min_t(u64, count, 1 << 22); > return vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0); > } > > You are now explicitly changing the behaviour of knfsd when the source > and destination filesystem differ. > > For one thing, you are disallowing the NFSv4.2 copy offload use case of > copying from a local filesystem to a remote NFS server. However you are > also disallowing the copy from, say, an XFS formatted partition to an > ext4 partition. > Got it. This is easy to solve with a flag COPY_FILE_SPLICE (or something) that is internal to kernel users. FWIW, you may want to look at the loop in ovl_copy_up_data() for improvements to nfsd_copy_file_range(). We can move the check out to copy_file_range syscall: if (flags != 0) return -EINVAL; Leave the fallback from all filesystems and check for the COPY_FILE_SPLICE flag inside generic_copy_file_range(). Thanks, Amir.