On Mon, Sep 26, 2016 at 7:33 PM, Darrick J. Wong <darrick.wong@xxxxxxxxxx> wrote:
> On Fri, Sep 23, 2016 at 09:52:42PM +0300, Amir Goldstein wrote:
>> On Fri, Sep 23, 2016 at 7:13 PM, Darrick J. Wong
>> <darrick.wong@xxxxxxxxxx> wrote:
>> > On Fri, Sep 23, 2016 at 10:57:56AM +0300, Amir Goldstein wrote:
>> >> On Wed, Sep 14, 2016 at 3:43 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>> >> > copy_file_range syscall returns -EXDEV if src and dest
>> >> > file are not on the same file system.
>> >> > The vfs_copy_file_range() helper, however, knows how to copy
>> >> > across file systems with do_splice_direct().
>> >> >
>> >> > Move the enforcement of same file system from the vfs helper
>> >> > to the syscall code.
>> >> >
>> >> > A following patch is going to use the vfs_copy_file_range()
>> >> > helper in overlayfs to copy up between lower and upper
>> >> > not on the same file system.
>> >> >
>> >> > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
>> >> > ---
>> >> >  fs/read_write.c | 16 +++++++++++-----
>> >> >  1 file changed, 11 insertions(+), 5 deletions(-)
>> >> >
>> >> > diff --git a/fs/read_write.c b/fs/read_write.c
>> >> > index 9dc6e52..6975fe8 100644
>> >> > --- a/fs/read_write.c
>> >> > +++ b/fs/read_write.c
>> >> > @@ -1502,10 +1502,6 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>> >> >              (file_out->f_flags & O_APPEND))
>> >> >                  return -EBADF;
>> >> >
>> >> > -        /* this could be relaxed once a method supports cross-fs copies */
>> >> > -        if (inode_in->i_sb != inode_out->i_sb)
>> >> > -                return -EXDEV;
>> >> > -
>> >> >          if (len == 0)
>> >> >                  return 0;
>> >> >
>> >> > @@ -1514,7 +1510,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
>> >> >                  return ret;
>> >> >
>> >> >          ret = -EOPNOTSUPP;
>> >> > -        if (file_out->f_op->copy_file_range)
>> >> > +        /* copy_file_range() method does not support cross-fs copies */
>> >> > +        if (inode_in->i_sb == inode_out->i_sb &&
>> >> > +            file_out->f_op->copy_file_range)
>> >> >                  ret = file_out->f_op->copy_file_range(file_in, pos_in, file_out,
>> >> >                                                        pos_out, len, flags);
>> >> >          if (ret == -EOPNOTSUPP)
>> >> > @@ -1569,6 +1567,14 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t __user *, off_in,
>> >> >                  pos_out = f_out.file->f_pos;
>> >> >          }
>> >> >
>> >> > +        /*
>> >> > +         * vfs_copy_file_range() can do cross-fs copy, but we want to
>> >> > +         * fulfill the guaranty to userland that copy_file_range syscall
>> >> > +         * does not allow cross-fs copy
>> >> > +         */
>> >> > +        if (file_inode(f_in.file)->i_sb != file_inode(f_out.file)->i_sb)
>> >> > +                return -EXDEV;
>> >>
>> >> Oops, that was supposed to be goto out;
>> >> Anyway, I am holding back on the vfs_copy_file_range() patches sub set
>> >> until I have a reliable test on xfs to fall back from clone to copy range
>> >
>> > Ok, attached are two rough patches -- one to add the error injection point
>> > into the kernel, and a second one to add it to the xfs_io 'inject' command.
>> > Note that you'll have to format the XFS filesystem with rmapbt=1 since we
>> > can't otherwise avoid per-AG ENOSPC if rmap is enabled.
>> >
>> > The relevant xfstests commands are:
>> >
>> > _require_xfs_io_error_injection "ag_resv_critical"
>> > _scratch_inject_error "ag_resv_critical"
>> >
>> > See the xfs/325 test for a rough framework.  I'll work on cleaning up the
>> > patches and trying to get them into 4.9.
>> >
>>
>> Thanks, Darrick, but I'm not sure that's enough. does the framework allow
>> to inject an error for a specific AG? otherwise, the code will not
>> fall back from failing full reflink to partial copy partial reflink.
>
> The error injector (as far as I know) cannot inject errors only for a
> specific AG.  However, since your goal is to have the reflink partially
> succeed, I could set up the injector to fail only a fraction of the
> time.  The difficulty here is that what we /probably/ need is to have it
> ENOSPC after N extents, where 0 < N <= nr_file_extents.  Tricky since
> the probabilistic nature means that it could inject during the first
> XFS_TEST_ERROR call.
>
> Or change the function such that the error injector function only gets
> called if the file offset > 0.  Then you can prepare a specially crafted
> file such that you'll always get at least a partial reflink before
> ENOSPC.  Note that the reflink functions won't return where an error
> happened, so you'll end up recopying the entire range regardless.
>

All right. How about the realtime AG. Does it support reflink?
Can a file have extents both in realtime AG and other AGs?
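
For anyone following the quoted patch, here is a rough userspace model of the
split it proposes, including the "goto out" correction. It is only a sketch:
every toy_* name is hypothetical, the "copies" just return the requested
length, and the reference count is a simplified stand-in for the real fd
handling in the syscall. It is not the kernel code.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Toy stand-ins for kernel objects; all names here are hypothetical. */
struct toy_file {
	int sb_id;           /* which "superblock" (file system) the file is on */
	int has_copy_method; /* does the fs provide a copy_file_range method? */
	int refs;            /* references currently held via toy_fdget() */
};

static struct toy_file *toy_fdget(struct toy_file *f) { f->refs++; return f; }
static void toy_fdput(struct toy_file *f) { f->refs--; }

/* Pretend fs-specific method; only meaningful within one file system. */
static ssize_t toy_fs_copy_method(size_t len) { return (ssize_t)len; }

/* Pretend generic byte copy, standing in for do_splice_direct(). */
static ssize_t toy_splice_copy(size_t len) { return (ssize_t)len; }

/*
 * Helper: no same-fs enforcement. The fs method is tried only when both
 * files share a superblock; otherwise we fall through to the generic copy.
 */
static ssize_t toy_vfs_copy_file_range(struct toy_file *in,
				       struct toy_file *out, size_t len)
{
	ssize_t ret = -EOPNOTSUPP;

	if (len == 0)
		return 0;
	if (in->sb_id == out->sb_id && out->has_copy_method)
		ret = toy_fs_copy_method(len);
	if (ret == -EOPNOTSUPP)
		ret = toy_splice_copy(len);
	return ret;
}

/*
 * Syscall wrapper: keeps the -EXDEV behaviour for userland. The error path
 * uses "goto out" so the references taken above are always dropped; a bare
 * "return -EXDEV" here would leak them.
 */
static ssize_t toy_sys_copy_file_range(struct toy_file *fin,
				       struct toy_file *fout, size_t len)
{
	struct toy_file *in = toy_fdget(fin);
	struct toy_file *out = toy_fdget(fout);
	ssize_t ret;

	if (in->sb_id != out->sb_id) {
		ret = -EXDEV;
		goto out;
	}
	ret = toy_vfs_copy_file_range(in, out, len);
out:
	toy_fdput(out);
	toy_fdput(in);
	return ret;
}
```

In this model a caller such as overlayfs copy up would call
toy_vfs_copy_file_range() directly and get a cross-fs copy, while anything
going through toy_sys_copy_file_range() still sees -EXDEV when the two files
are on different file systems.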