Re: nfs generic/373 failure after "fs: allow cross-vfsmount reflink/dedupe"

Josef Bacik <josef@xxxxxxxxxxxxxx> · Wed, 2 Mar 2022 19:29:34 -0500

On Wed, Mar 02, 2022 at 07:07:35PM -0500, J. Bruce Fields wrote:
> On Wed, Mar 02, 2022 at 06:45:12PM -0500, Josef Bacik wrote:
> > On Wed, Mar 02, 2022 at 05:42:50PM -0500, J. Bruce Fields wrote:
> > > On Wed, Mar 02, 2022 at 05:26:08PM -0500, Josef Bacik wrote:
> > > > On Wed, Mar 02, 2022 at 05:04:50PM -0500, J. Bruce Fields wrote:
> > > > > I started seeing generic/373 fail on recent linux-next in NFS testing.
> > > > > 
> > > > > Bisect lands it on aaf40970b1d0 "fs: allow cross-vfsmount
> > > > > reflink/dedupe".
> > > > > 
> > > > > The test fails because a clone between two mounts is expected to fail,
> > > > > and no longer does.
> > > > > 
> > > > > In my setup both mounts are nfs mounts.  They are mounts of different
> > > > > exports, and the exports are exports of different filesystems.  So it
> > > > > does make sense that the clone should fail.
> > > > > 
> > > > > I see the NFS client send a CLONE rpc to the server, and the server
> > > > > return success.  That seems wrong.
> > > > > 
> > > > > Both exported filesystems are xfs, and from the code it looks like the
> > > > > server calls vfs_clone_file_range(), which ends up calling
> > > > > xfs_file_remap_range().
> > > > > 
> > > > > Are we missing a check now in that xfs case?
> > > > > 
> > > > > I haven't looked any more closely at what's going on, so I could be
> > > > > missing something.
> > > > > 
> > > > 
> > > > Yeah there's a few fstests that test this functionality that need to be removed,
> > > > I have patches pending for this in our fstests staging tree (since we run
> > > > fstests nightly on our tree)
> > > > 
> > > > https://github.com/btrfs/fstests/tree/staging
> > > > 
> > > > Right now the patches just remove the tests from auto since that's what we run,
> > > > I'll remove them properly once the patch lands in linus.  Thanks,
> > > 
> > > So, out of curiosity, what is xfs doing in this case?  These are two
> > > filesystems on separate partitions, is it falling back on a read/write
> > > loop or something?
> > 
> > I don't think so?  I'm actually kind of confused, because nfsd does
> > vfs_clone_file_range, and the only place I messed with for CLONE was
> > ioctl_clone_file, so the patch changed literally nothing, unless you aren't
> > using nfsd for the server?
> > 
> > And if they are in fact two different file systems the i_sb != i_sb of the
> > files, so there's something pretty strange going on here, my patch shouldn't
> > affect your setup.  Thanks,
> 
> Sorry, took me a minute to understand, myself:
> 
> It's actually only the client behavior that changed.  Previously the
> client would reject an attempt to clone across filesystems, so the
> server never saw such a request.  After this patch, the client will go
> ahead and send the CLONE.  (Which, come to think of it, is probably the
> right thing for the client to do.)
> 
> So the server's probably always had a bug, and this just uncovered it.
> 
> I'd be curious what the consequences are.  And where the check should be
> (above or below vfs_clone_file_range()?).
> 

This is where I'm confused; this really shouldn't succeed:

loff_t do_clone_file_range(struct file *file_in, loff_t pos_in,
                           struct file *file_out, loff_t pos_out,
                           loff_t len, unsigned int remap_flags)
{
        loff_t ret;

        WARN_ON_ONCE(remap_flags & REMAP_FILE_DEDUP);

        if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb)
                return -EXDEV;
        [...]
}

loff_t vfs_clone_file_range(struct file *file_in, loff_t pos_in,
                            struct file *file_out, loff_t pos_out,
                            loff_t len, unsigned int remap_flags)
{
        loff_t ret;

        file_start_write(file_out);
        ret = do_clone_file_range(file_in, pos_in, file_out, pos_out, len,
                                  remap_flags);
        [...]
}

And even if we get past here, I imagine XFS would freak out because it can't
find the extents (unless you're getting lucky and everything is lining up?).
I'm super confused...
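
To spell out what I expect that check to do, here is a rough userspace model of the i_sb comparison. It is only a sketch: the model_* structs and functions are made-up stand-ins for illustration, not the kernel's real types, and it models nothing beyond the pointer comparison quoted above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's super_block, inode and file. */
struct model_sb { int id; };
struct model_inode { const struct model_sb *i_sb; };
struct model_file { const struct model_inode *f_inode; };

/*
 * Same shape as the check in do_clone_file_range(): two files on
 * different superblocks must be rejected with -EXDEV before any
 * filesystem-specific remap code runs.
 */
int model_clone_check(const struct model_file *in,
                      const struct model_file *out)
{
        if (in->f_inode->i_sb != out->f_inode->i_sb)
                return -EXDEV;
        return 0;
}

/* Small self-check: two files on sb_a, one on sb_b. */
int model_demo(void)
{
        struct model_sb sb_a = { 1 }, sb_b = { 2 };
        struct model_inode ia = { &sb_a }, ib = { &sb_a }, ic = { &sb_b };
        struct model_file fa = { &ia }, fb = { &ib }, fc = { &ic };

        assert(model_clone_check(&fa, &fb) == 0);       /* same sb */
        assert(model_clone_check(&fa, &fc) == -EXDEV);  /* cross sb */
        return 0;
}
```

If the two exports really are separate xfs filesystems, the server-side files should land in the second case, which is why a successful CLONE looks so odd to me.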

Josef