Re: A Third perspective on BTRFS nfsd subvol dev/inode number issues.

"NeilBrown" <neilb@xxxxxxx> · Tue, 03 Aug 2021 08:36:44 +1000

On Tue, 03 Aug 2021, J. Bruce Fields wrote:
> On Tue, Aug 03, 2021 at 07:59:30AM +1000, NeilBrown wrote:
> > On Tue, 03 Aug 2021, J. Bruce Fields wrote:
> > > On Tue, Aug 03, 2021 at 07:10:44AM +1000, NeilBrown wrote:
> > > > On Mon, 02 Aug 2021, J. Bruce Fields wrote:
> > > > > On Mon, Aug 02, 2021 at 02:18:29PM +1000, NeilBrown wrote:
> > > > > > For btrfs, the "location" is root.objectid ++ file.objectid.  I think
> > > > > > the inode should become (file.objectid ^ swab64(root.objectid)).  This
> > > > > > will provide numbers that are unique until you get very large subvols,
> > > > > > and very many subvols.
> > > > > 
> > > > > If you snapshot a filesystem, I'd expect, at least by default, that
> > > > > inodes in the snapshot to stay the same as in the snapshotted
> > > > > filesystem.
> > > > 
> > > > As I said: we need to challenge and revise user-space (and meat-space)
> > > > expectations. 
> > > 
> > > The example that came to mind is people that export a snapshot, then
> > > replace it with an updated snapshot, and expect that to be transparent
> > > to clients.
> > > 
> > > Our client will error out with ESTALE if it notices an inode number
> > > changed out from under it.
> > 
> > Will it?
> 
> See fs/nfs/inode.c:nfs_check_inode_attributes():
> 
> 	if (nfsi->fileid != fattr->fileid) {
> 		/* Is this perhaps the mounted-on fileid? */
> 		if ((fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID) &&
> 		    nfsi->fileid == fattr->mounted_on_fileid)
> 			return 0;
> 		return -ESTALE;
> 	}

That code fires if the fileid (inode number) reported for a particular
filehandle changes.  I'm saying that won't happen.

If you reflink (aka snapshot) a btrfs subtree (aka "subvol"), then the
new subtree will ALREADY have different filehandles from the original
subvol.  Whether it has the same inode numbers or different ones is
irrelevant to NFS.
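
To make that concrete, here is a rough userspace sketch, not the actual
client or btrfs code.  Every toy_* name is made up for illustration, and
the 16-byte filehandle layout (root objectid followed by file objectid)
is only an assumption standing in for whatever the server really encodes.
It models a cached inode keyed by filehandle, plus the inode-number
formula proposed earlier in the thread:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_FH_MAX 64	/* illustrative size only, not the NFS limit */

/* Hypothetical stand-in for what the client caches per filehandle. */
struct toy_inode {
	unsigned char fh[TOY_FH_MAX];	/* opaque filehandle from the server */
	size_t fh_len;
	uint64_t fileid;		/* inode number recorded for that filehandle */
};

/* Reverse the byte order of a 64-bit value, as the kernel's swab64() does. */
static uint64_t toy_swab64(uint64_t x)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (x & 0xff);
		x >>= 8;
	}
	return r;
}

/* Inode number as proposed in the thread: file.objectid ^ swab64(root.objectid). */
static uint64_t toy_ino(uint64_t root_objectid, uint64_t file_objectid)
{
	return file_objectid ^ toy_swab64(root_objectid);
}

/* Build a toy filehandle: root objectid then file objectid (assumed layout). */
static void toy_fill(struct toy_inode *ino, uint64_t root, uint64_t file,
		     uint64_t fileid)
{
	assert(2 * sizeof(uint64_t) <= TOY_FH_MAX);
	memcpy(ino->fh, &root, sizeof(root));
	memcpy(ino->fh + sizeof(root), &file, sizeof(file));
	ino->fh_len = 2 * sizeof(uint64_t);
	ino->fileid = fileid;
}

/*
 * Returns 1 if fh names a different object than the cached inode (so the
 * fileid comparison never applies), 0 if the filehandle and fileid both
 * match, and -ESTALE if the same filehandle now reports another fileid.
 */
static int toy_check(const struct toy_inode *cached, const unsigned char *fh,
		     size_t fh_len, uint64_t fileid)
{
	if (fh_len != cached->fh_len || memcmp(fh, cached->fh, fh_len) != 0)
		return 1;
	return cached->fileid == fileid ? 0 : -ESTALE;
}
```

In this model a snapshot gets a new root objectid and therefore a new
filehandle, so toy_check() returns 1 whether the snapshot keeps the
original inode number or uses toy_ino().  -ESTALE only comes back when
the very same filehandle starts reporting a different fileid.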

(On reflection, I didn't say that as clearly as I could have last time.)

NeilBrown