Re: file handle in statx (was: Re: How to cope with subvolumes and snapshots on muti-user systems?)

On Tue, Dec 12, 2023 at 01:13:07PM +1100, NeilBrown wrote:
> On Tue, 12 Dec 2023, Kent Overstreet wrote:
> > On Tue, Dec 12, 2023 at 11:59:51AM +1100, NeilBrown wrote:
> > > On Tue, 12 Dec 2023, Kent Overstreet wrote:
> > > > NFSv4 specs that for the maximum size? That is pretty hefty...
> > > 
> > > It is - but it needs room to identify the filesystem and it needs to be
> > > stable across time.  That need is more than a local filesystem needs.
> > > 
> > > > NFSv2 allowed 32 bytes which is enough for a 16 byte filesys uuid, 8
> > > > byte inum and 8 byte generation num.  But only just.
> > > 
> > > NFSv3 allowed 64 bytes which was likely plenty for (nearly?) every
> > > situation.
> > > 
> > > NFSv4 doubled it again because .... who knows.  "why not" I guess.
> > > Linux nfsd typically uses 20 or 28 bytes plus whatever the filesystem
> > > wants. (28 when the export point is not the root of the filesystem).
> > > I suspect this always fits within an NFSv3 handle except when
> > > re-exporting an NFS filesystem.  NFS re-export is an interesting case...
> > 
> > Now I'm really curious - i_generation wasn't enough? Are we including
> > filesystem UUIDs?
> 
> i_generation was invented so that it could be inserted into the NFS
> filehandle.
> 
> The NFS filehandle is opaque.  It likely contains an inode number, a
> generation number, and a filesystem identifier.  But it is not possible
> to extract those from the handle.
> 
> > 
> > I suppose if we want to be able to round trip this stuff we do need to
> > allocate space for it, even if a local filesystem would never include
> > it.
> > 
> > > I suggest:
> > > 
> > >  STATX_ATTR_INUM_NOT_UNIQUE - it is possible that two files have the
> > >                               same inode number
> > > 
> > >  
> > >  __u64 stx_vol     Volume identifier.  Two files with same stx_vol and 
> > >                    stx_ino MUST be the same.  Exact meaning of volumes
> > >                    is filesys-specific
> > 
> > NFS reexport that you mentioned previously makes it seem like this
> > guarantee is impossible to provide in general (so I'd leave it out
> > entirely, it's just something for people to trip over).
> 
> NFS would not set stx_vol and would not return STATX_VOL in stx_mask.
> So it would not attempt to provide that guarantee.
> 
> Maybe we don't need to explicitly make this guarantee.
> 
> > 
> > But we definitely want stx_vol in there. Another thing that people ask
> > for is a way to ask "is this a subvolume root?" - we should make sure
> > that's clearly specified, or can we just include a bit for it?
> 
> The standard way to test for a filesystem root - or mount point at least -
> is to stat the directory in question and its parent (..) and see if they
> have the same st_dev or not.

It depends. If you want to figure out whether it's a different
filesystem or a different btrfs subvolume then yes, this generally
works because of changing device ids. But it doesn't work for
bind-mounts as they don't change device numbers. But maybe you and I are
using mount point differently here.
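
For reference, the parent comparison being discussed can be sketched in
a few lines. This is only an illustration, not anything from a kernel or
libc interface: the helper name device_changes_at is made up for this
example, and it uses nothing beyond os.stat().

```python
import os
import tempfile


def device_changes_at(path):
    """Return True if `path` and its parent ("..") report different st_dev.

    A True result suggests a filesystem boundary (or a btrfs subvolume
    boundary, since those report distinct device ids). A False result does
    NOT rule out a mount point: a bind mount within the same filesystem
    keeps the same st_dev on both sides.
    """
    here = os.stat(path)
    parent = os.stat(os.path.join(path, ".."))
    return here.st_dev != parent.st_dev


# Demonstration: a freshly created subdirectory lives on the same
# filesystem as its parent, so no device change is reported.
with tempfile.TemporaryDirectory() as tmp:
    sub = os.path.join(tmp, "sub")
    os.mkdir(sub)
    same_fs_result = device_changes_at(sub)

# "/.." resolves to "/" itself, so the root never differs from its parent.
root_result = device_changes_at("/")
```

Note that the check only ever answers "did the device id change here?",
which is why it cannot stand in for "is this a mount point?" in the
bind-mount case.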

> Applying the same logic to volumes means that a single stx_vol number is
> sufficient.

Yes, that would generally work.
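
A rough sketch of what that would look like from userspace, assuming a
volume identifier can be obtained per path. stx_vol is only a proposal in
this thread, so the lookup is passed in as a hypothetical callable
(get_vol) and the demonstration uses a made-up table of ids rather than
any real statx() call.

```python
import os


def is_volume_root(path, get_vol):
    """Return True if the volume id of `path` differs from its parent's.

    `get_vol` is a hypothetical callable mapping a path to a volume
    identifier (standing in for the proposed stx_vol field).
    """
    return get_vol(path) != get_vol(os.path.join(path, ".."))


# Made-up identifiers purely for illustration: "/data/snap" is the root of
# volume 257, its parent belongs to volume 5.
fake_vols = {
    "/data/snap": 257,
    "/data/snap/..": 5,
    "/data/snap/dir": 257,
    "/data/snap/dir/..": 257,
}

root_result = is_volume_root("/data/snap", fake_vols.__getitem__)
inner_result = is_volume_root("/data/snap/dir", fake_vols.__getitem__)
```

With a single number per file the caller has to do two lookups and a
comparison; a dedicated attribute bit would answer the same question in
one call.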

> 
> I'm not strongly against a STATX_ATTR_VOL_ROOT flag provided everyone
> agrees what it means and that we cannot imagine any awkward corner-cases
> (like a 'root' being different from a 'mount point').

I feel like you might have missed my previous mails where I strongly
argued for the addition of STATX_ATTR_SUBVOLUME_ROOT:

https://lore.kernel.org/linux-btrfs/20231108-herleiten-bezwangen-ffb2821f539e@brauner

The concept of a subvolume root and of a mount point should be kept
separate. Christoph tried mapping subvolumes to vfsmounts, something
that I (and Al) vehemently oppose for various reasons outlined in that
and other long threads.

I still think that we should start with exposing subvolume id first.



