Re: file handle in statx (was: Re: How to cope with subvolumes and snapshots on muti-user systems?)


 



On Tue, 12 Dec 2023, Kent Overstreet wrote:
> On Tue, Dec 12, 2023 at 11:59:51AM +1100, NeilBrown wrote:
> > On Tue, 12 Dec 2023, Kent Overstreet wrote:
> > > NFSv4 specs that for the maximum size? That is pretty hefty...
> > 
> > It is - but it needs room to identify the filesystem and it needs to be
> > stable across time.  That need is more than a local filesystem needs.
> > 
> > NFSv2 allowed 32 bytes which is enough for a 16 byte filesys uuid, 8
> > byte inum and 8byte generation num.  But only just.
> > 
> > NFSv3 allowed 64 bytes which was likely plenty for (nearly?) every
> > situation.
> > 
> > NFSv4 doubled it again because .... who knows.  "why not" I guess.
> > Linux nfsd typically uses 20 or 28 bytes plus whatever the filesystem
> > wants. (28 when the export point is not the root of the filesystem).
> > I suspect this always fits within an NFSv3 handle except when
> > re-exporting an NFS filesystem.  NFS re-export is an interesting case...
> 
> Now I'm really curious - i_generation wasn't enough? Are we including
> filesystem UUIDs?

i_generation was invented so that it could be inserted into the NFS
filehandle.

The NFS filehandle is opaque.  It likely contains an inode number, a
generation number, and a filesystem identifier.  But it is not possible
to extract those from the handle.
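
To illustrate the size arithmetic and the opacity, here is a rough sketch
in C.  The payload layout and all names in it (example_fh_payload,
opaque_fh, example_encode, opaque_fh_equal) are made up for illustration
and are not the layout that nfsd or any filesystem actually uses.  It only
shows that a 16 byte uuid, an 8 byte inum and an 8 byte generation fill
32 bytes exactly, and that a holder of the handle can do nothing with it
except compare bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical server-private payload; real servers choose their own. */
struct example_fh_payload {
    uint8_t  fs_uuid[16];   /* filesystem identifier */
    uint64_t inum;          /* inode number */
    uint64_t generation;    /* bumped when the inode number is reused */
};

/* 16 + 8 + 8 = 32 bytes: exactly the NFSv2 limit, "but only just". */
_Static_assert(sizeof(struct example_fh_payload) == 32,
               "payload must fit a 32 byte handle");

/* What a client holds: a length and uninterpreted bytes. */
struct opaque_fh {
    uint8_t len;
    uint8_t data[128];
};

/* Server side: copy the private payload into the opaque handle. */
static void example_encode(struct opaque_fh *fh,
                           const struct example_fh_payload *p)
{
    memset(fh, 0, sizeof(*fh));
    fh->len = (uint8_t)sizeof(*p);
    memcpy(fh->data, p, sizeof(*p));
}

/* Client side: equality is the only operation that makes sense. */
static int opaque_fh_equal(const struct opaque_fh *a,
                           const struct opaque_fh *b)
{
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

Two handles built from the same payload compare equal; changing only the
generation gives a different handle, which is what stops a stale handle
from matching a new file that reuses the inode number.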

> 
> I suppose if we want to be able to round trip this stuff we do need to
> allocate space for it, even if a local filesystem would never include
> it.
> 
> > I suggest:
> > 
> >  STATX_ATTR_INUM_NOT_UNIQUE - it is possible that two files have the
> >                               same inode number
> > 
> >  
> >  __u64 stx_vol     Volume identifier.  Two files with same stx_vol and 
> >                    stx_ino MUST be the same.  Exact meaning of volumes
> >                    is filesys-specific
> 
> NFS reexport that you mentioned previously makes it seem like this
> guarantee is impossible to provide in general (so I'd leave it out
> entirely, it's just something for people to trip over).

NFS would not set stx_vol and would not return STATX_VOL in stx_mask.
So it would not attempt to provide that guarantee.

Maybe we don't need to explicitly make this guarantee.
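
For reference, the stx_mask convention this relies on looks roughly like
the sketch below.  STATX_VOL is only the name proposed in this thread, so
the usage note uses existing bits (STATX_INO, STATX_BTIME) as stand-ins;
the helper name statx_field_valid is made up.

```c
#include <linux/stat.h>

/*
 * Returns nonzero only if every requested bit is set in stx_mask,
 * i.e. the filesystem actually filled in the corresponding fields.
 * A field whose bit is missing must not be trusted.
 */
static int statx_field_valid(const struct statx *stx, unsigned int bits)
{
    return (stx->stx_mask & bits) == bits;
}
```

A caller would run statx_field_valid(&stx, STATX_BTIME) before reading
stx_btime, and would do the same with the proposed STATX_VOL bit before
relying on stx_vol.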

> 
> But we definitely want stx_vol in there. Another thing that people ask
> for is a way to ask "is this a subvolume root?" - we should make sure
> that's clearly specified, or can we just include a bit for it?

The standard way to test for a filesystem root - or mount point at least -
is to stat the directory in question and its parent (..) and see if they
have the same st_dev or not.
Applying the same logic to volumes means that a single stx_vol number is
sufficient.
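
A minimal sketch of that st_dev comparison in C, using plain stat().  The
function name is_mount_point is made up, and the extra st_ino check for
"/" (which is its own parent) is my addition.  Known limitation: a bind
mount within the same filesystem keeps the same st_dev and is not
detected.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if path looks like a mount point, 0 if not, -1 on error. */
static int is_mount_point(const char *path)
{
    char parent[4096];
    struct stat st, pst;
    int n;

    /* Build "<path>/.." to name the parent directory. */
    n = snprintf(parent, sizeof(parent), "%s/..", path);
    if (n < 0 || (size_t)n >= sizeof(parent))
        return -1;

    if (stat(path, &st) != 0 || stat(parent, &pst) != 0)
        return -1;

    /* Different device numbers: path sits on another filesystem. */
    if (st.st_dev != pst.st_dev)
        return 1;

    /* Same device and same inode: the directory is its own parent ("/"). */
    if (st.st_ino == pst.st_ino)
        return 1;

    return 0;
}
```

With a stx_vol field the same comparison would be made on stx_vol instead
of st_dev, assuming the field ends up with the semantics proposed above.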

I'm not strongly against a STATX_ATTR_VOL_ROOT flag, providing everyone
agrees what it means and that we cannot imagine any awkward corner-cases
(like a 'root' being different from a 'mount point').

> 
> >  STATX_VOL         Want stx_vol
> > 
> >   __u8 stx_handle_len  Length of stx_handle if present
> >   __u8 stx_handle[128] Unique stable identifier for this file.  Will
> >                        NEVER be reused for a different file.
> >                        This appears AFTER __statx_pad2, beyond
> >                        the current 'struct statx'.
> >  STATX_HANDLE      Want stx_handle_len and stx_handle. Buffer for
> >                    receiving statx info has at least
> >                    sizeof(struct statx)+128 bytes.
> > 
> > I think both the handle and the vol can be useful.
> > NFS can provide stx_handle but not stx_vol.  It is the thing
> > to use for equality testing, but it is only needed if
> > STATX_ATTR_INUM_NOT_UNIQUE is set.
> > stx_vol is useful for "du -x" or maybe "du --one-volume" or similar.
> > 
> > 
> > Note that we *could* add stx_vol to NFSv4.2.  It is designed for
> > incremental extension.  I suspect we wouldn't want to rush into this,
> > but to wait to see if different volume-capable filesystems have other
> > details of volumes that are common and can usefully be exported by statx
> 
> Sounds reasonable
> 

Thanks,
NeilBrown



