Re: [LSF TOPIC] statx extensions for subvol/snapshot filesystems & more

Josef Bacik <josef@xxxxxxxxxxxxxx> · Wed, 21 Feb 2024 16:08:11 -0500

On Wed, Feb 21, 2024 at 04:06:34PM +0100, Miklos Szeredi wrote:
> On Wed, 21 Feb 2024 at 01:51, Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
> >
> > Recently we had a pretty long discussion on statx extensions, which
> > eventually got a bit offtopic but nevertheless hashed out all the major
> > issues.
> >
> > To summarize:
> >  - guaranteeing inode number uniqueness is becoming increasingly
> >    infeasible, we need a bit to tell userspace "inode number is not
> >    unique, use filehandle instead"
> 
> This is a tough one.   POSIX says "The st_ino and st_dev fields taken
> together uniquely identify the file within the system."
> 

Which is what btrfs has done forever, and we've gotten yelled at forever for
doing it.  We have a compromise and a way forward, but it's not a widely held
view that changing st_dev to give uniqueness is an acceptable solution.  It may
have been for overlayfs because you guys are already doing something special,
but it's not an option that is afforded the rest of us.
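
For reference, the POSIX rule quoted above boils down to comparing the
(st_dev, st_ino) pair.  A minimal sketch of that check follows; the helper name
same_file() is hypothetical and only there for illustration.  It compares the
struct fields directly, so nothing gets narrowed to a smaller integer type.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if both paths resolve to the same file
 * according to the POSIX (st_dev, st_ino) rule, 0 if they differ, and -1 if
 * either stat() call fails.
 */
static int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;

    /* Compare the fields in their native dev_t/ino_t widths. */
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

This check only gives a correct answer if the file system keeps the pair
unique, which is exactly the property under discussion here.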

> Adding a bit that says "from now the above POSIX rule is invalid"
> doesn't instantly fix all the existing applications that rely on it.
> 
> Linux did manage to extend st_ino from 32 to 64 bits, but even in that
> case it's not clear how many instances of
> 
>     stat(path1, &st);
>     unsigned int ino = st.st_ino;
>     stat(path2, &st);
>     if (ino == st.st_ino)
>         ...
> 
> are waiting to blow up one fine day.  Of course the code should have
> used ino_t, but I think this pattern is not that uncommon.
> 
> All in all, I don't think adding a flag to statx is the right answer.
> It entitles filesystem developers to be sloppy about st_ino
> uniqueness, which is not a good idea.   I think what overlayfs is
> doing (see documentation) is generally the right direction.  It makes
> various compromises but not to uniqueness, and we haven't had
> complaints (fingers crossed).

Again, you haven't; I have, consistently and constantly, for a decade.

> Nudging userspace developers to use file handles would also be good,
> but they should do so unconditionally, not based on a flag that has no
> well defined meaning.

I think that's what we're trying to do: define it properly.  We now have 2 file
systems in tree that have this sort of behavior.  It's not a new or crazy thing
(well, I suppose it is when you consider the lifetime of file systems).  Having a
way for user space developers that care to properly identify that they've
wandered across a subvolume boundary could be useful.
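
To make the file handle alternative concrete, here is a rough sketch of
comparing two paths by handle with name_to_handle_at(2) instead of by inode
number.  The helper names get_handle() and same_handle() are hypothetical, and
the sketch assumes both paths live on the same file system, since handles are
only meaningful within one file system.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: fetch the file handle for path.  The caller frees the
 * result.  Returns NULL on failure, for example when the file system cannot
 * export handles (EOPNOTSUPP).
 */
static struct file_handle *get_handle(const char *path, int *mount_id)
{
    struct file_handle *fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);

    if (!fh)
        return NULL;
    fh->handle_bytes = MAX_HANDLE_SZ;
    if (name_to_handle_at(AT_FDCWD, path, fh, mount_id, 0) != 0) {
        free(fh);
        return NULL;
    }
    return fh;
}

/*
 * Hypothetical helper: 1 if both paths yield identical handles, 0 if the
 * handles differ, -1 if a handle could not be obtained for either path.
 */
static int same_handle(const char *a, const char *b)
{
    int mnt_a, mnt_b;
    struct file_handle *ha = get_handle(a, &mnt_a);
    struct file_handle *hb = get_handle(b, &mnt_b);
    int ret = -1;

    if (ha && hb)
        ret = ha->handle_type == hb->handle_type &&
              ha->handle_bytes == hb->handle_bytes &&
              memcmp(ha->f_handle, hb->f_handle, ha->handle_bytes) == 0;

    free(ha);
    free(hb);
    return ret;
}
```

A caller has to treat -1 as "don't know" and fall back to something else, since
not every file system supports handles.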

As for the proposal itself, we talk about this every year.  We're all more or
less on board with the idea; the code just needs to be written.  Write the code
and post the patches.  I assume there won't be much pushback, and it could
probably even get into Christian's tree in some branch or another before LSF.
Thanks,

Josef