Re: file handle in statx (was: Re: How to cope with subvolumes and snapshots on muti-user systems?)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 13, 2023 at 07:48:16AM +1100, Dave Chinner wrote:
> On Tue, Dec 12, 2023 at 10:21:53AM -0500, Kent Overstreet wrote:
> > On Tue, Dec 12, 2023 at 04:53:28PM +1100, Dave Chinner wrote:
> > > Doesn't anyone else see or hear the elephant trumpeting loudly in
> > > the middle of the room?
> > > 
> > > I mean, we already have name_to_handle_at() for userspace to get a
> > > unique, opaque, filesystem defined file handle for any given file.
> > > It's the same filehandle that filesystems hand to the nfsd so nfs
> > > clients can uniquely identify the file they are asking the nfsd to
> > > operate on.
> > > 
> > > The contents of these filehandles is entirely defined by the file
> > > system and completely opaque to the user. The only thing that
> > > parses the internal contents of the handle is the filesystem itself.
> > > Therefore, as long as the fs encodes the information it needs into the
> > > handle to determine what subvol/snapshot the inode belongs to when
> > > the handle is passed back to it (e.g. from open_by_handle_at()) then
> > > nothing else needs to care how it is encoded.
> > > 
> > > So can someone please explain to me why we need to try to re-invent
> > > a generic filehandle concept in statx when we already have a
> > > have working and widely supported user API that provides exactly
> > > this functionality?
> > 
> > Definitely should be part of the discussion :)
> > 
> > But I think it _does_ need to be in statx; because:
> >  - we've determined that 64 bit ino_t just isn't a future proof
> >    interface, we're having real problems with it today
> >  - statx is _the_ standard, future proofed interface for getting inode
> >    attributes
> 
> No, it most definitely isn't, and statx was never intended as a
> dumping ground for anything and everything inode related. e.g. Any
> inode attribute that can be modified needs to use a different
> interface - one that has a corresponding "set" operation.

And here I thought the whole point of statx was to be an extensible way
to request any sort of inode attribute.

> >  - therefore, if we want userspace programmers to be using filehandles,
> >    instead of inode numbers, so there code isn't broken, we need to be
> >    providing interfaces that guide them in that direction.
> 
> We already have a filehandle interface they can use for this
> purpose. It is already used by some userspace applications for this
> purpose.
> 
> Anything new API function do with statx() will require application
> changes, and the vast majority of applications aren't using statx()
> directly - they are using stat() which glibc wraps to statx()
> internally. So they are going to need a change of API, anyway.
> 
> So, fundamentally, there is a change of API for most applications
> that need to do thorough inode uniqueness checks regardless of
> anything else. They can do this right now - just continue using
> stat() as they do right now, and then use name_to_filehandle_at()
> for uniqueness checks.
> 
> > Even assuming we can update all the documentation to say "filehandles
> > are the correct way to test inode uniqueness", you know at least half of
> > programmers will stick to stx_ino instead of the filehandle if the
> > filehandle is an extra syscall.
> 
> Your argument is "programmers suck so we must design for the
> lowest common denominator". That's an -awful- way to design APIs.

No, I'm saying if the old way doing things no longer works, we ought to
make the new future proofed way as ergonomic and easy to use as the old
way was - else it won't get used.

At the _very_ least we need to add a flag to statx for "inode number is
unreliable for uniqueness checks".

bcachefs could leave this off until the first snapshot has been taken.

But even with that option, I think we ought to be telling anyone doing
uniqueness checks to use the filehandle, because it includes i_generation.

> Further, this "programmers suck" design comes at a cost to every
> statx() call that does not need filehandles. That's the vast
> majority of statx() calls that are made on a system. Why should we
> slow down statx() for all users when so few applications actually
> need uniqueness and they can take the cost of robust uniqueness
> tests with an extra syscall entirely themselves?

For any local filesystem the filehandle is going to be the inode
generation number, the subvolume ID (if applicable), and the inode
number. That's 16 bytes on bcachefs, and if we make an attempt to
standardize how this works across filesystems we should be able to do it
without adding a new indirect function call to the statx() path. That
sounds pretty negligable to me.

The syscall overhead isn't going to be negligable when an application
has to do lots of scanning to find the files its interested - rsync.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux