Re: Testing if two open descriptors refer to the same inode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024-07-29, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
> On Mon, Jul 29, 2024 at 12:57 PM Florian Weimer <fweimer@xxxxxxxxxx> wrote:
> >
> > * Mateusz Guzik:
> >
> > > On Mon, Jul 29, 2024 at 12:40:35PM +0200, Florian Weimer wrote:
> > >> * Mateusz Guzik:
> > >>
> > >> > On Mon, Jul 29, 2024 at 08:55:46AM +0200, Florian Weimer wrote:
> > >> >> It was pointed out to me that inode numbers on Linux are no longer
> > >> >> expected to be unique per file system, even for local file systems.
> > >> >
> > >> > I don't know if I'm parsing this correctly.
> > >> >
> > >> > Are you claiming on-disk inode numbers are not guaranteed unique per
> > >> > filesystem? It sounds like utter breakage, with capital 'f'.
> > >>
> > >> Yes, POSIX semantics and traditional Linux semantics for POSIX-like
> > >> local file systems are different.
> > >
> > > Can you link me some threads about this?
> >
> > Sorry, it was an internal thread.  It's supposed to be common knowledge
> > among Linux file system developers.  Aleksa referenced LSF/MM
> > discussions.
> >
> 
> So much for open development :-P
> 
> > > I had this in mind (untested modulo compilation):
> > >
> > > diff --git a/fs/fcntl.c b/fs/fcntl.c
> > > index 300e5d9ad913..5723c3e82eac 100644
> > > --- a/fs/fcntl.c
> > > +++ b/fs/fcntl.c
> > > @@ -343,6 +343,13 @@ static long f_dupfd_query(int fd, struct file *filp)
> > >       return f.file == filp;
> > >  }
> > >
> > > +static long f_dupfd_query_inode(int fd, struct file *filp)
> > > +{
> > > +     CLASS(fd_raw, f)(fd);
> > > +
> > > +     return f.file->f_inode == filp->f_inode;
> > > +}
> > > +
> > >  static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
> > >               struct file *filp)
> > >  {
> > > @@ -361,6 +368,9 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
> > >       case F_DUPFD_QUERY:
> > >               err = f_dupfd_query(argi, filp);
> > >               break;
> > > +     case F_DUPFD_QUERY_INODE:
> > > +             err = f_dupfd_query_inode(argi, filp);
> > > +             break;
> > >       case F_GETFD:
> > >               err = get_close_on_exec(fd) ? FD_CLOEXEC : 0;
> > >               break;
> > > diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> > > index c0bcc185fa48..2e93dbdd8fd2 100644
> > > --- a/include/uapi/linux/fcntl.h
> > > +++ b/include/uapi/linux/fcntl.h
> > > @@ -16,6 +16,8 @@
> > >
> > >  #define F_DUPFD_QUERY        (F_LINUX_SPECIFIC_BASE + 3)
> > >
> > > +#define F_DUPFD_QUERY_INODE (F_LINUX_SPECIFIC_BASE + 4)
> > > +
> > >  /*
> > >   * Cancel a blocking posix lock; internal use only until we expose an
> > >   * asynchronous lock api to userspace:
> >
> > It's certainly much easier to use than name_to_handle_at, so it looks
> > like a useful option to have.
> >
> > Could we return a three-way comparison result for sorting?  Or would
> > that expose too much about kernel pointer values?
> >
> 
> As is this would sort by inode *address* which I don't believe is of
> any use -- the order has to be assumed arbitrary.
> 
> Perhaps there is something which is reliably the same and can be
> combined with something else to be unique system-wide (the magic
> handle thing?).
> 
> But even then you would need to justify trying to sort by fcntl calls,
> which sounds pretty dodgey to me.

Programs need to key things by (dev, ino) currently, so you need to be
able to get some kind of ordinal that you can sort with.

If we really want to make an interface to let you do this without
exposing hashes in statx, then kcmp(2) makes more sense, but having to
keep a file descriptor for each entry in a hashtable would obviously
cause -EMFILE issues.

> Given that thing I *suspect* statx() may want to get extended with
> some guaranteed unique identifier. Then you can sort in userspace all
> you want.

Yeah, this is what the hashed fhandle patch I have does.

> Based on your opening mail I assumed you only need to check 2 files,
> for which the proposed fcntl does the trick.
> 
> Or to put it differently: there seems to be more to the picture than
> in the opening mail, so perhaps you could outline what you are looking
> for.

Hardlink detection requires creating a hashmap of (dev, ino) to find
hardlinks. Pair-wise checking is not sufficient for that usecase (which
AFAIK is the most common thing people use inode numbers for -- it's at
least probably the most common thing people run in practice since
archive tools do this.)

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux