Re: A Third perspective on BTRFS nfsd subvol dev/inode number issues.

On Mon, 02 Aug 2021, Martin Steigerwald wrote:
> Hi Neil!
> 
> Wow, this is a bit overwhelming for me. However, I got a very specific 
> question for userspace developers in order to probably provide valuable 
> input to the KDE Baloo desktop search developers:
> 
> NeilBrown - 02.08.21, 06:18:29 CEST:
> > The "obvious" choice for a replacement is the file handle provided by
> > name_to_handle_at() (falling back to st_ino if name_to_handle_at isn't
> > supported by the filesystem).  This returns an extensible opaque
> > byte-array.  It is *already* more reliable than st_ino.  Comparing
> > st_ino is only a reliable way to check if two files are the same if
> > you have both of them open.  If you don't, then one of the files
> > might have been deleted and the inode number reused for the other.  A
> > filehandle contains a generation number which protects against this.
> > 
> > So I think we need to strongly encourage user-space to start using
> > name_to_handle_at() whenever there is a need to test if two things are
> > the same.
> 
> How could that work for Baloo's use case to see whether a file it 
> encounters is already in its database or whether it is a new file.
> 
> Would Baloo compare the whole file handle or just certain fields or make a 
> hash of the filehandle or what ever? Could you, in pseudo code or 
> something, describe the approach you'd suggest. I'd then share it on:

Yes, the whole filehandle.

 struct file_handle {
         unsigned int  handle_bytes;  /* Size of f_handle [in, out] */
         int           handle_type;   /* Handle type [out] */
         unsigned char f_handle[0];   /* File identifier (sized by
                                         caller) [out] */
 };

i.e. compare handle_type, handle_bytes, and handle_bytes worth of
f_handle.

This file_handle is local to the filesystem.  Two different filesystems
can use the same filehandle for different files.  So the identity of the
filesystem needs to be combined with the file_handle.
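
As a rough sketch of that comparison (not a tested recipe for Baloo), something
like the following could work.  The helper names get_handle() and same_handle()
are made up for illustration.  It assumes glibc with _GNU_SOURCE, and that a
buffer of MAX_HANDLE_SZ bytes is large enough for the filesystem's handle.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fetch the file handle for 'path'.
 * Returns a malloc()ed handle the caller must free(), or NULL on
 * failure (e.g. the filesystem does not support file handles). */
static struct file_handle *get_handle(const char *path)
{
        struct file_handle *fh;
        int mount_id;   /* filled in by the kernel, ignored here */

        fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
        if (!fh)
                return NULL;
        fh->handle_bytes = MAX_HANDLE_SZ;   /* size of f_handle [in] */
        if (name_to_handle_at(AT_FDCWD, path, fh, &mount_id, 0) < 0) {
                free(fh);
                return NULL;
        }
        return fh;
}

/* Hypothetical helper: compare the whole handle, i.e. handle_type,
 * handle_bytes, and handle_bytes worth of f_handle. */
static int same_handle(const struct file_handle *a,
                       const struct file_handle *b)
{
        return a->handle_type == b->handle_type &&
               a->handle_bytes == b->handle_bytes &&
               memcmp(a->f_handle, b->f_handle, a->handle_bytes) == 0;
}
```

When get_handle() returns NULL the caller would fall back to st_ino.  A
database could store the same three fields as a key, together with whatever
is used to identify the filesystem.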

> 
> Bug 438434 - Baloo appears to be indexing twice the number of files than 
> are actually in my home directory
> 
> https://bugs.kde.org/438434

This bug wouldn't be addressed by using the filehandle.  Using a
filehandle allows you to compare two files within a single filesystem.
This bug is about comparing two filesystems either side of a reboot, to
see if they are the same.

As has already been mentioned in that bug, statfs().f_fsid is the best
solution (unless comparing the mount point is satisfactory).
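
For illustration only, reading and comparing f_fsid might look like the sketch
below.  The helper name same_fsid() is made up, and it assumes the filesystem
in question reports a meaningful f_fsid.  To compare either side of a reboot,
the application would store the f_fsid bytes rather than compare two live
paths as done here.

```c
#include <string.h>
#include <sys/vfs.h>

/* Hypothetical helper: returns 1 if both paths report the same
 * f_fsid, 0 if they differ, and -1 if statfs() fails on either. */
static int same_fsid(const char *path_a, const char *path_b)
{
        struct statfs a, b;

        if (statfs(path_a, &a) < 0 || statfs(path_b, &b) < 0)
                return -1;
        /* fsid_t is an opaque struct, so compare its raw bytes. */
        return memcmp(&a.f_fsid, &b.f_fsid, sizeof(a.f_fsid)) == 0;
}
```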

NeilBrown


