On Thu, 2024-10-17 at 13:59 -0400, Jeff Layton wrote:
> On Thu, 2024-10-17 at 17:09 +0000, Trond Myklebust wrote:
> > On Thu, 2024-10-17 at 13:05 -0400, Jeff Layton wrote:
> > > On Thu, 2024-10-17 at 11:15 -0400, Paul Moore wrote:
> > > > On Thu, Oct 17, 2024 at 10:58 AM Christoph Hellwig
> > > > <hch@xxxxxxxxxxxxx> wrote:
> > > > > On Thu, Oct 17, 2024 at 10:54:12AM -0400, Paul Moore wrote:
> > > > > > Okay, good to know, but I was hoping that we could come up
> > > > > > with an explicit list of filesystems that maintain their
> > > > > > own private inode numbers outside of inode->i_ino.
> > > > >
> > > > > Anything using iget5_locked is a good start. Add to that
> > > > > filesystems implementing their own inode cache (at least
> > > > > xfs and bcachefs).
> > > >
> > > > Also good to know, thanks. However, at this point the lack of
> > > > a clear answer is making me wonder a bit more about inode
> > > > numbers in the view of VFS developers; do you folks care about
> > > > inode numbers? I'm not asking to start an argument, it's a
> > > > genuine question so I can get a better understanding of the
> > > > durability and sustainability of inode->i_ino. If all of you
> > > > (the VFS folks) aren't concerned about inode numbers, I
> > > > suspect we are going to have similar issues in the future, and
> > > > we (the LSM folks) likely need to move away from reporting
> > > > inode numbers, as they aren't reliably maintained by the VFS
> > > > layer.
> > >
> > > Like Christoph said, the kernel doesn't care much about inode
> > > numbers.
> > >
> > > People care about them, though, and sometimes we have things in
> > > the kernel that report them in some fashion (tracepoints,
> > > procfiles, audit events, etc.).
> > > Having those match what the userland stat() st_ino field tells
> > > you is ideal, and for the most part that's the way it works.
> > >
> > > The main exception is when people use 32-bit interfaces
> > > (somewhat rare these days), or they have a 32-bit kernel with a
> > > filesystem that has a 64-bit inode number space (NFS being one
> > > of those). The NFS client has basically hacked around this for
> > > years by tracking its own fileid field in its inode. That's
> > > really a waste, though. That could be converted over to use
> > > i_ino instead if it were always wide enough.
> > >
> > > It'd be better to stop with these sorts of hacks and just fix
> > > this the right way once and for all, by making i_ino 64 bits
> > > everywhere.
> >
> > Nope.
> >
> > That won't fix glibc, which is the main problem NFS has to work
> > around.
>
> True, but that's really a separate problem.

Currently, the problem where the kernel needs to use one inode number
in iget5() and a different one when replying to stat() is limited to
the set of 64-bit kernels that can operate in 32-bit userland
compatibility mode. So mainly x86_64 kernels that are set up to run
in i386 userland compatibility mode.

If you now decree that all kernels will use 64-bit inode numbers
internally, then you've suddenly expanded the problem to encompass
all the remaining 32-bit kernels. In order to avoid stat() returning
EOVERFLOW to applications, they too will have to start generating
separate 32-bit inode numbers.

> It also doesn't inform how we track inode numbers inside the
> kernel. Inode numbers have been 64 bits for years on "real"
> filesystems. If we were designing this today, i_ino would be a u64,
> and we'd only hash that down to 32 bits when necessary.

"I'm doing a (free) operating system (just a hobby, won't be big and
professional like gnu) for 386(486) AT clones."

History is a bitch...
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx