file handle in statx (was: Re: How to cope with subvolumes and snapshots on muti-user systems?)

On Tue, Dec 12, 2023 at 09:43:27AM +1100, NeilBrown wrote:
> On Sat, 09 Dec 2023, Kent Overstreet wrote:
> > On Fri, Dec 08, 2023 at 12:34:28PM +0100, Donald Buczek wrote:
> > > On 12/8/23 03:49, Kent Overstreet wrote:
> > > 
> > > > We really only need 6 or 7 bits out of the inode number for sharding;
> > > > then 20-32 bits (nobody's going to have a billion snapshots; a million
> > > > is a more reasonable upper bound) for the subvolume ID leaves 30 to 40
> > > > bits for actually allocating inodes out of.
> > > > 
> > > > That'll be enough for the vast, vast majority of users, but exceeding
> > > > that limit is already something we're technically capable of: we're
> > > > currently seeing filesystems well over 100 TB, petabyte range expected
> > > > as fsck gets more optimized and online fsck comes.
> > > 
> > > 30 bits would not be enough even today:
> > > 
> > > buczek@done:~$ df -i /amd/done/C/C8024
> > > Filesystem         Inodes     IUsed      IFree IUse% Mounted on
> > > /dev/md0       2187890304 618857441 1569032863   29% /amd/done/C/C8024
> > > 
> > > So that's 32 bit on a random production system ( 618857441 == 0x24e303e1 ).
> 
> only 30 bits though.  So it is a long way before you use all 32 bits.
> How many volumes do you have?
> 
> > > 
> > > And if the idea to produce unique inode numbers by hashing the filehandle into 64 is followed, collisions definitely need to be addressed. With 618857441 objects, the probability of a hash collision with 64 bit is already over 1% [1].
> > 
> > Oof, thanks for the data point. Yeah, 64 bits is clearly not enough for
> > a unique identifier; time to start looking at how to extend statx.
> > 
> 
> 64 should be plenty...
> 
> If you have 32 bits for free allocation, and 7 bits for sharding across
> 128 CPUs, then you can allocate many more than 4 billion inodes.  Maybe
> not the full 500 billion for 39 bits, but if you actually spread the
> load over all the shards, then certainly tens of billions.
> 
> If you use 22 bits for volume number and 42 bits for inodes in a volume,
> then you can spend 7 on sharding and still have room for 55 of Donald's
> filesystems to be allocated by each CPU.
> 
> And if Donald only needs thousands of volumes, not millions, then he
> could configure for a whole lot more headroom.
> 
> In fact, if you use the 64 bits of vfs_inode number by filling in bits from
> the fs-inode number from one end, and bits from the volume number from
> the other end, then you don't need to pre-configure how the 64 bits are
> shared.
> You record inum-bits and volnum bits in the filesystem metadata, and
> increase either as needed.  Once the sum hits 64, you start returning
> ENOSPC for new files or new volumes.
> 
> There will come a day when 64 bits is not enough for inodes in a single
> filesystem.  Today is not that day.
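
[Illustration, not part of the original message.] The two-ended layout
described in the quoted text can be sketched roughly as below. This is a
minimal sketch under stated assumptions, not any filesystem's actual
implementation: the names InoLayout, pack, unpack and rev64 are
hypothetical, and bit-reversing the volume number is one assumed way to let
it grow downward from bit 63 without changing numbers already handed out.

```python
import errno

TOTAL_BITS = 64  # width of the VFS inode number


def rev64(x):
    """Reverse the bit order of a 64-bit value (bit 0 <-> bit 63)."""
    return int(format(x, "064b")[::-1], 2)


class InoLayout:
    """Hypothetical layout: fs-inode bits fill from the low end, volume bits
    (bit-reversed, an assumption) fill from the high end."""

    def __init__(self):
        # Stand-ins for the inum-bits / volnum-bits recorded in fs metadata.
        self.inum_bits = 0
        self.volnum_bits = 0

    def pack(self, volnum, inum):
        # Grow either width on demand; fail once the two fields would collide.
        inum_bits = max(self.inum_bits, inum.bit_length())
        volnum_bits = max(self.volnum_bits, volnum.bit_length())
        if inum_bits + volnum_bits > TOTAL_BITS:
            raise OSError(errno.ENOSPC, "64-bit inode number space exhausted")
        self.inum_bits, self.volnum_bits = inum_bits, volnum_bits
        return rev64(volnum) | inum

    def unpack(self, ino):
        # The current widths are always >= the widths in use when ino was issued.
        inum_mask = (1 << self.inum_bits) - 1
        return rev64(ino & ~inum_mask), ino & inum_mask


layout = InoLayout()
first = layout.pack(volnum=1, inum=5)
second = layout.pack(volnum=3, inum=1000)  # both widths grow here
print(hex(first), layout.unpack(first), layout.unpack(second))
```

Numbers issued before a width grows stay valid because neither field is
ever shifted; the cost is that the split between volumes and inodes is
only discovered when pack() finally returns ENOSPC.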

Except filesystems are growing all the time: that leaves almost no room
for growth, and then we're back in the world where users have to guess how
many inodes they are going to need in their filesystem. And if we put
this off now, we're just kicking the can down the road until it
becomes really pressing and urgent to solve.

No, we need to come up with something better.
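
[Illustration, not part of the original message.] The figures traded in the
quoted text can be checked with a few lines of arithmetic. This is a rough
sketch: the collision estimate uses the standard birthday approximation and
assumes an ideal, uniformly distributed 64-bit hash, and the variable names
are made up for the example.

```python
import math

inodes = 618_857_441  # IUsed from the df -i output quoted above

# Bits needed to represent the highest inode count seen so far.
bits_needed = inodes.bit_length()

# Birthday approximation: chance that at least two of `inodes` objects
# collide when hashed uniformly into 64 bits.
pairs = inodes * (inodes - 1) / 2
p_collision = -math.expm1(-pairs / 2**64)

# 22 bits of volume number leave 42 bits of inode number; spending 7 of
# those on sharding leaves 35 bits for each shard to allocate from.
per_shard = 2 ** (42 - 7)
filesystems_per_shard = per_shard // inodes

print(bits_needed, f"{p_collision:.2%}", filesystems_per_shard)
```

Under those assumptions the count fits in 30 bits, the collision
probability comes out at roughly 1%, and each shard has room for 55
filesystems of that size.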

I was chatting a bit with David Howells on IRC about this, and floated
the idea of adding the file handle to statx. It looks like there's enough
space reserved to make this feasible - probably going with a fixed maximum
size of 128-256 bits.

Thoughts?