On Sat, Dec 21, 2019 at 10:43:05AM +0200, Amir Goldstein wrote:
> On Fri, Dec 20, 2019 at 11:33 PM Darrick J. Wong
> <darrick.wong@xxxxxxxxxx> wrote:
> >
> > On Fri, Dec 20, 2019 at 02:49:36AM +0000, Chris Down wrote:
> > > In Facebook production we are seeing heavy inode number wraparounds on
> > > tmpfs. On affected tiers, in excess of 10% of hosts show multiple files
> > > with different content and the same inode number, with some servers even
> > > having as many as 150 duplicated inode numbers with differing file
> > > content.
> > >
> > > This causes actual, tangible problems in production. For example, we
> > > have complaints from those working on remote caches that their
> > > application is reporting cache corruptions because it uses (device,
> > > inodenum) to establish the identity of a particular cache object, but
> >
> > ...but you cannot delete the (dev, inum) tuple from the cache index when
> > you remove a cache object??
> >
> > > because it's not unique any more, the application refuses to continue
> > > and reports cache corruption. Even worse, sometimes applications may not
> > > even detect the corruption but may continue anyway, causing phantom and
> > > hard to debug behaviour.
> > >
> > > In general, userspace applications expect that (device, inodenum) should
> > > be enough to be uniquely point to one inode, which seems fair enough.
> >
> > Except that it's not.  (dev, inum, generation) uniquely points to an
> > instance of an inode from creation to the last unlink.
> >
>
> Yes, but also:
> There should not exist two live inodes on the system with the same (dev, inum)
> The problem is that ino 1 may still be alive when wraparound happens
> and then two different inodes with ino 1 exist on same dev.

*OH* that's different then.  Most sane filesystems <cough>btrfs<cough>
should never have the same inode numbers for different files.  Sorry for
the noise, I misunderstood what the issue was. :)

> Take the 'diff' utility for example, it will report that those files
> are identical
> if they have the same dev,ino,size,mtime. I suspect that 'mv' will not
> let you move one over the other, assuming they are hardlinks.
> generation is not even exposed to legacy application using stat(2).

Yeah, I was surprised to see it's not even in statx. :/

--D

> Thanks,
> Amir.
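
P.S. For anyone skimming the thread later, here is a rough sketch of the kind of (dev, ino) identity check being discussed. It is not taken from any of the applications mentioned above; the helper names (file_identity, looks_like_same_file, demo) are made up for illustration, and Python is used only because it is short. It relies on nothing but what stat(2) reports, which is exactly why it has no generation number to fall back on.

```python
import os
import tempfile


def file_identity(path):
    """Return the (st_dev, st_ino) pair that stat(2) reports for path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def looks_like_same_file(path_a, path_b):
    """Identity test based only on (dev, ino), with no generation check.

    Hypothetical helper: on a filesystem whose inode numbers have wrapped
    while older inodes are still alive, this can return True for two
    unrelated live files.
    """
    return file_identity(path_a) == file_identity(path_b)


def demo():
    """Create two distinct files plus a hardlink and compare identities."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a")
        b = os.path.join(tmp, "b")
        link = os.path.join(tmp, "a_link")
        for path, data in ((a, b"one"), (b, b"two")):
            with open(path, "wb") as f:
                f.write(data)
        os.link(a, link)  # second name for the same inode
        return looks_like_same_file(a, link), looks_like_same_file(a, b)
```

On a filesystem that behaves, demo() should report the hardlink as the same file and the second file as a different one. The wraparound case is the one where the second comparison would also come back True even though the contents differ.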