On Tue, Mar 11, 2025 at 03:39:32PM -0400, Chuck Lever wrote:
> It's not possible to guarantee that the next entry will have a higher
> offset value.
>
> Suppose the "New offset" value wraps. So the current directory entry
> will have an offset that is close to U32_MAX, but the next created
> directory entry will have an offset close to zero. In fact, new entries
> will have a smaller offset value than "current" for quite some time.

In fact, even for on-disk file systems (including XFS) the next entry
often has a lower value: most file systems try to fill holes in the
d_off space created by previously deleted entries. The big exception is
btrfs, which just uses a monotonically increasing 64-bit counter (which
can create problems fairly quickly on 32-bit systems, as the
seekdir/telldir cookie is a long and not an off_t, and thus 32-bit on
all 32-bit systems).

> The offset is a cookie, not a numeric value. It is simply something that
> says "please start here when iteration continues".

Yes. This then leads into the next minefield: reporting entries added
between getdents iterations, which can cause all kinds of issues when
done wrong, especially for rename()d entries.

> Think of it as a
> hash -- it looks like a hexadecimal number, but has no other intrinsic
> meaning. (In fact, I think some Linux file systems do use a hash here
> rather than a scalar integer).

A hash is actually kinda dangerous because hash collisions can trivially
place multiple entries at the same offset. And given the hashes used
for directories, it isn't that hard to introduce collisions
intentionally for many file systems.
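
To make the wrap and the 32-bit cookie problem concrete, here is a
minimal sketch. All three helper names are hypothetical and are not
taken from any kernel code; the allocator is just "add one and wrap",
which is an assumption for illustration, not how any particular file
system assigns offsets.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical allocator: hand out the next 32-bit directory offset.
 * Unsigned arithmetic wraps modulo 2^32, so the entry created after
 * U32_MAX gets an offset of zero.
 */
uint32_t next_dir_offset(uint32_t cur)
{
	return cur + 1;
}

/*
 * The broken assumption: treating the cookie as an ordered number and
 * concluding that a larger offset means a more recently created entry.
 */
bool is_newer_numeric(uint32_t a, uint32_t b)
{
	return a > b;
}

/*
 * Would a 64-bit d_off survive being stored in a 32-bit signed long,
 * as a telldir() cookie is on 32-bit systems?
 */
bool fits_32bit_long(uint64_t d_off)
{
	return d_off <= (uint64_t)INT32_MAX;
}
```

With these, the entry created right after offset U32_MAX compares as
"older" under is_newer_numeric(), and any d_off above 2^31 - 1 no
longer fits the 32-bit cookie.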