On Fri, Dec 29, 2023 at 07:44:20PM +0200, Amir Goldstein wrote: > On Fri, Dec 29, 2023 at 4:35 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > > > > On Fri, Dec 29, 2023 at 07:46:54AM +0200, Amir Goldstein wrote: > > > [CC: fsdevel, viro] > > > > Thanks for picking this up, Amir, and for copying viro/fsdevel. I > > was planning to repost this next week when more folks are back, but > > this works too. > > > > Trond, if you'd like, I can handle review changes if you don't have > > time to follow up. > > > > > > > On Thu, Dec 28, 2023 at 10:22 PM <trondmy@xxxxxxxxxx> wrote: > > > > > > > > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > > > > > > > The fallback implementation for the get_name export operation uses > > > > readdir() to try to match the inode number to a filename. That filename > > > > is then used together with lookup_one() to produce a dentry. > > > > A problem arises when we match the '.' or '..' entries, since that > > > > causes lookup_one() to fail. This has sometimes been seen to occur for > > > > filesystems that violate POSIX requirements around uniqueness of inode > > > > numbers, something that is common for snapshot directories. > > > > > > Ouch. Nasty. > > > > > > Looks to me like the root cause is "filesystems that violate POSIX > > > requirements around uniqueness of inode numbers". > > > This violation can cause any of the parent's children to wrongly match > > > get_name() not only '.' and '..' and fail the d_inode sanity check after > > > lookup_one(). > > > > > > I understand why this would be common with parent of snapshot dir, > > > but the only fs that support snapshots that I know of (btrfs, bcachefs) > > > do implement ->get_name(), so which filesystem did you encounter > > > this behavior with? can it be fixed by implementing a snapshot > > > aware ->get_name()? > > > > > > > This patch just ensures that we skip '.' and '..' rather than allowing a > > > > match. > > > > > > I agree that skipping '.' and '..' makes sense, but... > > > > Does skipping '.' and '..' make sense for file systems that do > > It makes sense because if the child's name in its parent would > have been "." or ".." it would have been its own parent or its own > grandparent (ELOOP situation). > IOW, we can safely skip "." and "..", regardless of anything else. This new comment: + /* Ignore the '.' and '..' entries */ then seems inadequate to explain why dot and dot-dot are now never matched. Perhaps the function's documenting comment could expand on this a little. I'll give it some thought. > > indeed guarantee inode number uniqueness? Given your explanation > > here, I'm wondering whether the generic get_name() function is the > > right place to address the issue. -- Chuck Lever