On Tue, 2020-05-12 at 16:13 +0100, Luis Henriques wrote:
> Hi Jeff,
>
> I've been looking at xfstest generic/467 failure in cephfs, and I simply
> can not decide if it's a genuine bug on ceph kernel code.  Since you've
> recently been touching the ceph_unlink code maybe you could help me
> understanding what's going on.
>
> generic/467 runs a couple of tests using src/open_by_handle, but the one
> failing can be summarized with the following:
>
> - get a handle to /cephfs/myfile using name_to_handle_at(2)
> - open(2) file /cephfs/myfile
> - unlink(2) /cephfs/myfile
> - drop caches
> - open_by_handle_at(2) => returns -ESTALE
>
> This test succeeds opening the handle with other (local) filesystems
> (maybe I should run it with other networked filesystem such as NFS).
>
> The -ESTALE is easy to trace to __fh_to_dentry, where inode->i_nlink is
> checked against 0.  My question is: should we really be testing the
> i_nlink here?  We dropped the name, but the file may still be there (as in
> this case).
>
> I guess I'm missing something, but hopefully you'll be able to shed some
> light on this.  Thanks in advance for any help you may provide!

Yeah, I took a brief look at this a while back and never got back to
looking at it again.

I think cephfs's behavior is wrong here. We should be able to look up an
open-but-unlinked file by filehandle. That said, those checks went in via
commit 570df4e9c23f8, and it looks like they were deliberately added to
__fh_to_dentry. I'm unclear as to why.

It may be interesting to remove the i_nlink checks and see whether it
breaks anything?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
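
For reference, the failing sequence from the quoted message can be sketched
in userspace C roughly as below. This is not the xfstests src/open_by_handle
program, only a minimal approximation of the five steps. The function name
reopen_unlinked_by_handle() and its return convention are made up for
illustration. It assumes Linux with glibc; open_by_handle_at(2) normally
needs CAP_DAC_READ_SEARCH, and dropping caches needs root, so that step is
best effort here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if an open-but-unlinked file can be
 * reopened through its file handle, or a negative errno value from the
 * first step that failed (e.g. -ESTALE, or -EPERM without privileges).
 */
int reopen_unlinked_by_handle(const char *dir, const char *name)
{
    struct file_handle *fh = NULL;
    int mount_id, dirfd, fd = -1, hfd, cache_fd;
    int unlinked = 0, ret = 0;

    dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
        return -errno;

    /* Create the test file so there is something to take a handle of. */
    fd = openat(dirfd, name, O_CREAT | O_RDWR, 0600);
    if (fd < 0) { ret = -errno; goto out; }
    close(fd);
    fd = -1;

    /* Step 1: get a handle with name_to_handle_at(2). */
    fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
    if (!fh) { ret = -ENOMEM; goto out; }
    fh->handle_bytes = MAX_HANDLE_SZ;
    if (name_to_handle_at(dirfd, name, fh, &mount_id, 0) < 0) {
        ret = -errno;
        goto out;
    }

    /* Step 2: open(2) the file and keep it open across the unlink. */
    fd = openat(dirfd, name, O_RDONLY);
    if (fd < 0) { ret = -errno; goto out; }

    /* Step 3: unlink(2) the name; the inode stays alive through fd. */
    if (unlinkat(dirfd, name, 0) < 0) { ret = -errno; goto out; }
    unlinked = 1;

    /* Step 4: drop caches (best effort, silently skipped if not root). */
    cache_fd = open("/proc/sys/vm/drop_caches", O_WRONLY);
    if (cache_fd >= 0) {
        if (write(cache_fd, "3", 1) < 0) {
            /* ignored: the probe still runs without the cache drop */
        }
        close(cache_fd);
    }

    /* Step 5: open_by_handle_at(2), using dirfd to identify the mount. */
    hfd = open_by_handle_at(dirfd, fh, O_RDONLY);
    if (hfd < 0)
        ret = -errno;
    else
        close(hfd);

out:
    if (!unlinked)
        unlinkat(dirfd, name, 0);   /* clean up if we bailed out early */
    if (fd >= 0)
        close(fd);
    free(fh);
    close(dirfd);
    return ret;
}
```

Calling reopen_unlinked_by_handle("/cephfs", "myfile") as root would be
expected to return -ESTALE on a kernel with the i_nlink check described
above, and 0 on filesystems that allow the lookup.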