On Thu, Feb 3, 2011 at 9:52 AM, Tommi Virtanen <tommi.virtanen@xxxxxxxxxxxxx> wrote:
> On Thu, Feb 03, 2011 at 09:29:32AM -0800, Sage Weil wrote:
>> Which leaves us with a final problem: what if the fh is generated for
>> /foo/bar, but bar is renamed to /baz/bar, bar drops out of all caches, and
>> the client tries to use the fh. We're still stuck with ESTALE in that
>> case. The only real solution there is to include a backpointer on the
>> file's data object. This is doable, but comes at a cost. We could make
>> it optional, and/or mitigate it somewhat (backpointer is only created once
>> a file is renamed, or something like that).
>>
>> I'm not really sure to what lengths a server is supposed to go to avoid
>> ESTALE. I seem to remember that NFSv4 has a different class of fh's that
>> are allowed to expire. I'm not sure how that helps, though; it seems
>> like if a client has a file open that is renamed by another node and then
>> idle for long enough and then tries to read it'll still be screwed,
>> regardless of what the server does/does not promise the client.
>
> NFSv4 volatile filehandles move away from the whole "stale"
> terminology into "expiring" filehandles, which a client SHOULD recover
> from, and that's said with fairly strong language in RFC3530. The
> volatile filehandles may go away at any moment (for FH4_VOLATILE_ANY).
>
> The RFC suggests clients remember the full path of every volatile
> filehandle, and points out that doesn't let you recover if someone
> else renamed the file.. which means your "final problem" above is
> still a problem, and smells unavoidable. But at least shifting
> responsibility for remembering the path to the client makes recovery
> easy in the typical case.
>
> If the real-world support is there, I'd say NFSv4 is the way to go,
> for future Ceph re-exporting.

I was playing around with implementing this.
I was trying to get the ceph client's export functions to return
NFS4ERR_FHEXPIRED instead of ESTALE, hoping that my NFSv4 clients would
then attempt the full lookup again. I also noticed that the MDS itself
can return ESTALE to the ceph kernel client, and that error seems to be
propagated back to the NFS client. Where could I intercept that and send
back an expiry notice instead?

-Brian

>
> --
> :(){ :|:&};:
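
For illustration only, here is a small Python toy model of the behaviour discussed in this thread. It is not Ceph or kernel NFS code. It shows a server-side translation of ESTALE into NFS4ERR_FHEXPIRED and a client that recovers by redoing the lookup from a remembered path. The names `export_status`, `ToyServer`, and `read_with_recovery` are hypothetical; only the numeric status codes are taken from RFC 3530. It also reproduces the "final problem" above: if someone else renamed the file, the remembered path no longer resolves.

```python
import errno

# NFSv4 status codes, numbered as in RFC 3530.
NFS4_OK = 0
NFS4ERR_NOENT = 2
NFS4ERR_STALE = 70
NFS4ERR_FHEXPIRED = 10014


def export_status(err, volatile_fh):
    """Hypothetical server-side hook: map a local errno to an NFSv4 status."""
    if err == 0:
        return NFS4_OK
    if err == errno.ESTALE:
        # Assumption: "expired" is only reported when the server has
        # advertised volatile filehandles; otherwise stay with "stale".
        return NFS4ERR_FHEXPIRED if volatile_fh else NFS4ERR_STALE
    if err == errno.ENOENT:
        return NFS4ERR_NOENT
    raise ValueError("errno %d is not handled in this sketch" % err)


class ToyServer:
    """Toy export: a filehandle is just an inode number."""

    def __init__(self):
        self.paths = {}      # path -> inode number
        self.cached = set()  # inodes the server can still resolve by fh

    def create(self, path, ino):
        self.paths[path] = ino

    def lookup(self, path):
        ino = self.paths.get(path)
        if ino is None:
            return NFS4ERR_NOENT, None
        self.cached.add(ino)
        return NFS4_OK, ino

    def read(self, fh):
        # An inode that fell out of the cache cannot be found by fh alone.
        err = 0 if fh in self.cached else errno.ESTALE
        return export_status(err, volatile_fh=True)

    def rename(self, old, new):
        self.paths[new] = self.paths.pop(old)

    def drop_caches(self):
        self.cached.clear()


def read_with_recovery(server, path, fh):
    """Client side: on FHEXPIRED, redo the lookup from the remembered path."""
    status = server.read(fh)
    if status != NFS4ERR_FHEXPIRED:
        return status, fh
    status, new_fh = server.lookup(path)
    if status != NFS4_OK:
        # The remembered path is gone (e.g. renamed elsewhere): no recovery.
        return status, None
    return server.read(new_fh), new_fh
```

In the typical case (cache eviction, no rename) `read_with_recovery` succeeds after one extra lookup. After a rename by another node it returns NFS4ERR_NOENT, which matches the unavoidable case described in the quoted text.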