On Thu, 2013-07-25 at 10:33 -0400, Jeff Layton wrote:
> On Thu, 25 Jul 2013 14:24:30 +0000
> "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote:
>
> > On Thu, 2013-07-25 at 10:11 -0400, Jeff Layton wrote:
> >
> > > What might be helpful is to do some network captures when the problem
> > > occurs. What we want to know is whether the ESTALE errors are coming
> > > from the server, or if the client is generating them. That'll narrow
> > > down where we need to look for problems.
> >
> > Hmm... Shouldn't ESTALE always be repackaged as ENOENT by the VFS, now
> > that your patchset has gone upstream, Jeff?
> >
>
> I don't think so...
>
> On something path-based then that might make sense (or maybe we should
> declare a new ERACE error like Al once suggested and return that). If
> you're doing a write() on a fd that you previously opened but the inode
> has disappeared on the server, then -ESTALE clearly seems valid.

EBADF would be a valid POSIX alternative. Your file descriptor is
clearly invalid if there is no file..

> There are other problematic cases too...
>
> Suppose I do stat(".", ...); ? Does an -ENOENT error make sense at that point?
>

On an XFS partition:

[trondmy@leira tmp]$ mkdir gnurr
[trondmy@leira tmp]$ cd gnurr
[trondmy@leira gnurr]$ rmdir ../gnurr
[trondmy@leira gnurr]$ pwd -P
pwd: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory

So yes, it's actually the preferred error for most filesystems.

> Also, since we only retry once on an ESTALE error, returning that is a
> pretty clear indicator that you raced with some other metadata
> operations. ENOENT is not as informative...
>

Agreed, but ESTALE is not a valid POSIX error, so it is theoretically
non-portable.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
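
For anyone who wants to check the errno from a program rather than from the shell, the mkdir/cd/rmdir/pwd sequence shown for XFS can be sketched roughly as below. This is only an illustration under assumptions: the function name getcwd_errno_after_rmdir is made up for this sketch, it runs against whatever local filesystem backs the temporary directory (not NFS), and on Linux the expectation is that getcwd() fails with ENOENT once the working directory has been unlinked.

```python
import errno
import os
import tempfile


def getcwd_errno_after_rmdir():
    """Remove the current working directory, then return the errno that
    getcwd() fails with (or None if getcwd() unexpectedly succeeds).

    Hypothetical helper for illustration; not part of any library.
    """
    saved = os.getcwd()
    parent = tempfile.mkdtemp()            # scratch parent directory
    victim = os.path.join(parent, "gnurr")
    os.mkdir(victim)
    try:
        os.chdir(victim)                   # cd gnurr
        os.rmdir(victim)                   # same effect as: rmdir ../gnurr
        try:
            os.getcwd()                    # what `pwd -P` relies on
        except OSError as exc:
            return exc.errno
        return None
    finally:
        os.chdir(saved)                    # restore the original cwd
        os.rmdir(parent)                   # clean up the scratch parent


if __name__ == "__main__":
    err = getcwd_errno_after_rmdir()
    # Print the symbolic name (e.g. ENOENT) if the errno module knows it.
    print(errno.errorcode.get(err, err))
```

Note that this only exercises getcwd(); it says nothing about what stat(".") or a write() on an already-open fd would return on an NFS mount, which is the case under discussion.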