Re: nfs client: Now you see it, now you don't (aka spurious ESTALE errors)

On Thu, 25 Jul 2013 13:45:15 +0000
Larry Keegan <lk@xxxxxxxxxxxxxxx> wrote:

> Dear Chaps,
> 
> I am experiencing some inexplicable NFS behaviour which I would like to
> run past you.
> 
> I have a Linux NFS server running kernel 3.10.2 and some clients
> running the same. The server is actually a pair of identical
> machines serving up a small number of ext4 filesystems atop drbd. They
> don't do much apart from serve home directories and deliver mail
> into them. These have worked just fine for aeons.
> 
> The problem I am seeing is that for the past month or so, on and off,
> one NFS client starts reporting stale NFS file handles on some part of
> the directory tree exported by the NFS server. During the outage the
> other parts of the same export remain unaffected. Then, some ten
> minutes to an hour later they're back to normal. Access to the affected
> sub-directories remains possible from the server (both directly and via
> NFS) and from other clients. There do not appear to be any errors on
> the underlying ext4 filesystems.
> 
> Each NFS client seems to get the heebie-jeebies over some directory or
> other pretty much independently. The problem affects all of the
> filesystems exported by the NFS server, but clearly I notice it first
> in home directories, and in particular in my dot subdirectories for
> things like my mail client and browser. I'd say something's up the
> spout about 20% of the time.
> 
> The server and clients are using NFSv4, although for a while I tried
> NFSv3 without any appreciable difference. I do not have CONFIG_FSCACHE
> set.
> 
> I wonder if anyone could tell me if they have ever come across this
> before, or what debugging settings might help me diagnose the problem?
> 
> Yours,
> 
> Larry
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Were these machines running older kernels before this started
happening? If so, which kernel did you upgrade from?

It would help to take some network captures while the problem is
occurring. What we want to know is whether the ESTALE errors are coming
back from the server in its replies, or whether the client is generating
them itself. That will narrow down where we need to look for problems.
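
To line the capture up with the failures, it may help to log exactly
when the client starts returning ESTALE for a given directory. Below is
a minimal sketch in Python, not a tested diagnostic tool: the function
names probe() and watch() and the example path are made up for
illustration, and it only assumes that stat() on an affected path fails
with ESTALE during an outage.

```python
import errno
import os
import time


def probe(path):
    """Return None if stat() succeeds, else the symbolic errno name."""
    try:
        os.stat(path)
    except OSError as exc:
        # Map the numeric errno to its name, e.g. 'ESTALE' or 'ENOENT'.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


def watch(paths, interval=5.0, rounds=None):
    """Probe each path every `interval` seconds; print timestamped failures.

    With rounds=None the loop runs until interrupted.
    """
    done = 0
    while rounds is None or done < rounds:
        for path in paths:
            result = probe(path)
            if result is not None:
                stamp = time.strftime("%Y-%m-%d %H:%M:%S")
                print(f"{stamp} {path}: {result}")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical path; substitute a directory that has gone stale before.
    watch(["/home/larry/.mozilla"], interval=5.0)
```

The timestamps it prints can then be matched against the packet times in
the capture. Bear in mind that a successful stat() may be answered from
the client's attribute cache, so only the failures are informative here.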

-- 
Jeff Layton <jlayton@xxxxxxxxxx>