Re: 3.14.27 client hang on specific file

On Thu, 18 Dec 2014 19:24:21 -0800
Brian De Wolf <bldewolf@xxxxxxx> wrote:

> After updating our kernel from 3.4.x to 3.14.27 (along with nfs-utils
> 1.2.9), we've had a strange issue with our sec=krb5p NFSv4 mounts.
> My initial light testing went fine, but sometimes, on any given host,
> a specific file will no longer be accessible.  Any attempt to access
> it causes the process to go into uninterruptible sleep.

In case anyone else sees a similar issue, this is what I found.

Kernels 3.4, 3.12 and 3.14 all stall when accessing a Solaris 10
server.  3.4 and 3.12 recover by reconnecting after a 60 second
timeout, but 3.14 hangs forever.  It seems like the timeout handling
in 3.14 is broken, which is pretty painful.  A hung mount can be
recovered by resetting the TCP connection (yay iptables), but it will
eventually stall again.
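
For anyone wanting to try the same workaround, here is a rough sketch of
one way the reset could be done.  I'm not stating the exact rule used
above; this assumes a REJECT rule with --reject-with tcp-reset on the
OUTPUT chain, the standard NFS port 2049, and a made-up server address
(192.0.2.10).  The helper name reset_rule is hypothetical.  The script
only prints the commands, it does not run them.

```python
NFS_PORT = 2049  # standard NFS port; assumed to be the one the mount uses


def reset_rule(server_ip, op="-I"):
    """Build an iptables argv that rejects outgoing NFS segments with a TCP RST.

    op is "-I" to insert the rule or "-D" to delete it again.
    """
    return [
        "iptables", op, "OUTPUT",
        "-p", "tcp",
        "-d", server_ip,
        "--dport", str(NFS_PORT),
        "-j", "REJECT", "--reject-with", "tcp-reset",
    ]


if __name__ == "__main__":
    server = "192.0.2.10"  # placeholder address, replace with the NFS server
    # Insert the rule, wait for the client to drop the connection,
    # then delete the rule so the client can reconnect.
    print(" ".join(reset_rule(server, "-I")))
    print(" ".join(reset_rule(server, "-D")))
```

The rule has to be removed again afterwards, otherwise the client can
never reconnect.  It also only acts on packets that actually match, so
it may take a retransmit before anything happens.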

3.16 doesn't stall, so I assume something was fixed between 3.14 and
3.16 to handle whatever problems occur with the Solaris 10 server.
3.12 and 3.14 also don't see stalls when accessing an OmniOS server,
which makes me think the bug is on the Solaris side but triggers poor
error handling on the Linux side (which appears to be fixed in 3.16).
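
To check whether a client is currently hitting the hang, one option is
to look for processes stuck in uninterruptible sleep (state D).  The
sketch below is only an illustration, not something from the original
debugging: it reads /proc/<pid>/stat on a Linux host, and the function
names are made up.  A process in state D is not necessarily blocked on
NFS, so treat the output as a hint.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]


def d_state_processes(proc="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for name in os.listdir(proc):
        if not name.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc, name, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```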

So the end result is pretty obvious:  it's time to upgrade.