On Thu, 18 Dec 2014 19:24:21 -0800
Brian De Wolf <bldewolf@xxxxxxx> wrote:

> After updating our kernel from 3.4.x to 3.14.27 (along with nfs-utils
> 1.2.9), we've had a strange issue with our sec=krb5p NFSv4 mounts.
> My initial light testing went fine, but sometimes, on any given host,
> a specific file will no longer be accessible.  Any attempt to access
> it causes the process to go into uninterruptible sleep.

In case anyone else sees a similar issue, this is what I found.

3.4, 3.12 and 3.14 see stalls when accessing a Solaris 10 server.  3.4
and 3.12 recover by reconnecting after a 60 second timeout, but 3.14
hangs forever.  It seems like the 3.14 timeout is broken, which is
pretty painful.  It can be recovered by resetting the TCP connection
(yay iptables), but will eventually stall again.

3.16 doesn't stall, so I assume something was fixed between 3.14 and
3.16 to handle whatever problems occur with the Solaris 10 server.

3.12 and 3.14 also don't see stalls when accessing an OmniOS server,
which makes me think the bug is on the Solaris side but triggers poor
error handling on the Linux side (that was fixed in 3.16).

So the end result is pretty obvious: it's time to upgrade.
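
For anyone trying to confirm they are hitting the same symptom, a rough
sketch of how to list processes stuck in uninterruptible sleep (state
"D") by reading /proc/<pid>/stat.  This is not from the original report;
the function names are made up for illustration, and a process in state
"D" is only a hint, since short-lived D states are normal during
ordinary I/O.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) parsed from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen].strip())
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """Return a list of (pid, comm) for processes currently in state 'D'."""
    stuck = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return stuck  # no procfs at this path
    for entry in entries:
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unparsable
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

Running it a few times some seconds apart and looking for the same PID
each time is a reasonable way to tell a real hang from ordinary I/O wait.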