On Mon, Apr 13, 2009 at 9:47 AM, Daniel Stickney <dstickney@xxxxxxxxxx> wrote:
> To add a little more info, in a post on April 10th titled "NFSv3 Client Timeout on 2.6.27" Bryan mentioned that his client socket was in state FIN_WAIT2, and server in CLOSE_WAIT, which is exactly what I am seeing here.

Since my problems originated after upgrading to Ubuntu intrepid in an 'etch -> hardy -> intrepid' cycle, and hardy contained 2.6.24, I wonder if the regression was in:

commit e06799f958bf7f9f8fae15f0c6f519953fb0257c
Author: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
Date:   Mon Nov 5 15:44:12 2007 -0500

    SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket

    By using shutdown() rather than close() we allow the RPC client to wait
    for the TCP close handshake to complete before we start trying to reconnect
    using the same port.
    We use shutdown(SHUT_WR) only instead of shutting down both directions,
    however we wait until the server has closed the connection on its side.

    Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>

$ git describe e06799f958bf7f9f8fae15f0c6f519953fb0257c --contains
v2.6.25-rc1~1146^2~105

I came in today to find that the one hung machine outside of production that I could toy with had eventually fixed itself, albeit five days later.

Apr  8 12:42:34 bvt-was02 kernel: [3706362.490101] nfs: server file01.prod.example.com not responding, still trying
Apr 13 12:09:59 bvt-was02 kernel: [4136407.174292] nfs: server file01.prod.example.com OK

It looks like a lot of additional timeouts were added in 2.6.30-rc1, so perhaps I'll compile from source and wait to see if this happens again on the test machines.

Bryan
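
For anyone trying to picture the FIN_WAIT2 / CLOSE_WAIT pairing, here is a minimal userspace sketch of the half-close described in the commit message above. It is only an illustration over loopback TCP, not the kernel SUNRPC code: the function name half_close_demo, the server_closes flag, and the one-second timeout are all made up for this example. The client calls shutdown(SHUT_WR) and then waits for the peer's close. If the server never closes its side, the client just keeps waiting, which is roughly the situation in the reports.

```python
import socket
import threading


def half_close_demo(server_closes=True):
    """Return (server_saw_eof, client_saw_eof) for one loopback TCP connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    result = {}
    release = threading.Event()

    def server():
        conn, _ = listener.accept()
        # recv() returning b"" means the client's FIN arrived;
        # the server side of the connection is now in CLOSE_WAIT.
        result["server_saw_eof"] = conn.recv(1024) == b""
        if not server_closes:
            # Hypothetical misbehaving server: hold the socket open
            # until the client has given up waiting.
            release.wait()
        conn.close()  # sends the server's FIN

    worker = threading.Thread(target=server)
    worker.start()

    client = socket.create_connection(listener.getsockname())
    client.shutdown(socket.SHUT_WR)  # send FIN, keep the read side open
    client.settimeout(1.0)           # illustrative timeout, not a kernel value
    try:
        # b"" here means the server closed too: the handshake completed.
        client_saw_eof = client.recv(1024) == b""
    except socket.timeout:
        # No FIN from the server: the client is parked in FIN_WAIT2.
        client_saw_eof = False

    release.set()
    worker.join()
    client.close()
    listener.close()
    return result["server_saw_eof"], client_saw_eof
```

Calling half_close_demo(True) models the well-behaved case where both sides see end-of-file. Calling half_close_demo(False) models a server that sits in CLOSE_WAIT, so the client's wait ends only because of the timeout added for this sketch.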