On Wed, Jul 20, 2011 at 12:23:42PM -0700, Ray Van Dolson wrote:
> We have a couple legacy CentOS (RHEL)-based appliances with slightly
> dated NFS implementations.
>
> Server (CentOS 4 based):
>
>   nfs-utils-1.0.6-70.EL4
>   Kernel 2.6.9-42.0.10.plus.c4smp
>
> Client (CentOS 5 based):
>
>   nfs-utils-1.0.9-42.el5.x86_64
>   Kernel 2.6.18-164.15.1.el5
>
> The client has a long-lived NFSv3 mount to the server that sometimes
> stops responding (blocks). We can lazy unmount it, but subsequent
> mount requests hang and the following is observed via tcpdump:
>
>   1. Client GETPORT for NFS service succeeds.
>   2. Client GETPORT for MOUNT succeeds
>   3. Client MNT call succeeds (server gives valid response including
>      file handle)
>   4. Client sends a SYN packet to NFS port on server
>   5. Server responds with ACK *only*
>
> When we bounce the NFS daemon on the server, everything starts working
> and in step 5 above, we get a SYN,ACK as expected in response to #4,
> and everything proceeds along nicely.
>
> Does this jog anybody on a long-ago fixed bug? I'm thinking updating
> the kernel and nfs-utils on the server will likely help, but would love
> to find where behavior like the above is referenced as a "bug".
>
> Thanks,
> Ray

After thinking on this a bit more, I'm wondering if perhaps the server
side had a connection still "open" (didn't check with netstat) and thus
sent back only the ACK. Maybe in this case the client should respond
with a RST or something else to indicate we need to start from scratch?

Is there a way, on the server side to kill an ESTABLISHED TCP
connection (specifically an NFS connection?)? Probably setting a
connection timeout value via /proc ...

I'm thinking on the client side I could inject a RST packet to the
server to clean things up?

Ray
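
For checking the "connection still open" theory on the server without
netstat, something like the following might do. It is only a rough
sketch: it assumes IPv4, a little-endian host (x86), the standard NFS
port 2049, and the usual /proc/net/tcp column layout where state 01
means ESTABLISHED. The function names are made up for illustration.

```python
import socket
import struct

NFS_PORT = 2049         # assumed: standard NFS port
TCP_ESTABLISHED = 0x01  # state code for ESTABLISHED in /proc/net/tcp


def parse_addr(hex_addr):
    """Turn a /proc/net/tcp address such as '0100007F:0801' into (ip, port)."""
    ip_hex, port_hex = hex_addr.split(":")
    # Assumption: little-endian host, so the address bytes appear reversed.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def established_nfs(proc_text, port=NFS_PORT):
    """Return (local, remote) pairs for ESTABLISHED sockets on the local port."""
    conns = []
    for line in proc_text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        local = parse_addr(fields[1])
        remote = parse_addr(fields[2])
        state = int(fields[3], 16)
        if state == TCP_ESTABLISHED and local[1] == port:
            conns.append((local, remote))
    return conns
```

Passing the contents of /proc/net/tcp on the server to
established_nfs() should list any client address and port that the
server still considers connected. If the client's old source port shows
up there while the new mount hangs, that would be consistent with the
ACK-only reply in step 5.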