RE: NFSv3 TCP socket stuck when all slots used and server goes away

"Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> · Wed, 6 Mar 2013 14:06:01 +0000

> -----Original Message-----
> From: linux-nfs-owner@xxxxxxxxxxxxxxx [mailto:linux-nfs-owner@xxxxxxxxxxxxxxx]
> On Behalf Of Simon Kirby
> Sent: Wednesday, March 06, 2013 4:52 AM
> To: linux-nfs@xxxxxxxxxxxxxxx
> Subject: NFSv3 TCP socket stuck when all slots used and server goes away
> 
> We had an issue with a Pacemaker/CRM HA-NFSv3 setup where one
> particular export hit an XFS locking issue on one node and got completely
> stuck. Upon failing over, service recovered for all clients that hadn't
> hit the mount since the issue occurred, but almost all of the usual
> clients (which also commonly statfs it as a monitoring check) sat forever
> (>20 minutes) without reconnecting.
> 
> It seems that the clients filled the RPC slots with requests over the TCP
> socket to the NFS VIP and the server ack'd everything at the TCP layer, but
> was not able to reply to anything due to the FS locking issue. When we failed
> over the VIP to the other node, service was restored, but the clients stuck
> this way continued to sit with nothing to tickle the TCP layer. netstat shows a
> socket with no send-queue, in ESTABLISHED state, and with no timer
> enabled:
> 
> tcp        0      0 c:724         s:2049       ESTABLISHED -                off (0.00/0/0)
> 
> The mountpoint options used are: rw,hard,intr,tcp,vers=3
> 
> The export options are:
> rw,async,hide,no_root_squash,no_subtree_check,mp
> 
> Is this expected behaviour? I suspect if TCP keepalives were enabled, the
> socket would eventually get torn down as soon as the client tries to send
> something to the (effectively rebooted / swapped) NFS server and gets an
> RST. However, as-is, there seems to be nothing here that would eventually
> cause anything to happen. Am I missing something?
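
For readers unfamiliar with the keepalive mechanism mentioned above, here is a minimal userspace sketch of what "enabling TCP keepalives" means on a socket. It is an illustration only: the kernel NFS client manages its own socket inside sunrpc and is not configured this way, and the helper name enable_keepalive and the timing values are assumptions made up for this example. With keepalive on, an idle ESTABLISHED connection gets probed periodically, so a peer that no longer knows the connection can answer with an RST.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive probes for an otherwise idle connection.

    idle, interval and probes are illustrative values, not recommendations.
    """
    # Generic switch: ask the TCP stack to probe the peer when idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket tuning knobs are platform-specific (present on Linux);
    # apply each one only if this platform exposes it.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before the connection is dropped
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    # A non-zero value means keepalive is enabled on this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```

Without this option, a connection with an empty send queue and no pending retransmit has no timer running, which matches the "off" timer column in the netstat line quoted above.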

Which client? Did the server close the connection?

Cheers
  Trond