Today we were 'lucky' enough to see this situation during the daytime.
Here is what happens:

The client sends an OPEN and gets an open stateid. This is followed by
LAYOUTGET ... and a READ to the DS. At some point, the server returns
BAD_STATEID. This triggers the client to issue a new OPEN and use the
new open stateid in a READ request to the DS. As the new stateid is not
known to the DS, it keeps returning BAD_STATEID, and this becomes an
infinite loop.

Regards,
   Tigran.

----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
> To: linux-nfs@xxxxxxxxxxxxxxx
> Cc: "Andy Adamson" <william.adamson@xxxxxxxxxx>, "Steve Dickson" <steved@xxxxxxxxxx>
> Sent: Wednesday, October 9, 2013 10:48:32 PM
> Subject: DoS with NFSv4.1 client
>
> Hi,
>
> last night we got a DoS attack from one of the NFS clients.
> The farm node, which was accessing data with pNFS,
> went mad and tried to kill the dCache NFS server. As usual,
> this happened overnight and we were not able to
> capture the network traffic or bump the debug level.
>
> The symptoms are:
>
> The client starts to bombard the MDS with OPEN requests. As we see
> state created on the server side, the requests were processed by
> the server. Nevertheless, for some reason, the client did not like it. Here
> is the result of mountstats:
>
> OPEN:
>         17087065 ops (99%)      1 retrans (0%)  0 major timeouts
>         avg bytes sent per op: 356      avg bytes received per op: 455
>         backlog wait: 0.014707  RTT: 4.535704   total execute time: 4.574094 (milliseconds)
> CLOSE:
>         290 ops (0%)    0 retrans (0%)  0 major timeouts
>         avg bytes sent per op: 247      avg bytes received per op: 173
>         backlog wait: 308.827586        RTT: 1748.479310        total execute time: 2057.365517 (milliseconds)
>
> As you can see, there is quite a big difference between the number of OPEN and
> CLOSE requests.
> We see the same picture on the server side as well:
>
> NFSServerV41 Stats: average±stderr(ns)    min(ns)       max(ns)   Samples
> DESTROY_SESSION          26056±4511.89      13000         97000        17
> OPEN                   1197297±   0.00     816000   31924558000  54398533
> RESTOREFH                    0±   0.00          0   25018778000  54398533
> SEQUENCE                  1000±   0.00       1000   26066722000  55601046
> LOOKUP                 4607959±   0.00     375000   26977455000     32118
> GETDEVICEINFO            13158± 100.88       4000        655000     11378
> CLOSE                 16236211±   0.00       5000   21021819000     20420
> LAYOUTGET            271736361±   0.00   10003000   68414723000     21095
>
> The last column is the number of requests.
>
> This is with RHEL6.4 as the client. By looking at the code,
> I can see a loop at nfs4proc.c#nfs4_do_open() which could be
> the cause of the problem. Nevertheless, I can't
> find any reason why this loop turned into an 'infinite' one.
>
> In the end our server ran out of memory and we returned
> NFSERR_SERVERFAULT to the client. This triggered the client to
> reestablish the session, and all open stateids were
> invalidated and cleaned up.
>
> I am still trying to reproduce this behavior (on client
> and server) and any hint is welcome.
>
> Tigran.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
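
For illustration only, the OPEN / READ / BAD_STATEID cycle described at the top of this message can be modelled with a toy simulation. This is a sketch under assumptions, not the Linux client or dCache code: the MDS and DS classes, client_read() and the max_attempts cap are all hypothetical names, and the model assumes the DS never learns about stateids created by later OPENs.

```python
# Toy model of the stateid loop; not the Linux client or dCache code.
import itertools

OK = "NFS4_OK"
BAD_STATEID = "NFS4ERR_BAD_STATEID"


class MDS:
    """Hypothetical metadata server: every OPEN yields a fresh stateid."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.opens = 0  # number of OPENs processed, like the server stats

    def open(self):
        self.opens += 1
        return next(self._ids)


class DS:
    """Hypothetical data server: accepts only stateids it knows about."""

    def __init__(self, known):
        self.known = set(known)

    def read(self, stateid):
        return OK if stateid in self.known else BAD_STATEID


def client_read(mds, ds, max_attempts):
    """READ with re-OPEN on BAD_STATEID.

    Returns the attempt number that succeeded, or None if the
    (assumed, illustrative) max_attempts cap was reached.
    """
    stateid = mds.open()
    for attempt in range(1, max_attempts + 1):
        if ds.read(stateid) == OK:
            return attempt
        # Recovery as described: new OPEN, then retry with the new stateid.
        stateid = mds.open()
    return None  # without the cap, this loop would never terminate
```

With a DS that knows none of the stateids, client_read(MDS(), DS(known=[]), 1000) gives up after 1000 READs and leaves 1001 OPENs on the MDS. That is the same shape as the OPEN versus CLOSE imbalance in the statistics above.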