----- Original Message -----
> From: "Weston Andros Adamson" <dros@xxxxxxxxxx>
> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
> Cc: "<linux-nfs@xxxxxxxxxxxxxxx>" <linux-nfs@xxxxxxxxxxxxxxx>, "Andy Adamson" <William.Adamson@xxxxxxxxxx>,
> "Steve Dickson" <steved@xxxxxxxxxx>
> Sent: Thursday, October 10, 2013 4:35:25 PM
> Subject: Re: DoS with NFSv4.1 client
>
> Well, it'd be nice not to loop forever, but my question remains, is this due
> to a server bug (the DS not knowing about new stateid from MDS)?
>

Up to now, we have pushed the open stateid to the DS only on LAYOUTGET.
This has to be changed, as the behaviour is not spec compliant.

Tigran.

> -dros
>
> On Oct 10, 2013, at 10:14 AM, Weston Andros Adamson <dros@xxxxxxxxxx> wrote:
>
> > So is this a server bug? It seems like the client is behaving correctly...
> >
> > -dros
> >
> > On Oct 10, 2013, at 5:56 AM, "Mkrtchyan, Tigran" <tigran.mkrtchyan@xxxxxxx>
> > wrote:
> >
> >>
> >>
> >> Today we were 'lucky' to have this situation occur during the daytime.
> >> Here is what happens:
> >>
> >> The client sends an OPEN and gets an open stateid.
> >> This is followed by LAYOUTGET ... and READ to the DS.
> >> At some point, the server returns BAD_STATEID.
> >> This triggers the client to issue a new OPEN and use
> >> the new open stateid with the READ request to the DS. As the new
> >> stateid is not known to the DS, it keeps returning
> >> BAD_STATEID, and this becomes an infinite loop.
> >>
> >> Regards,
> >> Tigran.
> >>
> >>
> >>
> >> ----- Original Message -----
> >>> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
> >>> To: linux-nfs@xxxxxxxxxxxxxxx
> >>> Cc: "Andy Adamson" <william.adamson@xxxxxxxxxx>, "Steve Dickson"
> >>> <steved@xxxxxxxxxx>
> >>> Sent: Wednesday, October 9, 2013 10:48:32 PM
> >>> Subject: DoS with NFSv4.1 client
> >>>
> >>>
> >>> Hi,
> >>>
> >>> last night we got a DoS attack from one of the NFS clients.
> >>> The farm node, which was accessing data with pNFS,
> >>> went mad and tried to kill the dCache NFS server.
> >>> As usual, this happened overnight, and we were not able to
> >>> capture the network traffic or bump the debug level.
> >>>
> >>> The symptoms are:
> >>>
> >>> The client starts to bombard the MDS with OPEN requests. As we see
> >>> state created on the server side, the requests were processed by
> >>> the server. Nevertheless, for some reason, the client did not like it. Here
> >>> is the result of mountstats:
> >>>
> >>> OPEN:
> >>>     17087065 ops (99%)    1 retrans (0%)    0 major timeouts
> >>>     avg bytes sent per op: 356    avg bytes received per op: 455
> >>>     backlog wait: 0.014707    RTT: 4.535704    total execute time: 4.574094 (milliseconds)
> >>> CLOSE:
> >>>     290 ops (0%)    0 retrans (0%)    0 major timeouts
> >>>     avg bytes sent per op: 247    avg bytes received per op: 173
> >>>     backlog wait: 308.827586    RTT: 1748.479310    total execute time: 2057.365517 (milliseconds)
> >>>
> >>>
> >>> As you can see, there is quite a big difference between the number of
> >>> OPEN and CLOSE requests.
> >>> We can see the same picture on the server side as well:
> >>>
> >>> NFSServerV41 Stats:   average±stderr(ns)      min(ns)        max(ns)    Samples
> >>> DESTROY_SESSION           26056±4511.89         13000          97000         17
> >>> OPEN                    1197297±   0.00        816000    31924558000   54398533
> >>> RESTOREFH                     0±   0.00             0    25018778000   54398533
> >>> SEQUENCE                   1000±   0.00          1000    26066722000   55601046
> >>> LOOKUP                  4607959±   0.00        375000    26977455000      32118
> >>> GETDEVICEINFO             13158± 100.88          4000         655000      11378
> >>> CLOSE                  16236211±   0.00          5000    21021819000      20420
> >>> LAYOUTGET             271736361±   0.00      10003000    68414723000      21095
> >>>
> >>> The last column is the number of requests.
> >>>
> >>> This is with RHEL6.4 as the client. By looking at the code,
> >>> I can see a loop at nfs4proc.c#nfs4_do_open() which can be
> >>> the cause of the problem. Nevertheless, I can't
> >>> find any reason why this loop turned into an 'infinite' one.
> >>>
> >>> At the end, our server ran out of memory and we returned
> >>> NFSERR_SERVERFAULT to the client. This triggered the client to
> >>> reestablish the session, and all open stateids were
> >>> invalidated and cleaned up.
> >>>
> >>> I am still trying to reproduce this behavior (on client
> >>> and server) and any hint is welcome.
> >>>
> >>> Tigran.
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> >>> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
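
Editorial sketch: the OPEN / READ / BAD_STATEID loop described in this thread can be modelled with a small simulation. This is only a toy model under stated assumptions, not the dCache or Linux client code. The class and function names (DataServer, MetadataServer, simulate), the max_opens cap, and the way the first stateid becomes invalid on the DS are all hypothetical and chosen for illustration.

```python
import itertools

OK = "NFS4_OK"
BAD_STATEID = "NFS4ERR_BAD_STATEID"


class DataServer:
    """Toy DS: accepts a READ only for stateids it has been told about."""

    def __init__(self):
        self.known = set()

    def read(self, stateid):
        return OK if stateid in self.known else BAD_STATEID


class MetadataServer:
    """Toy MDS. push_on_open=False mimics pushing stateids only on LAYOUTGET."""

    def __init__(self, ds, push_on_open):
        self.ds = ds
        self.push_on_open = push_on_open
        self._ids = itertools.count(1)
        self.opens = 0  # number of OPEN requests processed

    def open(self):
        self.opens += 1
        stateid = next(self._ids)
        if self.push_on_open:
            self.ds.known.add(stateid)
        return stateid

    def layoutget(self, stateid):
        # In both modes, LAYOUTGET makes the stateid known to the DS.
        self.ds.known.add(stateid)


def simulate(push_on_open, max_opens=1000):
    """Return (number of OPENs, result of the last READ).

    max_opens is a safety cap for the simulation only; the loop
    reported in the thread had no such limit.
    """
    ds = DataServer()
    mds = MetadataServer(ds, push_on_open)

    stateid = mds.open()
    mds.layoutget(stateid)
    assert ds.read(stateid) == OK  # the initial READ succeeds

    # Assumption: "at some point" the DS stops accepting the stateid.
    ds.known.discard(stateid)

    # The client re-OPENs on BAD_STATEID and retries the READ. The layout
    # is still held, so no new LAYOUTGET is sent.
    while ds.read(stateid) == BAD_STATEID and mds.opens < max_opens:
        stateid = mds.open()
    return mds.opens, ds.read(stateid)


if __name__ == "__main__":
    print("push only on LAYOUTGET:", simulate(push_on_open=False))
    print("push also on OPEN:     ", simulate(push_on_open=True))
```

With push_on_open=False the model keeps issuing OPENs until it hits the cap, which matches the large OPEN counts reported above. With push_on_open=True a single extra OPEN is enough for the READ to succeed.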