The _only_ reason why a pNFS files client would ever want to send a
LAYOUTRETURN is in order to have the MDS take action to fence off any
outstanding writes to the DS.
The _only_ case where that is actually an important issue is when
something happens to the DS which forces the client to fall back to
writing through the MDS.

_ALL_ other cases are trivially covered by the existing NFSv4 state
model in that when the client unlocks and/or closes the file, then the
lock/open stateids that are used in the READ and WRITE operations will
be updated, and will cause those operations to be rejected with a
BAD_STATEID error. This fencing model is irrespective of whether or not
a layout is held, and is irrespective of whether the READ/WRITE was sent
to the MDS or the DS.

IOW: if pNFS files servers don't want to do this kind of fencing, then I
suggest we file an errata that labels the LAYOUTRETURN operation as
mandatory to not implement for those servers.

On Mon, 2012-06-11 at 15:02 -0400, david.noveck@xxxxxxx wrote:
> > And again, please explain why do you want it. What is wrong with the
> > case we all agree with? ie: "Client can not call LAYOUTRETURN until
> > all in-flight RPCs return, with or without an error"
>
> It's a recipe for data corruption. If, as Andy explained, he starts doing
> IO's (let's suppose WRITEs) to the MDS any lingering WRITEs to the DS
> since they reflect an earlier state of affairs can cause data corruption.
>
> There are three ways to prevent those lingering DS writes from corrupting
> data:
>
> 1) Doing a LAYOUTRETURN
> 2) waiting until the IO's return.
> 3) "magically plugging the network interface".
>
>
> Since there is no way to do 3), saying that you only can do 1) until after
> 2) is done is essentially going to mean:
>
> a) that it may take a very long time:
> b) that you will only do it, when it is no longer useful.
>
> If you do 1) asap, then the lingering DS write problem is gone sooner,
> and that's a good thing.
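To make the lingering-write argument above concrete, here is a minimal toy
model in Python. It is only a sketch under stated assumptions: the class
names, the "REJECTED" status string and the replay() helper are all
hypothetical, and it does not reflect the Linux client or any real pNFS
server. It assumes an MDS that fences a layout on the DS as soon as it
receives LAYOUTRETURN, and a DS that shares its backing store with the MDS.

```python
class ToyDataServer:
    """Hypothetical DS that honours fencing requests from the MDS."""

    def __init__(self):
        self.blocks = {}     # offset -> payload (shared backing store)
        self.fenced = set()  # layout ids the MDS has fenced

    def fence(self, layout_id):
        self.fenced.add(layout_id)

    def write(self, layout_id, offset, payload):
        # A WRITE that references a fenced layout is refused.
        # "REJECTED" is a stand-in, not a real NFS error code.
        if layout_id in self.fenced:
            return "REJECTED"
        self.blocks[offset] = payload
        return "OK"


class ToyMetadataServer:
    """Hypothetical MDS that fences the DS when a layout is returned."""

    def __init__(self, ds):
        self.ds = ds

    def layoutreturn(self, layout_id):
        self.ds.fence(layout_id)

    def write(self, offset, payload):
        # I/O through the MDS lands in the same backing store.
        self.ds.blocks[offset] = payload
        return "OK"


def replay(return_layout_first):
    """Replay the failure case: fall back to the MDS, then a stale DS WRITE lands."""
    ds = ToyDataServer()
    mds = ToyMetadataServer(ds)
    if return_layout_first:
        mds.layoutreturn("layout-1")         # option 1), done asap
    mds.write(0, "new")                      # client redirects I/O to the MDS
    status = ds.write("layout-1", 0, "old")  # lingering DS WRITE arrives late
    return status, ds.blocks[0]


print(replay(True))   # lingering WRITE is fenced, newer data survives
print(replay(False))  # lingering WRITE overwrites the newer data
```

In this model the early LAYOUTRETURN is what keeps the late DS WRITE from
clobbering the data written through the MDS; without it the stale payload
wins. Whether a given server actually fences on LAYOUTRETURN is exactly the
point under discussion, so treat that behaviour as an assumption of the
sketch.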
>
> -----Original Message-----
> From: nfsv4-bounces@xxxxxxxx [mailto:nfsv4-bounces@xxxxxxxx] On Behalf Of Boaz Harrosh
> Sent: Monday, June 11, 2012 2:41 PM
> To: Andy Adamson
> Cc: Andy Adamson; NFS list; Trond Myklebust; NFSv4
> Subject: Re: [nfsv4] RFC 5661 LAYOUTRETURN clarification.
>
> On 06/11/2012 07:01 PM, Andy Adamson wrote:
>
> > I'm coding file layout data server recovery for the Linux NFS client,
> > and came across an issue with LAYOUTRETURN that
> > could use some comment from the list.
> >
> > The error case I'm handling is an RPC layer dis-connection error
> > during heavy WRITE i/o to a file layout data server. Our response is
> > to internally mark the deviceid as invalid which prevents all pNFS
> > calls using the deviceid - e.g. no new I/O using any layout that uses
> > the invalid deviceid, and to redirect all I/O to the MDS (any queued
> > RPC request that has not been sent is redirected to the MDS).
> >
> > Plus - and here is where the clarification is needed - we immediately
> > send a LAYOUTRETURN for any layout with in-flight requests to the
> > dis-connected data server. By in-flight I mean transmitted WRT the
> > RPC layer. The purpose of this LAYOUTRETURN is to notify the file
> > layout MDS to fence the DS for the specified LAYOUTs, as the WRITEs
> > will also be sent to the MDS.
> >
>
>
> I do not disagree with this completely. The point here is very fine
> grained and should be specified explicitly. I would like to see text
> as of something like.
>
> There are 3 types of in-flight RPC/IO
> 1. Client has sent RPC header + all of associated data and is waiting
> for DS WRITE/READ_DONE reply.
>
> (For me this case can be, client may return LAYOUTRETURN as your
> suggestion)
>
> 2. Client has sent the RPC header but has got stuck sending the rest
> of the RPC message. Then received a network disconnect. This is the
> most common part. Putting aside the RPC that got the error for a second.
> The most important is what to do with parallel RPC/IO which are in this
> state. Are parallel RPCs allowed to continue sending network packets
> after the LAYOUTRETURN was sent?
>
> The specific RPC that got stuck is not interesting because it's kind of
> 1.5, We are not going to send any bytes on that channel. The interesting
> is these other DSs which are still streaming
>
> 3. Client has some internal RPC queue which due to some client parallelism
> will start sending RPC header + data after the LAYOUTRETURN was sent
>
> What my point was that with the code you submitted we are clearly violating
> 2. and even 3. Because I do not see anything avoiding this.
>
> And if the STD allows you 2 and 3. Then that's a big change to the concept.
> Not like you let it seem.
>
> > I contend that sending the LAYOUTRETURN in this error case does not
> > violate the two sections of RFC 5661 below, as the client has stopped
> > sending any I/O requests using the returned layout.
> >
>
>
> I would not mind if this was true. That is if the LAYOUTRETURN was
> a very clear barrier where our client would "magically" completely
> plug the network interface and will not continue to send a single
> byte on the wire to *any* DS involved with the layout. That's fine.
>
> That is only allow state 1 and 1.5 RPCs above. Some/all bytes were
> presented on the wire, until the LAYOUTRETURN, from which point all
> RPCs are hard aborted and not a single byte is sent.
>
>
> > Others contend that since the in-flight RPCs reference the returned
> > layout, the client is still 'using' the layout with these in-flight
> > requests, and can not call LAYOUTRETURN until all in-flight RPCs
> > return, with or without an error.
> >
>
>
> With our client code I don't see how the guaranty of 2 and 3 above
> will happen without actually implementing this here.
>
> So in principle I agree with your principle, I only do not agree
> with your practice.
> In your new code you are violating 2 and 3
> which are not to be allowed.
>
> And again, please explain why do you want it. What is wrong with the
> case we all agree with? ie: "Client can not call LAYOUTRETURN until
> all in-flight RPCs return, with or without an error"
>
> Thanks
> Boaz
>
> >
> > Section 18.44.3 - the description section of the LAYOUTRETURN operation:
> >
> > After this call,
> > the client MUST NOT use the returned layout(s) and the associated
> > storage protocol to access the file data.
> >
> > Section 13.6 Operations Sent to NFSv4.1 Data Servers
> >
> > As described in Section 12.5.1, a client
> > MUST NOT send an I/O to a data server for which it does not hold a
> > valid layout; the data server MUST reject such an I/O.
> >
> >
> > -->Andy
>
>
> _______________________________________________
> nfsv4 mailing list
> nfsv4@xxxxxxxx
> https://www.ietf.org/mailman/listinfo/nfsv4
>

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com