On Thu, 2010-12-16 at 19:42 +0200, Benny Halevy wrote:
> On 2010-12-16 19:35, Trond Myklebust wrote:
> > On Thu, 2010-12-16 at 18:24 +0200, Benny Halevy wrote:
> >> On 2010-12-16 17:55, Trond Myklebust wrote:
> >>> OK, so why not just go the whole hog and do that for all rare cases,
> >>> including the one where the server recalls a layout segment that we
> >>> happen to be doing I/O to?
> >>>
> >>> The case we should be optimising for is the one where the layout is
> >>> recalled, and no I/O to that segment is in progress. For that case,
> >>> returning OK, then doing the LAYOUTRETURN instead of just returning
> >>> NOMATCHING_LAYOUT is clearly wrong: it adds a completely unnecessary
> >>> round trip to the server. Agreed?
> >>
> >> I agree that if the client can free the recalled layout synchronously
> >> and if it need not send a LAYOUTCOMMIT or LAYOUTRETURN (e.g. in the objects case)
> >> it can simply return NFS4ERR_NOMATCHING_LAYOUT.
> >
> > Objects and blocks != wave 2. We can cross that bridge when we get to
> > it.
> >
>
> Right. This patchset is destined as post wave2.

In that case it has a very confusing title (which certainly caught me by
surprise).

>
> >>>
> >>> As for the much rarer case of a recall of a layout that is in use, how
> >>> does LAYOUTRETURN speed things up? As far as I can see, the MDS is still
> >>> going to return NFS4ERR_DELAY to the client that requested the
> >>> conflicting LAYOUTGET. That client then has to resend this LAYOUTGET
> >>> request, at a time when the first client may or may not have returned
> >>> its layout segment. So how is LAYOUTRETURN going to make all this a fast
> >>> and scalable process?
> >>>
> >>
> >> First, the server does not have to poll the client and waste cpu and network
> >> resources on that.
> >
> > ...but this is a ____rare____ case. If you are seeing noticeable effects
> > on the network from this, then something is wrong. If that is ever the
> > case, then you should be writing through the MDS anyway.
> >
> > Furthermore, the MDS does need to be able to cope with NFS4ERR_DELAY
> > anyway, so why add the extra complexity to the client?
> >
> >> Second, for the competing client, with notifications, it too does not have
> >> to poll the server and can wait on getting the notification when the
> >> layout becomes available.
> >
> > There is no notification of layout availability in RFC5661. Lock
> > notification is for byte range locks, and device id notification is for
> > device ids. The rest is for directory notifications.
> >
>
> Hmm, CB_RECALLABLE_OBJ_AVAIL in response to loga_signal_layout_avail...

Hmm indeed. Section 12.3 states:

"CB_RECALLABLE_OBJ_AVAIL (Section 20.7) tells a client that a
recallable object that it was denied (in case of pNFS, a layout denied
by LAYOUTGET) due to resource exhaustion is now available."

and 18.43.3 states:

"If client sets loga_signal_layout_avail to TRUE, then it is
registering with the client a "want" for a layout in the event the
layout cannot be obtained due to resource exhaustion."

I can't see how that is relevant to the case where a specific LAYOUTGET
requires a layout recall from another client. That's not resource exhaustion.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
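
For reference, the two client-side ways of answering CB_LAYOUTRECALL that
this thread contrasts can be sketched roughly as below. This is an
illustration under stated assumptions, not code from the patchset: the
function names, the holds_segment / in_flight_io flags and the Status enum
are hypothetical, and replying NFS4ERR_DELAY in the busy case is one reading
of the "MDS has to cope with NFS4ERR_DELAY anyway" argument. Only the status
names themselves come from RFC 5661.

```python
from enum import Enum


class Status(Enum):
    # Status names from RFC 5661; wire values are deliberately left out.
    NFS4_OK = "NFS4_OK"
    NFS4ERR_DELAY = "NFS4ERR_DELAY"
    NFS4ERR_NOMATCHING_LAYOUT = "NFS4ERR_NOMATCHING_LAYOUT"


def recall_reply_without_layoutreturn(holds_segment, in_flight_io):
    """Hypothetical policy: never follow the recall with a LAYOUTRETURN.

    Returns (reply to CB_LAYOUTRECALL, whether a LAYOUTRETURN must follow).
    """
    if not holds_segment or not in_flight_io:
        # Common case: forget the segment synchronously, so the callback
        # reply itself tells the server the layout is gone.
        return Status.NFS4ERR_NOMATCHING_LAYOUT, False
    # Rare case (assumed): I/O is in progress, ask the server to retry.
    return Status.NFS4ERR_DELAY, False


def recall_reply_with_layoutreturn(holds_segment, in_flight_io):
    """Hypothetical policy: acknowledge the recall, then send LAYOUTRETURN."""
    if not holds_segment:
        return Status.NFS4ERR_NOMATCHING_LAYOUT, False
    # Same reply whether or not I/O is in flight; the LAYOUTRETURN that
    # follows is the extra round trip objected to for the idle case.
    return Status.NFS4_OK, True


if __name__ == "__main__":
    for busy in (False, True):
        print("in_flight_io =", busy)
        print("  without LAYOUTRETURN:", recall_reply_without_layoutreturn(True, busy))
        print("  with LAYOUTRETURN:   ", recall_reply_with_layoutreturn(True, busy))
```

In the idle case the first policy finishes within the callback reply, while
the second always costs one more request to the server.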