On 2010-12-16 20:14, Trond Myklebust wrote:
> On Thu, 2010-12-16 at 19:42 +0200, Benny Halevy wrote:
>> On 2010-12-16 19:35, Trond Myklebust wrote:
>>> On Thu, 2010-12-16 at 18:24 +0200, Benny Halevy wrote:
>>>> On 2010-12-16 17:55, Trond Myklebust wrote:
>>>>> OK, so why not just go the whole hog and do that for all rare cases,
>>>>> including the one where the server recalls a layout segment that we
>>>>> happen to be doing I/O to?
>>>>>
>>>>> The case we should be optimising for is the one where the layout is
>>>>> recalled, and no I/O to that segment is in progress. For that case,
>>>>> returning OK, then doing the LAYOUTRETURN instead of just returning
>>>>> NOMATCHING_LAYOUT is clearly wrong: it adds a completely unnecessary
>>>>> round trip to the server. Agreed?
>>>>
>>>> I agree that if the client can free the recalled layout synchronously
>>>> and if it need not send a LAYOUTCOMMIT or LAYOUTRETURN (e.g. in the objects case)
>>>> it can simply return NFS4ERR_NOMATCHING_LAYOUT.
>>>
>>> Objects and blocks != wave 2. We can cross that bridge when we get to
>>> it.
>>>
>>
>> Right. This patchset is destined as post wave2.
>
> In that case it has a very confusing title (which certainly caught me by
> surprise).
>
>>
>>>>>
>>>>> As for the much rarer case of a recall of a layout that is in use, how
>>>>> does LAYOUTRETURN speed things up? As far as I can see, the MDS is still
>>>>> going to return NFS4ERR_DELAY to the client that requested the
>>>>> conflicting LAYOUTGET. That client then has to resend this LAYOUTGET
>>>>> request, at a time when the first client may or may not have returned
>>>>> its layout segment. So how is LAYOUTRETURN going to make all this a fast
>>>>> and scalable process?
>>>>>
>>>>
>>>> First, the server does not have to poll the client and waste cpu and network
>>>> resources on that.
>>>
>>> ...but this is a ____rare____ case. If you are seeing noticeable effects
>>> on the network from this, then something is wrong. If that is ever the
>>> case, then you should be writing through the MDS anyway.
>>>
>>> Furthermore, the MDS does need to be able to cope with NFS4ERR_DELAY
>>> anyway, so why add the extra complexity to the client?
>>>
>>>> Second, for the competing client, with notifications, it too does not have
>>>> to poll the server and can wait on getting the notification when the
>>>> layout becomes available.
>>>
>>> There is no notification of layout availability in RFC5661. Lock
>>> notification is for byte range locks, and device id notification is for
>>> device ids. The rest is for directory notifications.
>>>
>>
>> Hmm, CB_RECALLABLE_OBJ_AVAIL in response to loga_signal_layout_avail...
>
> Hmm indeed. Section 12.3 states:
>
> "CB_RECALLABLE_OBJ_AVAIL (Section 20.7) tells a client that a
> recallable object that it was denied (in case of pNFS, a layout denied
> by LAYOUTGET) due to resource exhaustion is now available."
>
> and 18.43.3 states:
>
> "If client sets loga_signal_layout_avail to TRUE, then it is registering
> with the client a "want" for a layout in the event the layout cannot be
> obtained due to resource exhaustion."
>
> I can't see how that is relevant to the case where a specific LAYOUTGET
> requires a layout recall from another client. That's not resource
> exhaustion.
>
>

Yeah, the phrasing is miserable. It should be useful for any reason
making the layout temporarily unavailable.
Yet another errata entry...

Benny
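
For reference, the client-side choice debated in this thread can be sketched roughly as below. This is only an illustration, not code from the patchset: the struct names, fields, and handle_layoutrecall() are hypothetical, and only the two status values are taken from RFC 5661. It covers the two replies discussed above: returning NFS4ERR_NOMATCHING_LAYOUT when the layout can be freed synchronously, and otherwise replying OK and sending LAYOUTRETURN later.

```c
#include <stdbool.h>

/* NFSv4.1 status values as defined in RFC 5661. */
enum nfsstat4 {
    NFS4_OK = 0,
    NFS4ERR_NOMATCHING_LAYOUT = 10060
};

/* Hypothetical summary of client state for the recalled range. */
struct recall_state {
    bool io_in_flight;        /* I/O to the recalled segment is outstanding */
    bool needs_layoutcommit;  /* a LAYOUTCOMMIT must be sent before release */
};

/* Hypothetical result: CB_LAYOUTRECALL reply plus any follow-up needed. */
struct recall_reply {
    enum nfsstat4 status;
    bool send_layoutreturn;   /* true if a LAYOUTRETURN must follow later */
};

struct recall_reply handle_layoutrecall(const struct recall_state *st)
{
    struct recall_reply r;

    if (!st->io_in_flight && !st->needs_layoutcommit) {
        /*
         * Common case: the layout can be freed synchronously, so
         * the callback reply itself tells the server it is gone and
         * no extra round trip is needed.
         */
        r.status = NFS4ERR_NOMATCHING_LAYOUT;
        r.send_layoutreturn = false;
    } else {
        /*
         * Rare case: the segment is in use. Acknowledge the recall
         * and return the layout once outstanding work has drained.
         */
        r.status = NFS4_OK;
        r.send_layoutreturn = true;
    }
    return r;
}
```

Whether the in-use case should instead be answered with an error and retried by the server is exactly the open question above; the sketch only shows the LAYOUTRETURN variant.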