Re: [PATCH RFC] nfsd: serialize layout stateid morphing operations

Christoph Hellwig <hch@xxxxxx> · Mon, 7 Dec 2015 14:07:16 +0100



On Sat, Dec 05, 2015 at 07:24:09AM -0500, Jeff Layton wrote:
> If we treat NFS4_OK and NFS4ERR_DELAY equivalently, then we're
> expecting the client to eventually return NFS4ERR_NOMATCHING_LAYOUT (or
> a different error) to break the cycle of retransmissions. But, HZ/100
> is enough time for the client to return a layout and request a new one.
> We may never see that error -- only a continual cycle of
> CB_LAYOUTRECALL/LAYOUTRETURN/LAYOUTGET.
> 
> I think we need a more reliable way to break that cycle so we don't end
> up looping like that. We should either cancel any active callbacks
> before reallowing LAYOUTGETs, or move the timeout handling outside of
> the RPC state machine (like Bruce was suggesting).

We block all new LAYOUTGETs as long as fi_lo_recalls is non-zero,
and we only decrement it from nfsd4_cb_layout_release.  The way I
understand the RPC state machine, that means we block new LAYOUTGETs
until we have successfully finished the recall.
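
Roughly, as a userspace sketch of that gating and not the actual nfsd
code: only fi_lo_recalls is a real name here, recall_release() stands in
for nfsd4_cb_layout_release, and struct file_model, recall_start() and
layoutget_allowed() are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the per-file state; only the counter is modelled. */
struct file_model {
	atomic_int fi_lo_recalls;	/* number of outstanding layout recalls */
};

/* Hypothetical helper: a CB_LAYOUTRECALL has been queued for this file. */
static void recall_start(struct file_model *fp)
{
	atomic_fetch_add(&fp->fi_lo_recalls, 1);
}

/*
 * Models the callback release step, assumed to be the only place
 * where the counter is decremented.
 */
static void recall_release(struct file_model *fp)
{
	int old = atomic_fetch_sub(&fp->fi_lo_recalls, 1);

	assert(old > 0);	/* a release without a matching start is a bug */
}

/* Hypothetical LAYOUTGET gate: refuse while any recall is outstanding. */
static bool layoutget_allowed(struct file_model *fp)
{
	return atomic_load(&fp->fi_lo_recalls) == 0;
}
```

In this model a LAYOUTRETURN from the client does not touch the counter,
so layoutget_allowed() stays false until recall_release() has run for
every outstanding recall.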