On 2010-10-01 04:14, Fred Isaman wrote:
> On Thu, Sep 30, 2010 at 5:52 PM, Jim Rees <rees@xxxxxxxxx> wrote:
>> Benny Halevy wrote:
>>
>>   Jim, would you mind retesting with pnfs-all-2.6.36-rc6-2010-09-30?
>>   Not that there's any possible fix there, but a fresh Oops could
>>   help, if you can reproduce it.
>>
>> Will do, probably after an important meeting I have at 6:00 this evening.
>
> There is a problem with the LAYOUTGET error handling, which is
> probably what Jim is hitting (the block servers are much more likely
> to send RETRYLATER).  I'll send in a fix tomorrow morning.

One problem I can see is that nfs4_layoutget_release frees calldata
(a.k.a. lgp) which is reused later if we retry.
We should either keep a reference count on it or clone it internally
in _nfs4_proc_layoutget for each call.  Since the calls are essentially
synchronous the caller and allocator (e.g. send_layoutget) can just free
the call data (or dereference, if we keep a refcount).

Same for layoutcommit and layoutreturn.

Benny

>
> Fred
>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
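To make the refcount option above concrete, here is a rough userspace sketch, not the actual kernel code. The struct, the helper names (calldata_alloc, calldata_get, calldata_put, issue_call, send_layoutget_sketch) and the SKETCH_RETRY value are all made up for illustration, and a plain int stands in for whatever atomic counter the real code would need. The idea is that each call attempt takes its own reference, the release step only drops that reference, and the allocator drops the last one after the retry loop is done.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_RETRY (-1)   /* hypothetical "retry later" status, not a real NFS code */

/* Hypothetical stand-in for the LAYOUTGET call data (lgp); not the kernel struct. */
struct layoutget_calldata {
    int refcount;   /* plain int: this sketch is single-threaded */
    int attempts;   /* number of times the call has been issued */
};

static int calldata_frees;  /* counts frees so the lifetime can be checked */

static struct layoutget_calldata *calldata_alloc(void)
{
    struct layoutget_calldata *lgp = calloc(1, sizeof(*lgp));

    if (lgp)
        lgp->refcount = 1;  /* reference owned by the allocator */
    return lgp;
}

static void calldata_get(struct layoutget_calldata *lgp)
{
    lgp->refcount++;
}

static void calldata_put(struct layoutget_calldata *lgp)
{
    assert(lgp->refcount > 0);
    if (--lgp->refcount == 0) {
        calldata_frees++;
        free(lgp);
    }
}

/* One call attempt. It holds its own reference for the duration of the
 * call, and the final put models what a release callback would do
 * instead of freeing the call data outright. */
static int issue_call(struct layoutget_calldata *lgp, int fail_times)
{
    int status;

    calldata_get(lgp);
    lgp->attempts++;
    status = (lgp->attempts <= fail_times) ? SKETCH_RETRY : 0;
    calldata_put(lgp);
    return status;
}

/* Caller and allocator: retries with the same call data, then drops
 * the allocator's reference, which is the one that actually frees it. */
static int send_layoutget_sketch(int fail_times, int *attempts_out)
{
    struct layoutget_calldata *lgp = calldata_alloc();
    int status;

    if (!lgp)
        return -2;  /* allocation failure, arbitrary value for the sketch */
    do {
        status = issue_call(lgp, fail_times);
    } while (status == SKETCH_RETRY);
    *attempts_out = lgp->attempts;
    calldata_put(lgp);
    return status;
}
```

With this shape, calling send_layoutget_sketch(2, &attempts) issues the call three times against the same call data and frees it exactly once, after the last attempt. The clone-per-call alternative would avoid the counter but copy the arguments on every retry.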