> -----Original Message-----
> From: Peng Tao [mailto:bergwolf@xxxxxxxxx]
> Sent: Tuesday, July 26, 2011 1:33 PM
> To: Myklebust, Trond
> Cc: tao.peng@xxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; bhalevy@xxxxxxxxxx
> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
>
> On Tue, Jul 26, 2011 at 11:50 PM, Myklebust, Trond
> <Trond.Myklebust@xxxxxxxxxx> wrote:
> >> -----Original Message-----
> >> From: Peng Tao [mailto:bergwolf@xxxxxxxxx]
> >> Sent: Tuesday, July 26, 2011 11:37 AM
> >> To: Myklebust, Trond
> >> Cc: tao.peng@xxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; bhalevy@xxxxxxxxxx
> >> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
> >>
> >> Hi, Trond,
> >>
> >> On Tue, Jul 26, 2011 at 3:13 AM, Trond Myklebust
> >> <Trond.Myklebust@xxxxxxxxxx> wrote:
> >> > On Wed, 2011-07-20 at 01:52 -0400, tao.peng@xxxxxxx wrote:
> >> >> Hi, Trond,
> >> >>
> >> >> Any comments on this patch? I still get kernel crash when pnfs
> >> >> write is attempted but fails and calls pnfs_ld_write_done(). It
> >> >> seems object layout uses the same code path as well. But I don't
> >> >> find the patch in either your tree or Benny's tree. Are there any
> >> >> concerns?
> >> >>
> >> >> Thanks,
> >> >> Tao
> >> >
> >> > The whole pnfs_ld_write_done thing is bogus and needs to be replaced
> >> > with something sane. It is trying to initiate a WRITE RPC call with
> >> > the wrong block size, and is calling the MDS rpc_call_done() and
> >> > rpc_release() with an uninitialised rpc task pointer.
> >> >
> >> > Ditto for pnfs_ld_read_done.
> >> Thanks for your explanation. Is there any plan on how to fix
> >> pnfs_ld_read/write_done? Basically, we would need an interface that
> >> can redirect the IO to MDS if pnfs_error is set or do all necessary
> >> cleanup work to end read/write if pnfs_error is 0. IMHO, the
> >> recoalesce logic need to access nfs_pageio_descriptor but we do not
> >> have that information at pnfs_ld_read/write_done.
> >
> > As far as I can see, the right thing to do is to mark the layout as
> > invalid and then redirty the page. It should be easy to have fsync()
> > re-send the pages in this case. These should be extremely rare events,
> > since we expect to catch most of the pNFS failures when we do the
> > actual LAYOUTGET in the ->pg_init().
> Agreed. This should be easier than re-coalescing and sending to MDS at
> read/write_done.
>
> >
> > My main worry is for aio/dio where there is no good mechanism for
> > retrying. I'm still working on that...
> For dio, we may have to send the failed pages to MDS instead of
> relying on next fsync() to retry.

The problem isn't what to do, it is more one of _who_ does it. The
rpciod/nfsiod queues aren't the ideal place to set up a resend since it
involves allocating memory.

Cheers
  Trond