RE: [PATCH] NFS41: Drop lseg ref before fallthru to MDS

Hi, Trond,

> -----Original Message-----
> From: linux-nfs-owner@xxxxxxxxxxxxxxx [mailto:linux-nfs-owner@xxxxxxxxxxxxxxx]
> On Behalf Of Myklebust, Trond
> Sent: Tuesday, July 26, 2011 11:50 PM
> To: Peng Tao
> Cc: Peng, Tao; linux-nfs@xxxxxxxxxxxxxxx; bhalevy@xxxxxxxxxx
> Subject: RE: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
> 
> > -----Original Message-----
> > From: Peng Tao [mailto:bergwolf@xxxxxxxxx]
> > Sent: Tuesday, July 26, 2011 11:37 AM
> > To: Myklebust, Trond
> > Cc: tao.peng@xxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; bhalevy@xxxxxxxxxx
> > Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
> >
> > Hi, Trond,
> >
> > On Tue, Jul 26, 2011 at 3:13 AM, Trond Myklebust
> > <Trond.Myklebust@xxxxxxxxxx> wrote:
> > > On Wed, 2011-07-20 at 01:52 -0400, tao.peng@xxxxxxx wrote:
> > >> Hi, Trond,
> > >>
> > >> Any comments on this patch? I still get a kernel crash when a pNFS write
> > >> is attempted but fails and calls pnfs_ld_write_done(). It seems the object
> > >> layout uses the same code path as well, but I can't find the patch in
> > >> either your tree or Benny's tree. Are there any concerns?
> > >>
> > >> Thanks,
> > >> Tao
> > >
> > > The whole pnfs_ld_write_done thing is bogus and needs to be replaced
> > > > with something sane. It is trying to initiate a WRITE RPC call with the
> > > wrong block size, and is calling the MDS rpc_call_done() and
> > > rpc_release() with an uninitialised rpc task pointer.
> > >
> > > Ditto for pnfs_ld_read_done.
> > Thanks for your explanation. Is there any plan on how to fix
> > pnfs_ld_read/write_done? Basically, we would need an interface that
> > can redirect the IO to MDS if pnfs_error is set or do all necessary
> > cleanup work to end read/write if pnfs_error is 0. IMHO, the
> > recoalesce logic needs access to the nfs_pageio_descriptor, but we do not
> > have that information in pnfs_ld_read/write_done.
> 
> As far as I can see, the right thing to do is to mark the layout as invalid and then
> redirty the page. It should be easy to have fsync() re-send the pages in this case.
> These should be extremely rare events, since we expect to catch most of the pNFS
> failures when we do the actual LAYOUTGET in the ->pg_init().
Another problem that just came to mind is readpage failures. How do we reschedule failed reads if we don't re-send them to the MDS?

Thanks,
Tao
> 
> My main worry is for aio/dio where there is no good mechanism for retrying. I'm still
> working on that...
> 
> Cheers
>   Trond