Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS

On Tue, Jul 26, 2011 at 11:50 PM, Myklebust, Trond
<Trond.Myklebust@xxxxxxxxxx> wrote:
>> -----Original Message-----
>> From: Peng Tao [mailto:bergwolf@xxxxxxxxx]
>> Sent: Tuesday, July 26, 2011 11:37 AM
>> To: Myklebust, Trond
>> Cc: tao.peng@xxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; bhalevy@xxxxxxxxxx
>> Subject: Re: [PATCH] NFS41: Drop lseg ref before fallthru to MDS
>>
>> Hi, Trond,
>>
>> On Tue, Jul 26, 2011 at 3:13 AM, Trond Myklebust
>> <Trond.Myklebust@xxxxxxxxxx> wrote:
>> > On Wed, 2011-07-20 at 01:52 -0400, tao.peng@xxxxxxx wrote:
>> >> Hi, Trond,
>> >>
>> >> Any comments on this patch? I still get kernel crash when pnfs
>> >> write is attempted but fails and calls pnfs_ld_write_done(). It
>> >> seems object layout uses the same code path as well. But I don't
>> >> find the patch in either your tree or Benny's tree. Are there any
>> >> concerns?
>> >>
>> >> Thanks,
>> >> Tao
>> >
>> > The whole pnfs_ld_write_done thing is bogus and needs to be replaced
>> > with something sane. It is trying to initiate a WRITE RPC call with
>> > the wrong block size, and is calling the MDS rpc_call_done() and
>> > rpc_release() with an uninitialised rpc task pointer.
>> >
>> > Ditto for pnfs_ld_read_done.
>> Thanks for your explanation. Is there any plan on how to fix
>> pnfs_ld_read/write_done? Basically, we would need an interface that
>> can redirect the IO to MDS if pnfs_error is set or do all necessary
>> cleanup work to end read/write if pnfs_error is 0. IMHO, the
>> recoalesce logic needs to access nfs_pageio_descriptor, but we do not
>> have that information at pnfs_ld_read/write_done.
>
> As far as I can see, the right thing to do is to mark the layout as
> invalid and then redirty the page. It should be easy to have fsync()
> re-send the pages in this case. These should be extremely rare events,
> since we expect to catch most of the pNFS failures when we do the
> actual LAYOUTGET in the ->pg_init().
Agreed. This should be easier than re-coalescing and sending to MDS at
read/write_done.
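
A minimal userspace sketch of that approach, for illustration only. It is
not kernel code: struct sketch_layout, struct sketch_page,
sketch_write_done() and sketch_fsync() are hypothetical names, and the
resend in the fsync pass is assumed to always succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not real kernel structures. */
struct sketch_layout {
    bool valid;   /* cleared once a pNFS write through this layout fails */
};

struct sketch_page {
    bool dirty;   /* set again on failure so a later flush resends it */
};

/*
 * Completion handler for a layout-driver write.
 * pnfs_error == 0: the write succeeded and the page is clean.
 * pnfs_error != 0: mark the layout invalid and redirty the page
 *                  instead of re-coalescing and resending here.
 */
static void sketch_write_done(struct sketch_layout *lo,
                              struct sketch_page *pg, int pnfs_error)
{
    assert(lo != NULL && pg != NULL);
    if (pnfs_error == 0) {
        pg->dirty = false;
        return;
    }
    lo->valid = false;
    pg->dirty = true;
}

/*
 * fsync-like pass: resend every page that is still dirty.
 * Assumption: the resend always succeeds in this sketch.
 * Returns the number of pages that were resent.
 */
static size_t sketch_fsync(struct sketch_page *pages, size_t npages)
{
    size_t resent = 0;

    for (size_t i = 0; i < npages; i++) {
        if (pages[i].dirty) {
            pages[i].dirty = false;
            resent++;
        }
    }
    return resent;
}
```

The point of the sketch is that the completion path only flips two flags
and the normal flush path does the retry, so nothing at write_done needs
nfs_pageio_descriptor.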

>
> My main worry is for aio/dio where there is no good mechanism for
> retrying. I'm still working on that...
For dio, we may have to send the failed pages to the MDS instead of
relying on the next fsync() to retry.
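
A rough userspace sketch of that dio fallback, for illustration only. All
names here (struct sketch_dio_req, sketch_send_fn, sketch_dio_write() and
the two stub senders) are hypothetical, and the stubs just simulate a
failing data server and a working MDS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical direct-I/O request: there is no page cache to redirty. */
struct sketch_dio_req {
    size_t offset;
    size_t count;
    int error;    /* 0 on success, negative on failure */
};

/* Hypothetical transport callback: returns 0 or a negative error. */
typedef int (*sketch_send_fn)(size_t offset, size_t count);

/*
 * Try the data server first. If that fails, resend the same range
 * through the MDS right away, because no later fsync() will retry it.
 */
static int sketch_dio_write(struct sketch_dio_req *req,
                            sketch_send_fn send_ds,
                            sketch_send_fn send_mds)
{
    assert(req != NULL && send_ds != NULL && send_mds != NULL);
    req->error = send_ds(req->offset, req->count);
    if (req->error != 0)
        req->error = send_mds(req->offset, req->count);
    return req->error;
}

/* Stub senders for the sketch: the DS path fails, the MDS path works. */
static int sketch_mds_calls;

static int sketch_ds_always_fails(size_t offset, size_t count)
{
    (void)offset;
    (void)count;
    return -5;
}

static int sketch_mds_ok(size_t offset, size_t count)
{
    (void)offset;
    (void)count;
    sketch_mds_calls++;
    return 0;
}
```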


Thanks,
Tao

>
> Cheers
>  Trond
>