Re: [PATCH 1/5] pNFS: recoalesce when ld write pagelist fails

Peng Tao <bergwolf@xxxxxxxxx> · Wed, 17 Aug 2011 17:44:42 +0800

Hi, Boaz,

On Wed, Aug 17, 2011 at 4:19 AM, Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote:
> On 08/16/2011 12:20 AM, tao.peng@xxxxxxx wrote:
>>
>> I tried to rewrite the write patch to handle failures inside
>> mds_ops->rpc_release. However, I get a problem w.r.t. "redirty and
>> rely on next flush". If the failed write is the *last flush*, we end
>> up with relying no one and the dirty pages are simply dropped. Do you
>> have any suggestions how to handle it?
>>
>> Thanks, Tao
>>
>
> Tao Hi.
>
> OK, I see what you mean. That would be a problem
>
> I had a totally different idea You know how today we just do:
>        nfs_initiate_write()
>
> Which is bad because it can actually dead lock because
> we are taking up the only thread that needs to service
> that call. Well I thought, with you thread-for-pnfs patch,
> can it not now work? I think it is worth a try?
The problem w/ directly calling nfs_initiate_write() is that we may
have pagelist length larger than server's rsize/wsize. So if client
sends all the pages in single READ/WRITE rpc, MDS will reject the
READ/WRITE operation. Therefore we need to recoalesce them before
re-sending to MDS.

>
> See if you can advance your thread-for-blocks-objects
> patch to current code and inject some errors. I think
> it will work this time. What do you think?
The thread-for-block-objects patch is to solve the default workqueue
deadlock problem. But it can't solve the too large IO size for MDS
problem. So I think we need all of the three patches to handle LD IO
failures.

Thanks,
Tao
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html