Re: [PATCH] pnfsblock: init pg_bsize properly

Hi, Trond and Boaz,

On Tue, Aug 23, 2011 at 8:00 AM, Myklebust, Trond
<Trond.Myklebust@xxxxxxxxxx> wrote:
>> -----Original Message-----
>> From: Boaz Harrosh [mailto:bharrosh@xxxxxxxxxxx]
>> Sent: Monday, August 22, 2011 7:52 PM
>> To: Peng Tao
>> Cc: Benny Halevy; linux-nfs@xxxxxxxxxxxxxxx; Peng Tao; Myklebust,
>> Trond; Isaman, Fred
>> Subject: Re: [PATCH] pnfsblock: init pg_bsize properly
>>
>> On 08/17/2011 02:35 AM, Peng Tao wrote:
>> > Hi, Benny and Boaz,
>> >
>> <snip>
>>
>> > In pnfs_do_multiple_reads/pnfs_do_multiple_writes, data->mds_ops will
>> > be set as desc->pg_rpc_callops, which is determined in
>> > nfs_generic_flush/nfs_generic_pagein according to desc->pg_bsize. For
>> > blocklayout, we wouldn't want to set data->mds_ops to
>> > partial_read/write ops, so I write the patch to use lseg length as
>> > pg_bsize.
>> >
>>
>> Do you mean in the case where MDS sets (pg_bsize < PAGE_SIZE) ?
>>
>> Right, that is a problem. (Theoretically, because the pNFSD-Linux
>> server does not do that. Do you have a server that does?)
No, I don't have a server that does that. But it is a server config
option and we can't force users not to change it, so it is better to
fix it on the client side.

>>
>> > LD can override pg_bsize in pg_init because
>> > nfs_pageio_reset_read_mds/nfs_pageio_reset_write_mds will reset it to
>> > server rsize/wsize if pnfs is not tried.
>> >
>>
>> So if it is the "pg_bsize < PAGE_SIZE" but pNFS-IO case then I don't
>> like your patch, at all. We should fix the generic code to behave
>> properly, and not let LDs hack their way out. (For example what about
>> objects and files LDs)
>>
>> There are a few ways you can fix the generic code. One is to override
>> the desc->pg_rpc_callops for the pNFS case to always be the same one.
>> Or override the test for (pg_bsize < PAGE_SIZE) in the pNFS case if we
>> have an lseg. Or some other clean way.
I was under the impression that for object and file layouts, partial
read/write rpc ops are still needed for DS IO when DS r/wsize is
smaller than PAGE_SIZE...
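
To make the condition under discussion concrete, here is a minimal
user-space sketch of choosing the RPC call ops purely from the
descriptor's block size. It is not the kernel code: MODEL_PAGE_SIZE,
struct model_pageio_desc, model_pick_ops() and model_ld_pg_init() are
hypothetical names that only model desc->pg_bsize, desc->pg_rpc_callops
and the LD's pg_init as described in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's types or symbols. */
#define MODEL_PAGE_SIZE 4096u

enum model_rpc_ops { MODEL_OPS_PARTIAL, MODEL_OPS_FULL };

struct model_pageio_desc {
	size_t bsize;            /* models desc->pg_bsize */
	enum model_rpc_ops ops;  /* models desc->pg_rpc_callops */
};

/* Choose the call ops from the block size alone (assumed rule:
 * anything smaller than a page gets the partial ops). */
static void model_pick_ops(struct model_pageio_desc *desc)
{
	desc->ops = (desc->bsize < MODEL_PAGE_SIZE) ? MODEL_OPS_PARTIAL
						    : MODEL_OPS_FULL;
}

/* Models what the patch does in the LD's pg_init: take the layout
 * segment length as the block size instead of the server r/wsize. */
static void model_ld_pg_init(struct model_pageio_desc *desc, size_t lseg_len)
{
	desc->bsize = lseg_len;
}
```

With a server r/wsize of 1024 the model picks the partial ops; after
model_ld_pg_init() with a 1MB lseg it picks the full ops. That is the
effect the patch relies on, and it also shows why a fix in the generic
selection step would cover every LD at once.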

>>
>> But please don't fix it like that, inside each LD driver.
>>
>> [ Trond Fred
>>   One thing I do not understand about the files-layout operations. You
>>   have explained in the past that the r/wsize sent from the MDS is also
>>   the same one for each DS. So if we take an example of rsize being 2MB
>>   and there is a striping of 2 DS for that layout (say
>>   strip_unit==rsize), then we need to read 1/2 of that page from one DS
>>   and the other half from the second. Will current partial_read/write
>>   work if going through files-LD?
>> ]
>
> No. The stripe size may be smaller than the r/wsize, in which case we're in the same boat as the blocks and objects.
So this is a generic issue. For the file and object layouts, do you
need to use the partial read/write rpc ops in any case? For the block
layout, we would like the LD never to use them. But I'm not sure about
the file and object cases. Could you confirm?
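
As an illustration of the stripe-size case Trond mentions, the sketch
below splits a byte range at stripe-unit boundaries and reports which
data server would serve each piece. It assumes plain round-robin
striping, which is a simplification and not necessarily how a given
layout maps stripes to DSes; struct model_chunk and
model_split_by_stripe() are hypothetical names, not kernel symbols.

```c
#include <stddef.h>

/* Hypothetical description of one piece of a request. */
struct model_chunk {
	unsigned int ds_index;  /* data server assumed to serve the piece */
	size_t offset;          /* file offset where the piece starts */
	size_t len;             /* length of the piece in bytes */
};

/* Split [offset, offset + len) at stripe-unit boundaries, assuming
 * simple round-robin striping over num_ds data servers. Writes at most
 * max_chunks entries to out and returns how many were written. */
static size_t model_split_by_stripe(size_t offset, size_t len,
				    size_t stripe_unit, unsigned int num_ds,
				    struct model_chunk *out, size_t max_chunks)
{
	size_t n = 0;

	if (stripe_unit == 0 || num_ds == 0)
		return 0;

	while (len > 0 && n < max_chunks) {
		size_t room = stripe_unit - (offset % stripe_unit);
		size_t take = len < room ? len : room;

		out[n].ds_index = (unsigned int)((offset / stripe_unit) % num_ds);
		out[n].offset = offset;
		out[n].len = take;
		n++;

		offset += take;
		len -= take;
	}
	return n;
}
```

For a 4096-byte page at offset 0 with a 2048-byte stripe unit and two
DSes, the model yields two pieces, one per DS, so a single page needs
I/O to both servers even when the r/wsize is much larger than a page.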

Thanks,
Tao