RE: [PATCH] pnfsblock: init pg_bsize properly

> -----Original Message-----
> From: Boaz Harrosh [mailto:bharrosh@xxxxxxxxxxx]
> Sent: Monday, August 22, 2011 7:52 PM
> To: Peng Tao
> Cc: Benny Halevy; linux-nfs@xxxxxxxxxxxxxxx; Peng Tao; Myklebust,
> Trond; Isaman, Fred
> Subject: Re: [PATCH] pnfsblock: init pg_bsize properly
> 
> On 08/17/2011 02:35 AM, Peng Tao wrote:
> > Hi, Benny and Boaz,
> >
> <snip>
> 
> > In pnfs_do_multiple_reads/pnfs_do_multiple_writes, data->mds_ops will
> > be set as desc->pg_rpc_callops, which is determined in
> > nfs_generic_flush/nfs_generic_pagein according to desc->pg_bsize. For
> > blocklayout, we wouldn't want to set data->mds_ops to
> > partial_read/write ops, so I wrote the patch to use the lseg length as
> > pg_bsize.
> >
> 
> Do you mean in the case where MDS sets (pg_bsize < PAGE_SIZE) ?
> 
> Right, that is a problem. (Theoretically, because the pNFSD-Linux server
> does not do that. Do you have a server that does?)
> 
> > LD can override pg_bsize in pg_init because
> > nfs_pageio_reset_read_mds/nfs_pageio_reset_write_mds will reset it to
> > server rsize/wsize if pnfs is not tried.
> >
> 
> So if it is the "pg_bsize < PAGE_SIZE" but pNFS-IO case then I don't
> like your patch, at all. We should fix the generic code to behave
> properly, and not let LDs hack their way out. (For example what about
> objects and files LDs)
> 
> There are a few ways you can fix the generic code. One is to override
> desc->pg_rpc_callops for the pNFS case to always be the same one. Or
> override the test for (pg_bsize < PAGE_SIZE) in the pNFS case if we
> have an lseg. Or some other clean way.
> 
> But please don't fix it like that, inside each LD driver.
> 
> [ Trond, Fred:
>   One thing I do not understand about the files-layout operations. You
>   have explained in the past that the r/wsize sent from the MDS is also
>   the same one for each DS. So take an example where rsize is 2MB and
>   the layout is striped across 2 DSes (say strip_unit==rsize). Then we
>   need to read the first half of that page from one DS and the second
>   half from the other. Will the current partial_read/write work if going
>   through the files LD?
> ]

No. The stripe size may be smaller than the r/wsize, in which case we're in the same boat as the blocks and objects.

Cheers
  Trond
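
The case described in the reply (a stripe unit smaller than the r/wsize, or even smaller than a page) can be illustrated with a little arithmetic. The following is only a minimal sketch, not kernel code and not something posted in this thread: the helper name split_by_stripe and the simple round-robin mapping are assumptions made for illustration, and a real files layout also has details such as a first stripe index and dense/sparse packing that are ignored here.

```python
def split_by_stripe(offset, length, stripe_unit, num_ds):
    """Split a byte range into per-data-server chunks.

    Hypothetical helper: assumes plain round-robin striping, where
    stripe number N lives on data server N % num_ds.
    Returns a list of (ds_index, file_offset, chunk_length) tuples.
    """
    chunks = []
    end = offset + length
    while offset < end:
        stripe_no = offset // stripe_unit
        ds_index = stripe_no % num_ds
        stripe_end = (stripe_no + 1) * stripe_unit
        # A chunk stops at the end of the request or at the stripe boundary,
        # whichever comes first.
        chunk_len = min(end, stripe_end) - offset
        chunks.append((ds_index, offset, chunk_len))
        offset += chunk_len
    return chunks


PAGE_SIZE = 4096  # assumed page size for the example

# With a stripe unit of half a page, one page of I/O is split across
# two data servers, regardless of how large rsize/wsize is.
print(split_by_stripe(0, PAGE_SIZE, PAGE_SIZE // 2, 2))
```

Under these assumptions a single page ends up as two sub-page requests to two different servers, which is the same situation the blocks and objects layout drivers are in.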


