> -----Original Message-----
> From: Boaz Harrosh [mailto:bharrosh@xxxxxxxxxxx]
> Sent: Monday, August 22, 2011 7:52 PM
> To: Peng Tao
> Cc: Benny Halevy; linux-nfs@xxxxxxxxxxxxxxx; Peng Tao; Myklebust,
> Trond; Isaman, Fred
> Subject: Re: [PATCH] pnfsblock: init pg_bsize properly
>
> On 08/17/2011 02:35 AM, Peng Tao wrote:
> > Hi, Benny and Boaz,
> >
> <snip>
> >
> > In pnfs_do_multiple_reads/pnfs_do_multiple_writes, data->mds_ops will
> > be set as desc->pg_rpc_callops, which is determined in
> > nfs_generic_flush/nfs_generic_pagein according to desc->pg_bsize. For
> > blocklayout, we wouldn't want to set data->mds_ops to
> > partial_read/write ops, so I wrote the patch to use lseg length as
> > pg_bsize.
> >
>
> Do you mean in the case where MDS sets (pg_bsize < PAGE_SIZE)?
> Right, that is a problem. (Theoretically, because the pNFSD-Linux server
> does not do that. Do you have a Server that does?)
>
> > LD can override pg_bsize in pg_init because
> > nfs_pageio_reset_read_mds/nfs_pageio_reset_write_mds will reset it to
> > server rsize/wsize if pnfs is not tried.
> >
>
> So if it is the "pg_bsize < PAGE_SIZE" but pNFS-IO case then I don't
> like your patch, at all. We should fix the generic code to behave
> properly, and not let LDs hack their way out. (For example, what about
> objects and files LDs?)
>
> There are a few ways you can fix the generic code. One is to override
> the desc->pg_rpc_callops for the pNFS case to always be the same one. Or
> override the test for (pg_bsize < PAGE_SIZE) in the pNFS case if we have
> an lseg. Or some other clean way.
>
> But please don't fix it like that, inside each LD driver.
>
> [ Trond, Fred
> One thing I do not understand about the files-layout operations. You
> have explained in the past that r/wsize sent from the MDS is also the
> same one for each DS.
> So if we take an example of rsize being 2MB
> and there is a striping of 2 DS for that layout. (Say strip_unit==rsize)
> Then we need to read 1/2 of that page from one DS and the other half
> from the second. Will current partial_read/write work if going through
> files-LD?
> ]

No. The stripe size may be smaller than the r/wsize, in which case we're in the same boat as the blocks and objects.

Cheers
  Trond
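
To make the "stripe size smaller than r/wsize" case concrete, here is a minimal sketch of the arithmetic only. It is not kernel code and not the files-layout implementation. The function name `split_by_stripe`, its parameters, and the plain round-robin mapping of stripe units to data servers are all assumptions made for illustration.

```python
MB = 1024 * 1024

def split_by_stripe(offset, length, stripe_unit, num_ds):
    """Split the byte range [offset, offset + length) at stripe-unit
    boundaries.

    Returns a list of (ds_index, chunk_offset, chunk_length) tuples.
    Assumption: stripe units are handed out to data servers round-robin,
    which is a simplification of any real layout.
    """
    chunks = []
    end = offset + length
    while offset < end:
        stripe_no = offset // stripe_unit            # stripe unit holding 'offset'
        stripe_end = (stripe_no + 1) * stripe_unit   # first byte past that unit
        n = min(end, stripe_end) - offset            # bytes left inside this unit
        chunks.append((stripe_no % num_ds, offset, n))
        offset += n
    return chunks

# One rsize-sized (2MB) request over a 1MB stripe unit and 2 data servers
# must be split into two pieces, one per server.
pieces = split_by_stripe(0, 2 * MB, 1 * MB, 2)
```

With the stripe unit equal to the request size, the same call yields a single chunk. With a smaller stripe unit, one r/wsize-sized request no longer maps to one data server, which is the situation Trond describes as being "in the same boat" as blocks and objects.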