On Wed, 2012-05-02 at 08:39 +1000, NeilBrown wrote: > On Tue, 1 May 2012 18:06:23 -0400 Trond Myklebust > <Trond.Myklebust@xxxxxxxxxx> wrote: > > > While the use of READDIRPLUS is significantly more efficient than > > READDIR followed by many LOOKUP calls, it is still less efficient > > than just READDIR if the attributes are not required. > > > > This patch tracks when lookups are attempted on the directory, > > and uses that information to selectively disable READDIRPLUS > > on that directory. > > The first 'readdir' call is always served using READDIRPLUS. > > Subsequent calls only use READDIRPLUS if there was a successful > > lookup or revalidation on a child in the mean time. > > > > Credit for the original idea should go to Neil Brown. See: > > http://www.spinics.net/lists/linux-nfs/msg19996.html > > However, the implementation in this patch differs from Neil's > > in that it focuses on tracking lookups rather than calls to > > stat(). > > > > Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> > > Cc: Neil Brown <neilb@xxxxxxx> > > Thanks! Looks good. > Tracking lookups rather than getattr does make sense - more general. Right. It is more in line with the original intention of READDIRPLUS, which is to serve as a GANG_LOOKUP rather than a 'ls -l' substitute... > > @@ -874,7 +897,9 @@ static int nfs_readdir(struct file *filp, void *dirent, filldir_t filldir) > > desc->file = filp; > > desc->dir_cookie = &dir_ctx->dir_cookie; > > desc->decode = NFS_PROTO(inode)->decode_dirent; > > - desc->plus = NFS_USE_READDIRPLUS(inode); > > + desc->plus = 0; > > + if (nfs_use_readdirplus(inode, filp)) > > + desc->plus = 1; > > > > This looks a bit odd to me. > > desc->plus = nfs_use_readdirplus(inode, filp): > ?? > Or if you think there is a type conflict: > desc->plus = nfs_use_readdirplus(inode, file) ? 1 : 0; desc->plus is a bit field, so the compiler will be transforming the above anyway. I therefore figure we just want whatever is easiest on the eye. > But maybe you prefer the original - your choice of course. > > Also, the current code will clear NFS_INO_ADVISE_RDPLUS if it gets -ETOOSMALL, > and it will stay cleared until the inode falls out of cache and is reloaded. > The new code will set it again a lot sooner. Is that likely to be a problem > (I don't remember what conditions lead to ETOOSMALL)? > In fact I think it will ignore the ETOOSMALL and always retry with > readdirplus for the first read in a directory. That doesn't seem good ? ETOOSMALL means that there isn't enough buffer space to fill a full record. Given that we now provide > 32k buffer space, that seems like it would be extremely difficult to produce. Even a Linux server, with its 4k total limit should be able to provide at least 1 readdirplus record. If we do find a directory that keeps triggering the ETOOSMALL, then I suppose we can add a NFS_INO_FORBID_RDPLUS flag to turn readdirplus off permanently for that directory. However until I find that particular beast, then I'd be inclined to wait. -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@xxxxxxxxxx www.netapp.com ��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥