Re: [PATCH 09/11] NFS: Improve performance of listing directories being modified

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 2, 2020 at 11:22 AM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, 2020-11-02 at 08:50 -0500, Dave Wysochanski wrote:
> > A process can hang forever to 'ls -l' a directory while the directory
> > is being modified such as another NFS client adding files to the
> > directory.  The problem is seen specifically with larger directories
> > (I tested with 1 million) and/or slower NFS server responses to
> > READDIR.  If a combination of the NFS directory size, the NFS server
> > responses to READDIR is such that the 'ls' process gets partially
> > through the listing before the attribute cache expires (time
> > exceeds acdirmax), we drop the pagecache and have to re-fill it,
> > and as a result, the process may never complete.  One could argue
> > for larger directories the acdirmin/acdirmax should be increased,
> > but it's not always possible to tune this effectively.
> >
> > The root cause of this problem is due to how the NFS readdir cache
> > currently works.  The main search function,
> > readdir_search_pagecache(),
> > always starts searching at page_index and cookie == 0, and for any
> > page not in the cache, fills in the page with entries obtained in
> > a READDIR NFS call.  If a page already exists, we proceed to
> > nfs_readdir_search_for_cookie(), which searches for the cookie
> > (pos) of the readdir call.  The search is O(n), where n is the
> > directory size before the cookie in question is found, and every
> > entry to nfs_readdir() pays this penalty, irrespective of the
> > current directory position (dir_context.pos).  The search is
> > expensive due to the opaque nature of readdir cookies, and the fact
> > that no mapping (hash) exists from cookies to pages.  In the case
> > of a directory being modified, the above behavior can become an
> > excessive penalty, since the same process is forced to fill pages it
> > may be no longer interested in (the entries were passed in a previous
> > nfs_readdir call), and this can essentially lead no forward progress.
> >
> > To fix this problem, at the end of nfs_readdir(), save the page_index
> > corresponding to the directory position (cookie) inside the process's
> > nfs_open_dir_context.  Then at the next entry of nfs_readdir(), use
> > the saved page_index as the starting search point rather than
> > starting
> > at page_index == 0.  Not only does this fix the problem of listing
> > a directory being modified, it also significantly improves
> > performance
> > in the unmodified case since no extra search penalty is paid at each
> > entry to nfs_readdir().
> >
> > In the case of lseek, since there is no hash or other mapping from a
> > cookie value to the page->index, just reset
> > nfs_open_dir_context.page_index
> > to 0, which will reset the search to the old behavior.
> >
> > Signed-off-by: Dave Wysochanski <dwysocha@xxxxxxxxxx>
> > ---
> >  fs/nfs/dir.c           | 8 +++++++-
> >  include/linux/nfs_fs.h | 1 +
> >  2 files changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > index 52e06c8fc7cd..b266f505b521 100644
> > --- a/fs/nfs/dir.c
> > +++ b/fs/nfs/dir.c
> > @@ -78,6 +78,7 @@ static struct nfs_open_dir_context
> > *alloc_nfs_open_dir_context(struct inode *dir
> >                 ctx->attr_gencount = nfsi->attr_gencount;
> >                 ctx->dir_cookie = 0;
> >                 ctx->dup_cookie = 0;
> > +               ctx->page_index = 0;
> >                 ctx->cred = get_cred(cred);
> >                 spin_lock(&dir->i_lock);
> >                 if (list_empty(&nfsi->open_files) &&
> > @@ -763,7 +764,7 @@ int
> > find_and_lock_cache_page(nfs_readdir_descriptor_t *desc)
> >         return res;
> >  }
> >
> > -/* Search for desc->dir_cookie from the beginning of the page cache
> > */
> > +/* Search for desc->dir_cookie starting at desc->page_index */
> >  static inline
> >  int readdir_search_pagecache(nfs_readdir_descriptor_t *desc)
> >  {
> > @@ -885,6 +886,8 @@ static int nfs_readdir(struct file *file, struct
> > dir_context *ctx)
> >                 .ctx = ctx,
> >                 .dir_cookie = &dir_ctx->dir_cookie,
> >                 .plus = nfs_use_readdirplus(inode, ctx),
> > +               .page_index = dir_ctx->page_index,
> > +               .last_cookie = nfs_readdir_use_cookie(file) ? ctx-
> > >pos : 0,
> >         },
> >                         *desc = &my_desc;
> >         int res = 0;
> > @@ -938,6 +941,7 @@ static int nfs_readdir(struct file *file, struct
> > dir_context *ctx)
> >  out:
> >         if (res > 0)
> >                 res = 0;
> > +       dir_ctx->page_index = desc->page_index;
> >         trace_nfs_readdir_exit(inode, ctx->pos, dir_ctx->dir_cookie,
> >                                NFS_SERVER(inode)->dtsize,
> > my_desc.plus, res);
> >         return res;
> > @@ -975,6 +979,8 @@ static loff_t nfs_llseek_dir(struct file *filp,
> > loff_t offset, int whence)
> >                 else
> >                         dir_ctx->dir_cookie = 0;
> >                 dir_ctx->duped = 0;
> > +               /* Force readdir_search_pagecache to start over */
> > +               dir_ctx->page_index = 0;
> >         }
> >         inode_unlock(inode);
> >         return offset;
> > diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> > index a2c6455ea3fa..0e55c0154ccd 100644
> > --- a/include/linux/nfs_fs.h
> > +++ b/include/linux/nfs_fs.h
> > @@ -93,6 +93,7 @@ struct nfs_open_dir_context {
> >         __u64 dir_cookie;
> >         __u64 dup_cookie;
> >         signed char duped;
> > +       unsigned long   page_index;
> >  };
> >
> >  /*
>
> NACK. It makes no sense to store the page index as a cursor.
>

A similar thing was done recently with:
227823d2074d nfs: optimise readdir cache page invalidation




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux