On Sat, Nov 7, 2020 at 9:14 AM <trondmy@xxxxxxxxxx> wrote:
>
> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>
> If the directory is changing, causing the page cache to get invalidated
> while we are listing the contents, then the NFS client is currently forced
> to read in the entire directory contents from scratch, because it needs
> to perform a linear search for the readdir cookie. While this is not
> an issue for small directories, it does not scale to directories with
> millions of entries.
> In order to be able to deal with large directories that are changing,
> add a heuristic to ensure that if the page cache is empty, and we are
> searching for a cookie that is not the zero cookie, we just default to
> performing uncached readdir.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> ---
>  fs/nfs/dir.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 238872d116f7..d7a9efd31ecd 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -917,11 +917,28 @@ static int find_and_lock_cache_page(struct nfs_readdir_descriptor *desc)
>  	return res;
>  }
>
> +static bool nfs_readdir_dont_search_cache(struct nfs_readdir_descriptor *desc)
> +{
> +	struct address_space *mapping = desc->file->f_mapping;
> +	struct inode *dir = file_inode(desc->file);
> +	unsigned int dtsize = NFS_SERVER(dir)->dtsize;
> +	loff_t size = i_size_read(dir);
> +
> +	/*
> +	 * Default to uncached readdir if the page cache is empty, and
> +	 * we're looking for a non-zero cookie in a large directory.
> +	 */
> +	return desc->dir_cookie != 0 && mapping->nrpages == 0 && size > dtsize;
> +}
> +
>  /* Search for desc->dir_cookie from the beginning of the page cache */
>  static int readdir_search_pagecache(struct nfs_readdir_descriptor *desc)
>  {
>  	int res;
>
> +	if (nfs_readdir_dont_search_cache(desc))
> +		return -EBADCOOKIE;
> +
>  	do {
>  		if (desc->page_index == 0) {
>  			desc->current_index = 0;
> --
> 2.28.0
>

I did a lot of testing yesterday and last night and this mostly
behaves as designed.

However, before you sent this I was starting to test the following
patch, which adds an NFS_DIR_CONTEXT_UNCACHED flag inside
nfs_open_dir_context. I was not sure about the logic for when to
turn it on, so for now I'd ignore that (especially nrpages >
NFS_READDIR_UNCACHED_THRESHOLD). However, I'm curious why:
1. you didn't take the approach of adding a per-process context
flag, so that once a process hits this condition, the process would
just shift to uncached and be unaffected by any other process. I
wonder about multiple directory listing processes defeating this
logic if it's not per-process, so we may still get an unbounded time.
2. you put the logic inside readdir_search_pagecache rather than
inside the calling do { .. } while loop.

commit a56ff638fe696929a1bc633b22e2d9bd05f3c308
Author: Dave Wysochanski <dwysocha@xxxxxxxxxx>
Date:   Fri Nov 6 08:32:41 2020 -0500

    NFS: Use uncached readdir if we drop the pagecache with larger directories

    Larger directories can get into a problem where they do not make
    forward progress once the pagecache times out via exceeding
    acdirmax. Alleviate this problem by shifting to uncached
    readdir if we drop the pagecache on a larger directory.

    Signed-off-by: Dave Wysochanski <dwysocha@xxxxxxxxxx>

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index ca30e2dbb9c3..7f43f75d5b76 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -78,6 +78,7 @@ static struct nfs_open_dir_context *alloc_nfs_open_dir_context(struct inode *di
 	ctx->attr_gencount = nfsi->attr_gencount;
 	ctx->dir_cookie = 0;
 	ctx->dup_cookie = 0;
+	ctx->flags = 0;
 	spin_lock(&dir->i_lock);
 	if (list_empty(&nfsi->open_files) &&
 	    (nfsi->cache_validity & NFS_INO_DATA_INVAL_DEFER))
@@ -1023,6 +1024,8 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	struct nfs_open_dir_context *dir_ctx = file->private_data;
 	struct nfs_readdir_descriptor *desc;
 	int res;
+	unsigned long nrpages;
+#define NFS_READDIR_UNCACHED_THRESHOLD 1024

 	dfprintk(FILE, "NFS: readdir(%pD2) starting at cookie %llu\n",
 			file, (long long)ctx->pos);
@@ -1035,9 +1038,25 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	 * revalidate the cookie.
 	 */
 	if (ctx->pos == 0 || nfs_attribute_cache_expired(inode)) {
+		nrpages = inode->i_mapping->nrpages;
 		res = nfs_revalidate_mapping(inode, file->f_mapping);
 		if (res < 0)
 			goto out;
+		/*
+		 * If we just dropped the pagecache, and we're not
+		 * at the start of the directory, use uncached.
+		 */
+		if (!test_bit(NFS_DIR_CONTEXT_UNCACHED, &dir_ctx->flags) &&
+		    ctx->pos != 0 &&
+		    !inode->i_mapping->nrpages &&
+		    nrpages > NFS_READDIR_UNCACHED_THRESHOLD) {
+			set_bit(NFS_DIR_CONTEXT_UNCACHED, &dir_ctx->flags);
+			printk("NFS: DBG setting NFS_DIR_CONTEXT_UNCACHED ctx->pos = %lld nrpages
+		}
+	}
+	if (test_bit(NFS_DIR_CONTEXT_UNCACHED, &dir_ctx->flags) && ctx->pos == 0) {
+		clear_bit(NFS_DIR_CONTEXT_UNCACHED, &dir_ctx->flags);
+		printk("NFS: DBG clearing NFS_DIR_CONTEXT_UNCACHED");
 	}

 	res = -ENOMEM;
@@ -1057,7 +1076,10 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
 	spin_unlock(&file->f_lock);

 	do {
-		res = readdir_search_pagecache(desc);
+		if (test_bit(NFS_DIR_CONTEXT_UNCACHED, &dir_ctx->flags))
+			res = -EBADCOOKIE;
+		else
+			res = readdir_search_pagecache(desc);

 		if (res == -EBADCOOKIE) {
 			res = 0;
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 681ed98e4ba8..fedcfec94d95 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -98,6 +98,8 @@ struct nfs_open_dir_context {
 	__u64 dir_cookie;
 	__u64 dup_cookie;
 	signed char duped;
+	unsigned long flags;
+#define NFS_DIR_CONTEXT_UNCACHED (1)
 };

 /*