Re: I can't get no readdir satisfaction

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 24, 2016 at 12:18:04PM +0000, Trond Myklebust wrote:
> 
> > On Aug 23, 2016, at 17:21, Benjamin Coddington <bcodding@xxxxxxxxxx> wrote:
> > 
> > 
> > 
> > On 23 Aug 2016, at 11:36, Trond Myklebust wrote:
> > 
> >>> On Aug 23, 2016, at 11:09, Benjamin Coddington <bcodding@xxxxxxxxxx> wrote:
> >>> 
> >>> Hi linux-nfs,
> >>> 
> >>> 311324ad1713 ("NFS: Be more aggressive in using readdirplus for 'ls -l'
> >>> situations") changed when nfs_readdir() decides to revalidate the
> >>> directory's mapping, which contains all the entries.  In addition to just
> >>> checking if the attribute cache has expired, it includes a check to see if
> >>> NFS_INO_INVALID_DATA is set on the directory.
> >>> 
> >>> Well, customers that have directories with very many dentries and that same
> >>> directory's attributes are frequently updated are now grouchy that `ls -l`
> >>> takes so long since any update of the directory causes the mapping to be
> >>> invalidated and we have to start over filling the directory's mapping.
> >>> 
> >>> I actually haven't put real hard thought into it yet (because often for me
> >>> that just wastes a lot of time), so I am doing the lazy thing by asking this
> >>> question:
> >>> 
> >>> Can we go back to just the using the attribute cache timeout, or should we
> >>> get all heuristical about readdir?
> >>> 
> >> 
> >> We are all heuristical at this point. How are the heuristics failing?
> >> 
> >> The original problem those heuristics were designed to solve was that all
> >> the stat() calls took forever to complete, since they are all synchronous;
> >> Tigran showed some very convincing numbers for a large directory where the
> >> difference in performance was an order of magnitude improved by using
> >> readdirplus instead of readdir…
> > 
> > I'll try to present a better explanation.  While `ls -l` is walking through
> > a directory repeatedly entering nfs_readdir(), a CREATE response send us
> > through nfs_post_op_update_inode_locked():
> > 
> > 1531     if (S_ISDIR(inode->i_mode))
> > 1532         invalid |= NFS_INO_INVALID_DATA;
> > 1533     nfs_set_cache_invalid(inode, invalid);
> > 
> > Now, the next entry into nfs_readdir() has us do nfs_revalidate_mapping(),
> > which will do nfs_invalidate_mapping() for the directory, and so we have to
> > start over with cookie 0 sending READDIRPLUS to re-fill the directory's
> > mapping to get back to where we are for the current nfs_readdir().
> > 
> > This process repeats for every entry into nfs_readdir() if the directory
> > keeps getting updated, and it becomes more likely that it will be updated as
> > each pass takes longer and longer to re-fill the mapping as the current
> > nfs_readdir() invocation is further along.
> > 
> > READDIRPLUS isn't the problem, the problem is invalidating the directory
> > mapping in the middle of a series of getdents() if we do a CREATE.  Also, I
> > think a similar thing happens if the directory's mtime or ctime is updated -
> > but in that case we set NFS_INO_INVALID_DATA because the change_attr
> > updates.
> > 
> > So, for directories with a large number of entries that updates often, it can
> > be very slow to list the directory.
> > 
> > Why did 311324ad1713 change nfs_readdir from
> > 
> > if (nfs_attribute_cache_expired(inode))
> >    nfs_revalidate_mapping(inode, file->f_mapping);
> > to
> > 
> > if (nfs_attribute_cache_expired(inode) || nfsi->cache_validity & NFS_INO_INVALID_DATA)
> >    nfs_revalidate_mapping(inode, file->f_mapping);
> 
> As the commit message says, the whole purpose was to use READDIRPLUS as a substitute for multiple GETATTR calls when the heuristic tells us that the user is performing an ‘ls -l’ style of workload.
> 
> > 
> > .. and can we go back to the way it was before?
> 
> Not without slowing down ‘ls -l’ on large directories.
> 
> > 
> > OK.. I understand why -- it is more correct since if we know the directory has
> > changed, we might as well fetch the change.  Otherwise, we might be creating
> > files and then wondering why they aren't listed.
> > 
> > It might be nicer to not invalidate the mapping we're currently using for
> > readdir, though.  Maybe there's a way to keep the mapping for the currently
> > opened directory and invalidate it once it's closed.
> 

>  POSIX requires that you revalidate on opendir() and rewinddir(), and
>  leaves behaviour w.r.t. file addition and removal after the call to
>  opendir()/rewinddir() undefined
>  (http://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html).

It is only undefined whether the added or removed entry is returned.
Other entries still need to returned exactly once.

In this case we're restarting the read of the directory from scratch--I
don't understand how that's possible while avoiding skipped or
duplicated entries.

Surely the only safe thing to do is to continue reading using the last
cookie returned from the server.

--b.

>  I believe most filesystems under Linux will ensure that if a file is
>  removed, it is no longer seen in the readdir stream, and a file that
>  is added will be seen unless the readdir cursor has already passed
>  it, so there is that to consider.

> 
> However correctness was not the main issue here.
> 
> N?????r??y????b?X??ǧv?^?)޺{.n?+????{???"??^n?r???z???h?????&???G???h?(?階?ݢj"???m??????z?ޖ???f???h???~?m
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux