I tested and you're right, it sends a GETATTR flood on the second `ls -la`. I should have tested more than a single scenario. On Thu, Jul 18, 2019 at 4:55 PM Benjamin Coddington <bcodding@xxxxxxxxxx> wrote: > > The one thing we'd need to be careful about here is if there are > existing > cached dentries without the attributes we want, then we run through > nfs_advise_use_readdirplus(), setting the flag, then the > nfs_force_readdirplus() fails to drop all the existing dentries.. > > That looks like a `ls <dir>`, then a stat of one of those > cached dentries, then an `ls -la` would look like a flood of GETATTR for > every entry instead of a series of READDIRPLUS. > > Any chance you still have that test system? What happens if you do: > > ls <big dir> > stat <big dir>/file > ls -la <big dir> > > I'm worried that will leave you stuck in the GETATTR flood instead of > using > the more optimal READDIRPLUS. > > If I'm going to send this patch, I should also test it, so I suppose I > might > just build and test this myself.. > > Ben > > On 18 Jul 2019, at 3:26, Noam Lewis wrote: > > > The patch seems to work nicely. getdents() doesn't get stuck / > > iteration does not restart when accessing a non-cached file. > > > > Thanks! > > > > Can we agree this is the more correct behavior? > > What's the next step? > > > > On Wed, Jul 17, 2019 at 3:08 PM Benjamin Coddington > > <bcodding@xxxxxxxxxx> wrote: > >> > >> Maybe because we always drop the directory's page cache whenever we > >> decide > >> to switch to READDIRPLUS, even if we're already doing it.. I've not > >> tested > >> or checked very thoroughly if this is ok, but I think maybe want > >> something > >> like this: > >> > >> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c > >> index 57b6a45576ad..acfe47668238 100644 > >> --- a/fs/nfs/dir.c > >> +++ b/fs/nfs/dir.c > >> @@ -438,8 +438,8 @@ void nfs_force_use_readdirplus(struct inode *dir) > >> struct nfs_inode *nfsi = NFS_I(dir); > >> > >> if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) && > >> - !list_empty(&nfsi->open_files)) { > >> - set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags); > >> + !list_empty(&nfsi->open_files) && > >> + !test_and_set_bit(NFS_INO_ADVISE_RDPLUS, > >> &nfsi->flags)) > >> { > >> invalidate_mapping_pages(dir->i_mapping, 0, -1); > >> } > >> } > >> > >> Want to give that a spin? > >> > >> Ben > >> > >> On 17 Jul 2019, at 2:44, Noam Lewis wrote: > >> > >>> I'm starting to think this is a bug. I can't see a good reason why > >>> accessing (stat) a directory entry should cause the READDIRPLUS > >>> cookie > >>> to be reset. > >>> > >>> It seems that the trigger for iteration reset is accessing a > >>> directory > >>> entry that doesn't have a valid entry in the cache. If it does has a > >>> valid cache entry it doesn't trigger the cookie reset. Note, it > >>> doesn't matter if the entry (or traversed dir) has actually changed: > >>> the reset occurs even if both did not change on the server side. > >>> > >>> Setting actimeo to a large enough value allows the dir iteration to > >>> complete without any resets, but this is just a workaround that > >>> isn't > >>> acceptable if the file system is being modified or if there isn't > >>> enough memory. It's also heuristic and can lead to unexpected > >>> hiccups > >>> if something in the environment changes. > >>> > >>> So my questions still stand: is this expected behavior? What's the > >>> reason? > >>> > >>> P.S. I'm using NFSv3 > >>> > >>> On Mon, Jul 15, 2019, 08:56 Noam Lewis <noam.lewis@xxxxxxxxxxxxxx> > >>> wrote: > >>>> > >>>> I've encountered a problem while iterating large directories via an > >>>> NFS mount. > >>>> > >>>> Scenario: > >>>> > >>>> 1. Linux NFS client iterates a directory with many (millions) of > >>>> files, e.g. via getdents() until all entries are done. In my case, > >>>> READDIRPLUS is being used under the hood. Trivial reproduction is > >>>> to > >>>> run: ls -la > >>>> 2. At the same time, run the stat tool on a file inside that > >>>> directory. > >>>> > >>>> The directory on the server is not being modified anywhere (on this > >>>> client or any other client). > >>>> > >>>> Result: the next or ongoing getdents will get stuck for a long time > >>>> (tens of seconds to minutes). It appears to be re-iterating some of > >>>> the work it already did, by going back to a previous NFS > >>>> READDIRPLUS > >>>> cookie. > >>>> > >>>> > >>>> Things I've tried as workarounds: > >>>> - Mounting with nordirplus - the iteration doesn't seem to reset or > >>>> at > >>>> least getdents doesn't get stuck, but now I have tons of LOOKUPs, > >>>> as > >>>> expected. > >>>> - Setting actimeo=(large number) doesn't affect the behavior > >>>> > >>>> Questions: > >>>> 1. Why does the stat command cause this? > >>>> 2. How can I avoid the reset, i.e. ensure forward progress of the > >>>> dir > >>>> iteration?