Re: git regression failures with v6.2-rc NFS client

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Sat, 4 Feb 2023 16:52:29 +0000

> On Feb 4, 2023, at 08:15, Benjamin Coddington <bcodding@xxxxxxxxxx> wrote:
> 
> On 4 Feb 2023, at 6:07, Thorsten Leemhuis wrote:
> 
>> But as you said: people are more likely to run into this problem now.
>> This in the end makes the kernel worse and thus afaics is a regression,
>> as Hugh mentioned.
>> 
>> There sadly is no quote from Linus in
>> https://docs.kernel.org/process/handling-regressions.html
>> that exactly matches and helps in this scenario, but a few that come
>> close; one of them:
>> 
>> ```
>> Because the only thing that matters IS THE USER.
>> 
>> How hard is that to understand?
>> 
>> Anybody who uses "but it was buggy" as an argument is entirely missing
>> the point. As far as the USER was concerned, it wasn't buggy - it
>> worked for him/her.
>> ```
>> 
>> Anyway, I guess we get close to the point where I simply explicitly
>> mention the issue in my weekly regression report, then Linus can speak
>> up himself if he wants. No hard feelings here, I think that's just my duty.
>> 
>> BTW, I CCed the regression list, as it should be in the loop for
>> regressions per
>> https://docs.kernel.org/admin-guide/reporting-regressions.html
>> 
>> BTW, Benjamin, you earlier in this thread mentioned:
>> 
>> ```
>> Thorsten's bot is just scraping your regression report email, I doubt
>> they've carefully read this thread.
>> ```
>> 
>> Well, kinda. It's just that it's not the bot that adds the regression to
>> the tracking, that's me doing it. But yes, I only skim threads, and when
>> adding an issue I sometimes simply lack the knowledge or details to decide
>> if something really is a regression or not. But often that sooner or later
>> becomes clear -- and then I'll remove an issue from the tracking, if it
>> turns out it isn't a regression.
>> 
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> 
> Ah, thanks for explaining that.
> 
> I'd like to summarize and quantify this problem one last time for folks that
> don't want to read everything.  If an application wants to remove all files
> and the parent directory, and uses this pattern to do it:
> 
> opendir
> while (getdents)
>    unlink dents
> closedir
> rmdir
> 
> Before this commit, that would work with up to 126 dentries on NFS from
> a tmpfs export.  If the directory had 127 or more, the rmdir would fail
> with ENOTEMPTY.
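
For anyone who wants to reproduce the pattern described above, a minimal C sketch might look like the following. The function name remove_dir_single_pass is hypothetical, and the sketch assumes the directory holds only plain files. It makes exactly one pass over the directory, so if the filesystem skips entries after an unlink, the final rmdir() is where ENOTEMPTY would surface.

```c
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper illustrating the opendir/getdents/unlink/rmdir
   pattern.  Returns 0 on success, -1 with errno set on failure. */
static int remove_dir_single_pass(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    struct dirent *ent;
    /* readdir() refills its buffer via getdents() as needed; this sketch
       does not distinguish a readdir() error from end-of-directory. */
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        /* Unlink while the directory stream is still open. */
        if (unlinkat(dirfd(dir), ent->d_name, 0) != 0) {
            int saved = errno;
            closedir(dir);
            errno = saved;
            return -1;
        }
    }
    closedir(dir);

    /* Fails with ENOTEMPTY if the single pass missed any entries. */
    return rmdir(path);
}
```

A caller would simply check the return value and errno, e.g. treating ENOTEMPTY from remove_dir_single_pass("/mnt/nfs/scratch") as the failure discussed in this thread (the path is only an example).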

For all sizes of filenames, or just the particular set that was chosen here? What about the choice of rsize? Both these values affect how many entries glibc can cache before it has to issue another getdents() call into the kernel. For the record, this is what glibc does in the opendir() code in order to choose a buffer size for the getdents syscalls:

  /* The st_blksize value of the directory is used as a hint for the
     size of the buffer which receives struct dirent values from the
     kernel.  st_blksize is limited to max_buffer_size, in case the
     file system provides a bogus value.  */
  enum { max_buffer_size = 1048576 };

  enum { allocation_size = 32768 };
  _Static_assert (allocation_size >= sizeof (struct dirent64),
                  "allocation_size < sizeof (struct dirent64)");

  /* Increase allocation if requested, but not if the value appears to
     be bogus.  It will be between 32Kb and 1Mb.  */
  size_t allocation = MIN (MAX ((size_t) statp->st_blksize, (size_t)
                                allocation_size), (size_t) max_buffer_size);

  DIR *dirp = (DIR *) malloc (sizeof (DIR) + allocation);
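
To make the effect of that clamp concrete, here is a small standalone sketch of the same MIN/MAX logic. The helper name dir_buffer_size and the two constant names are hypothetical and are not glibc symbols; only the 32768 and 1048576 limits come from the excerpt above.

```c
#include <stddef.h>

/* Limits copied from the glibc excerpt; the names are hypothetical. */
enum { MIN_ALLOCATION = 32768, MAX_ALLOCATION = 1048576 };

/* Hypothetical helper: clamp the st_blksize hint to [32768, 1048576],
   mirroring MIN (MAX (st_blksize, allocation_size), max_buffer_size). */
static size_t dir_buffer_size(long st_blksize)
{
    size_t hint = (size_t) st_blksize;

    if (hint < MIN_ALLOCATION)
        return MIN_ALLOCATION;
    if (hint > MAX_ALLOCATION)
        return MAX_ALLOCATION;
    return hint;
}
```

So a reported st_blksize of 4096 still yields a 32768-byte buffer, while a value above 1048576 is capped there. How many directory entries fit in one getdents() call then depends on that buffer size and on the lengths of the file names.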

> 
> After this commit, it only works with up to 17 dentries.
> 
> The argument that this is making things worse takes the position that there
> are more directories in the universe with >17 dentries that want to be
> cleaned up by this "saw off the branch you're sitting on" pattern than
> directories with >127.  And I guess that's true if Chuck runs that testing
> setup enough.  :)
> 
> We can change the optimization in the commit from
> NFS_READDIR_CACHE_MISS_THRESHOLD + 1
> to
> nfs_readdir_array_maxentries + 1
> 
> This would make the regression disappear, and would also keep most of the
> optimization.
> 
> Ben
> 

So in other words, the suggestion is to optimise the number of readdir records that we return from NFS to whatever value papers over the known telldir()/seekdir() tmpfs bug that is re-revealed by this particular test when run under these particular conditions? Anyone who tries to use tmpfs with a different number of files, different file name lengths, or different mount options is still SOL because that's not a "regression"?

_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx