Re: knfsd server bug when GETATTR follows READDIR

Cedric Blancher <cedric.blancher@xxxxxxxxx> · Tue, 24 Dec 2024 07:51:00 +0100

On Sun, 22 Dec 2024 at 21:19, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
> On 12/21/24 6:53 PM, Rick Macklem wrote:
> > On Sat, Dec 21, 2024 at 3:27 PM Rick Macklem <rick.macklem@xxxxxxxxx> wrote:
> >>
> >> On Sat, Dec 21, 2024 at 9:34 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>>
> >>> On 12/20/24 9:16 PM, J David wrote:
> >>>> Hello,
> >>>>
> >>>> On Tue, Dec 17, 2024 at 8:51 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>>>> If they can reproduce
> >>>>> this issue with an "in tree" file system contained in a recent upstream
> >>>>> Linux kernel, then we can take a look. (Or you and J. David can give it
> >>>>> a try).
> >>>>
> >>>> Yes, I reproduced this behavior on ext4 with 6.11.5+bpo-amd64 from
> >>>> Debian backports on completely different hardware.
> >>>>
> >>>> Then I set up another NFS server on Arch (running kernel 6.12.4), and
> >>>> reproduced the issue there as well.
> >>>>
> >>>> Then, just to be sure, I went and found the instructions for building
> >>>> the Linux kernel from source, built and tested both 6.12.6 and
> >>>> 6.13-rc3 as downloaded directly from www.kernel.org, and the issue
> >>>> occurs with those as well.
> >>>
> >>> Reproducing on v6.13-rc with ext4 is all that was necessary, thank you!
> >>>
> >>>
> >>>> Additionally, I have tested every combination of FreeBSD, Linux and
> >>>> OpenIndiana as client and server to confirm that FreeBSD client with
> >>>> Linux server is the only case where this problem occurs.
> >>>
> >>> Interesting.
> >>>
> >>>
> >>>> Does this count as reproducing the issue with an "in tree" file system
> >>>> contained in a recent upstream Linux kernel? I'm asking sincerely; I'm
> >>>> so far out of my depth that I'm pretty sure there are sea monsters
> >>>> swimming around down there. So I can't rule out the possibility that
> >>>> I've done something wrong either in setup or testing.
> >>>>
> >>>> During the course of this, I've gotten the reproduction down to
> >>>> extracting a 2k tar file and then running "du" on the resulting
> >>>> directory from the client. Doesn't matter if the file is untarred on
> >>>> the FreeBSD client, the server, or another client. The tar file
> >>>> contains a directory with a handful of random JavaScript files from
> >>>> Drupal. As far as I can tell, it has something to do with the number,
> >>>> size, or names of the files. The Drupal project has three separate
> >>>> directories all structured like this with the same filenames, but the
> >>>> file contents vary. The issue occurs with all of them.
> >>>>
> >>>> The Linux /etc/exports file is just:
> >>>>
> >>>> /data 192.168.201.0/24(rw,sync)
> >>>>
> >>>> (The production case also uses crossmnt and no_subtree_check, anonuid,
> >>>> and anongid, but I eliminated those one by one to make sure they
> >>>> weren't responsible.)
> >>>>
> >>>> The corresponding fstab entry on the FreeBSD 14.2-RELEASE client is:
> >>>>
> >>>> 192.168.201.200:/data /data nfs rw,tcp,nfsv4,minorversion=2 0 0
> >>>
> >>> Out of curiosity, do you see the problem recur with nfsv3 or the other
> >>> NFSv4 minor versions?
> >>>
> >>>
> >>>> One additional thing I noticed that really blew my mind is that I can
> >>>> shut down both the client and the server, wait, power them back on, and
> >>>> the issue is still there. So it's not something in RAM.  That prompted
> >>>> me to try "touch x" in the directory to create a new 0-length file.
> >>>> The issue then goes away. Then I can "rm x" and the issue comes back.
> >>>> By contrast, I can write megabytes from /dev/random into one of the
> >>>> files without affecting anything; the issue stays the same.
> >>>>
> >>>> I then tried it with all empty files using the same filenames. The
> >>>> issue still occurred. Add or remove one file and the issue goes away.
> >>>> I then renamed one of the files to zz.js. Issue still occurs. Renamed
> >>>> it to zzz.js. Problem still occurs. Kept going until I got to
> >>>> zzzzzz.js and it worked.
> >>>>
> >>>> Finally, I got it to the point where running this in an empty mounted
> >>>> directory will create the issue:
> >>>>
> >>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
> >>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
> >>>> done; touch y0-xxxxxx.xx
> >>>>
> >>>> and this will not:
> >>>>
> >>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
> >>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
> >>>> done; touch y0-xxxxxxx.xx
> >>>>
> >>>> (The difference being one extra x in the last filename.)
> >>>>
> >>>> It works in the other direction as well. This causes the issue:
> >>>>
> >>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
> >>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
> >>>> done; touch y0-xxx.xx
> >>>>
> >>>> This does not:
> >>>>
> >>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
> >>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
> >>>> done; touch y0-xx.xx
> >>>>
> >>>> There's a four-character window involving the length of the filenames
> >>>> where 62 files in a directory causes this issue. There's a little more
> >>>> to it than that; it doesn't look like you can just create 61
> >>>> two-letter filenames and then one really long one and get the issue.
> >>>>
> >>>> So I haven't found the specifics yet, but perhaps due to pure chance
> >>>> this directory structure is exactly right to provoke an incredibly
> >>>> obscure edge case?
> >>>
> >>> Well it's likely that this is a problem with READDIR, so file content
> >>> is not going to be an issue. The file name lengths are the problem.
> >>>
> >>> Also, I'm wondering what the FreeBSD client's directory readdir
> >>> arguments are (how much does it request, what are the maximum limits it
> >>> negotiates, and so on). Rick?
> >> As you'll see in the packet trace:
> >> Sequence: cache this: No
> >> Putfh: directory fh
> >> Readdir:
> >>      cookie: 0
> >>      cookie_verf: 0
> >>      dircount: 8706
> >>      maxcount: 8706
> >>      attr: type, RDattr_error, fileid, mounted_on_fileid
> >> Getattr: same attributes as requested for a previous GETATTR, mainly
> >>                to keep the directory's attribute cache up to date.
> >>
> >> The session negotiates a max request/reply size of just over 1Mbyte and a
> >> maximum of something like 20 ops. (Can't recall, but definitely more than 4.)
> >>
> >> If you are wondering where the 8706 comes from, it was an estimate of how
> >> much would be needed to fill an 8K buffer with the XDR translated to UFS dirents
> >> by adding 512 to 8K.
> >>
> >> I have not yet had a chance to see if I can reproduce the problem with
> >> J. David's reproducer. I will try that soon, and if I can reproduce it,
> >> I will poke at it to try and figure out what is going on.
> > Just fyi, I have reproduced it. Once you use J. David's little shell script
> > to create the files in the directory, the Readdir RPC gets the junk reply
> > to GETATTR (the count of words for the attribute bitmap in the reply is 0
> > instead of 2).
> > You can unmount/remount it and still get the failure, assuming you do not
> > mess with the directory contents.
> >
> > Good work finding the reproducer, J. David!
> >
> > I will start to poke around to see if I can figure out what the knfsd server is
> > doing.
> >
> > Chuck, I suspect any fairly recent FreeBSD client will be sufficient to
> > reproduce this, just in case you are inspired to cross over to the dark
> > side and install FreeBSD somewhere.
>
> I see the same malformed GETATTR result in the attachments.
>
> Linux doesn't trip on this issue because its NFS client doesn't ever
> append a GETATTR operation after a READDIR.

The Windows ms-nfs41-client driver also issues a GETATTR after READDIR,
and trips over bogus return values on a regular basis. Solaris 11.4 nfsd
and nfs4j do not exhibit such problems.

Another garbage value that the client gets from Linux nfsd is
FATTR4_WORD0_CHANGE, which sometimes comes back absurdly high.
Maybe add a WARN_ONCE() to Linux nfsd for cases where unexpectedly
crazy values are sent over the wire?
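
For anyone who wants to re-run this, J. David's one-liner quoted above
could be wrapped roughly as follows. This is only a sketch: the function
name make_repro_dir, the directory argument, and the optional last-filename
argument are hypothetical conveniences and not part of the original report.
The filenames themselves are copied from the quoted reproducer and add up
to the 62 entries mentioned there.

```shell
#!/bin/sh
# Sketch of the quoted reproducer as a function (names are hypothetical).
# Usage: make_repro_dir DIR [LAST_NAME]
#   DIR        directory to populate (created if missing)
#   LAST_NAME  final filename; defaults to the variant reported as failing
make_repro_dir() {
    dir=$1
    last=${2:-y0-xxxxxx.xx}
    mkdir -p "$dir" || return 1
    (
        cd "$dir" || exit 1
        # Start from a clean set of *.xx files, as the original one-liner does.
        rm -f ./*.xx
        # 8 x 7 = 56 short names: a1.xx ... h7.xx
        for a in a b c d e f g h; do
            for b in 1 2 3 4 5 6 7; do
                touch "$a$b.xx"
            done
        done
        # 5 medium names: x1-xx.xx ... x5-xx.xx
        for a in 1 2 3 4 5; do
            touch "x$a-xx.xx"
        done
        # 1 final name whose length is what gets varied in the report.
        touch "$last"
    )
}
```

Something like "make_repro_dir /data/repro" on the export, followed by "du"
on that directory from the FreeBSD client, would match the failing case as
described; passing y0-xxxxxxx.xx or y0-xx.xx as the second argument would
give the variants reported as not failing.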

Ced
-- 
Cedric Blancher <cedric.blancher@xxxxxxxxx>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur