Re: knfsd server bug when GETATTR follows READDIR

On 12/21/24 6:53 PM, Rick Macklem wrote:
On Sat, Dec 21, 2024 at 3:27 PM Rick Macklem <rick.macklem@xxxxxxxxx> wrote:

On Sat, Dec 21, 2024 at 9:34 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:

On 12/20/24 9:16 PM, J David wrote:
Hello,

On Tue, Dec 17, 2024 at 8:51 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
If they can reproduce
this issue with an "in tree" file system contained in a recent upstream
Linux kernel, then we can take a look. (Or you and J. David can give it
a try).

Yes, I reproduced this behavior on ext4 with 6.11.5+bpo-amd64 from
Debian backports on completely different hardware.

Then I set up another NFS server on Arch (running kernel 6.12.4), and
reproduced the issue there as well.

Then, just to be sure, I went and found the instructions for building
the Linux kernel from source, built and tested both 6.12.6 and
6.13-rc3 as downloaded directly from www.kernel.org, and the issue
occurs with those as well.

Reproducing on v6.13-rc with ext4 is all that was necessary, thank you!


Additionally, I have tested every combination of FreeBSD, Linux and
OpenIndiana as client and server to confirm that FreeBSD client with
Linux server is the only case where this problem occurs.

Interesting.


Does this count as reproducing the issue with an "in tree" file system
contained in a recent upstream Linux kernel? I'm asking sincerely; I'm
so far out of my depth that I'm pretty sure there are sea monsters
swimming around down there. So I can't rule out the possibility that
I've done something wrong either in setup or testing.

During the course of this, I've gotten the reproduction down to
extracting a 2k tar file and then running "du" on the resulting
directory from the client. Doesn't matter if the file is untarred on
the FreeBSD client, the server, or another client. The tar file
contains a directory with a handful of random Javascript files from
Drupal. As far as I can tell, it has something to do with the number,
size, or names of the files. The Drupal project has three separate
directories all structured like this with the same filenames, but the
file contents vary. The issue occurs with all of them.

The Linux /etc/exports file is just:

/data 192.168.201.0/24(rw,sync)

(The production case also uses crossmnt and no_subtree_check, anonuid,
and anongid, but I eliminated those one by one to make sure they
weren't responsible.)

The corresponding fstab entry on the FreeBSD 14.2-RELEASE client is:

192.168.201.200:/data /data nfs rw,tcp,nfsv4,minorversion=2 0 0

Out of curiosity, do you see the problem recur with nfsv3 or the other
NFSv4 minor versions?


One additional thing I noticed that really blew my mind is that I can
shut down both the client and the server, wait, power them back on, and
the issue is still there. So it's not something in RAM. That prompted
me to try "touch x" in the directory to create a new 0-length file.
The issue then goes away. Then I can "rm x" and the issue comes back.
By contrast, I can write megabytes from /dev/random into one of the
files without affecting anything; the issue stays the same.

I then tried it with all empty files using the same filenames. The
issue still occurred. Add or remove one file and the issue goes away.
I then renamed one of the files to zz.js. Issue still occurs. Renamed
it to zzz.js. Problem still occurs. Kept going until I got to
zzzzzz.js and it worked.

Finally, I got it to the point where running this in an empty mounted
directory will create the issue:

rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
done; touch y0-xxxxxx.xx

and this will not:

rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
done; touch y0-xxxxxxx.xx

(The difference being one extra x in the last filename.)

It works in the other direction as well. This causes the issue:

rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
done; touch y0-xxx.xx

This does not:

rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
done; touch y0-xx.xx

There's a four-character window involving the length of the filenames
where 62 files in a directory causes this issue. There's a little more
to it than that; it doesn't look like you can just create 61
two-letter filenames and then one really long one and get the issue.
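
The four one-liners above differ only in how many "x" characters the last file name carries, so they can be folded into one helper. What follows is a minimal sketch, not part of the original report: the function name make_repro_files and its two arguments are hypothetical, and it takes the target directory as a parameter instead of relying on the current directory. Going by the results above, three and six "x" characters trigger the issue while two and seven do not.

```shell
#!/bin/sh
# Sketch only: rebuilds the 62-file layout from the one-liners above.
# make_repro_files is a hypothetical helper name, not from the thread.
#   $1 = directory on the NFS mount to populate
#   $2 = number of "x" characters in the final file name
make_repro_files() {
    dir=$1
    nx=$2

    # Same cleanup as "rm *.xx" in the original, but scoped to $dir.
    rm -f "$dir"/*.xx

    # 8 x 7 = 56 short names: a1.xx ... h7.xx
    for a in a b c d e f g h; do
        for b in 1 2 3 4 5 6 7; do
            touch "$dir/$a$b.xx"
        done
    done

    # 5 medium names: x1-xx.xx ... x5-xx.xx
    for a in 1 2 3 4 5; do
        touch "$dir/x$a-xx.xx"
    done

    # 1 final name whose length is varied: y0-<nx times x>.xx
    tail=""
    i=0
    while [ "$i" -lt "$nx" ]; do
        tail="${tail}x"
        i=$((i + 1))
    done
    touch "$dir/y0-$tail.xx"
}
```

For example, calling make_repro_files with an empty directory on the mount and 6 should recreate the first failing case, after which "du" on that directory from the FreeBSD client is the check described earlier.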

So I haven't found the specifics yet, but perhaps due to pure chance
this directory structure is exactly right to provoke an incredibly
obscure edge case?

Well, it's likely that this is a problem with READDIR, so file content
is not going to be an issue. The file name lengths are the problem.

Also, I'm wondering what the FreeBSD client's directory readdir
arguments are (how much does it request, what are the maximum limits it
negotiates, and so on). Rick?
As you'll see in the packet trace:
Sequence: cache this: No
Putfh: directory fh
Readdir:
     cookie: 0
     cookie_verf: 0
     dircount: 8706
     maxcount: 8706
     attr: type, RDattr_error, fileid, mounted_on_fileid
Getattr: same attributes as requested for a previous GETATTR, mainly
               to keep the directory's attribute cache up to date.

The session negotiates a max request/reply size of just over 1Mbyte and a
maximum of something like 20 ops. (Can't recall, but definitely more than 4.)

If you are wondering where the 8706 comes from, it was an estimate of how
much XDR would be needed to fill an 8K buffer once translated to UFS dirents,
obtained by adding 512 to 8K.

I have not yet had a chance to see if I can reproduce the problem with
J. David's reproducer. I will try that soon, and if I can reproduce it, I will
poke at it to try and figure out what is going on.
Just fyi, I have reproduced it. Once you use J. David's little shell script to
create the files in the directory, the Readdir RPC gets the junk reply to GETATTR
(the count of words for the attribute bitmap in the reply is 0 instead of 2).
You can unmount/remount it and still get the failure, assuming you do not
mess with the directory contents.

Good work finding the reproducer, J. David!

I will start to poke around to see if I can figure out what the knfsd server is
doing.

Chuck, I suspect any fairly recent FreeBSD client will be sufficient to
reproduce this, just in case you are inspired to cross over to the dark
side and install FreeBSD somewhere.

I see the same malformed GETATTR result in the attachments.

Linux doesn't trip on this issue because its NFS client doesn't ever
append a GETATTR operation after a READDIR.

So I've installed a small FreeBSD 14.2 guest, and copied the reproducer
script over to it. I see the extra GETATTR now, and I'm trying to
figure out what is causing the corrupted reply. At first glance, I
can see the problem involves a particularly placed page boundary in
the XDR encoding buffer, but it isn't a problem with GETATTR encoding
per se.


I'll post when I have more info, rick


rick


Since this isn't reproducible (yet) with a Linux client, let's try
another set of network captures, and you can send these to me
privately.

Start the capture
Mount
Run one of the reproducers above
Unmount
Stop the capture
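
The five steps above could be scripted on the FreeBSD client roughly as follows. This is a sketch under several assumptions that are not in the thread: tcpdump as the capture tool, em0 as the interface, /tmp/nfs-repro.pcap as the output file, filtering on port 2049, and /data/repro as a directory where the reproducer files were already created. The server address and mount options are taken from the fstab entry quoted earlier. Setting DRY_RUN=1 prints the commands without running them.

```shell
#!/bin/sh
# Sketch only. Assumed values (not from the thread): interface name,
# capture file path, capture tool, and the reproducer directory.
IFACE=${IFACE:-em0}
PCAP=${PCAP:-/tmp/nfs-repro.pcap}
SERVER=${SERVER:-192.168.201.200}
MNT=${MNT:-/data}
REPRO_DIR=${REPRO_DIR:-/data/repro}

# Print a command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

capture_repro() {
    # 1. Start the capture (full packets, NFS traffic to/from the server).
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ tcpdump -i $IFACE -s 0 -w $PCAP host $SERVER and port 2049 &"
    else
        tcpdump -i "$IFACE" -s 0 -w "$PCAP" host "$SERVER" and port 2049 &
        cap_pid=$!
        sleep 2   # give tcpdump a moment to start listening
    fi

    # 2. Mount, using the options from the fstab entry.
    run mount -t nfs -o rw,tcp,nfsv4,minorversion=2 "$SERVER:/data" "$MNT"

    # 3. Run the reproducer: "du" over the prepared directory.
    run du "$REPRO_DIR"

    # 4. Unmount.
    run umount "$MNT"

    # 5. Stop the capture and wait for the file to be flushed.
    if [ "${DRY_RUN:-0}" != 1 ]; then
        kill "$cap_pid"
        wait "$cap_pid" || true
    fi
}
```

Run it once against a directory that works as expected and once against one that fails, with a different PCAP value each time, to get the two captures requested below.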

I'd like to see one with v6.13-rc3 and ext4 that works as expected, and
one with the same configuration that fails.

--
Chuck Lever