On Sun, 22 Dec 2024 at 21:19, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
> On 12/21/24 6:53 PM, Rick Macklem wrote:
> > On Sat, Dec 21, 2024 at 3:27 PM Rick Macklem <rick.macklem@xxxxxxxxx> wrote:
> >>
> >> On Sat, Dec 21, 2024 at 9:34 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>>
> >>> On 12/20/24 9:16 PM, J David wrote:
> >>>> Hello,
> >>>>
> >>>> On Tue, Dec 17, 2024 at 8:51 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>>>> If they can reproduce
> >>>>> this issue with an "in tree" file system contained in a recent upstream
> >>>>> Linux kernel, then we can take a look. (Or you and J. David can give it
> >>>>> a try).
> >>>>
> >>>> Yes, I reproduced this behavior on ext4 with 6.11.5+bpo-amd64 from
> >>>> Debian backports on completely different hardware.
> >>>>
> >>>> Then I set up another NFS server on Arch (running kernel 6.12.4), and
> >>>> reproduced the issue there as well.
> >>>>
> >>>> Then, just to be sure, I went and found the instructions for building
> >>>> the Linux kernel from source, built and tested both 6.12.6 and
> >>>> 6.13-rc3 as downloaded directly from www.kernel.org, and the issue
> >>>> occurs with those as well.
> >>>
> >>> Reproducing on v6.13-rc with ext4 is all that was necessary, thank you!
> >>>
> >>>
> >>>> Additionally, I have tested every combination of FreeBSD, Linux and
> >>>> OpenIndiana as client and server to confirm that FreeBSD client with
> >>>> Linux server is the only case where this problem occurs.
> >>>
> >>> Interesting.
> >>>
> >>>
> >>>> Does this count as reproducing the issue with an "in tree" file system
> >>>> contained in a recent upstream Linux kernel? I'm asking sincerely; I'm
> >>>> so far out of my depth that I'm pretty sure there are sea monsters
> >>>> swimming around down there. So I can't rule out the possibility that
> >>>> I've done something wrong either in setup or testing.
> >>>>
> >>>> During the course of this, I've gotten the reproduction down to
> >>>> extracting a 2k tar file and then running "du" on the resulting
> >>>> directory from the client. Doesn't matter if the file is untarred on
> >>>> the FreeBSD client, the server, or another client. The tar file
> >>>> contains a directory with a handful of random Javascript files from
> >>>> Drupal. As far as I can tell, it has something to do with the number,
> >>>> size, or names of the files. The Drupal project has three separate
> >>>> directories all structured like this with the same filenames, but the
> >>>> file contents vary. The issue occurs with all of them.
> >>>>
> >>>> The Linux /etc/exports file is just:
> >>>>
> >>>> /data 192.168.201.0/24(rw,sync)
> >>>>
> >>>> (The production case also uses crossmnt and no_subtree_check, anonuid,
> >>>> and anongid, but I eliminated those one by one to make sure they
> >>>> weren't responsible.)
> >>>>
> >>>> The corresponding fstab entry on the FreeBSD 14.2-RELEASE client is:
> >>>>
> >>>> 192.168.201.200:/data /data nfs rw,tcp,nfsv4,minorversion=2 0 0
> >>>
> >>> Out of curiosity, do you see the problem recur with nfsv3 or the other
> >>> NFSv4 minor versions?
> >>>
> >>>
> >>>> One additional thing I noticed that really blew my mind is that I can
> >>>> shutdown both the client and the server, wait, power them back on, and
> >>>> the issue is still there. So it's not something in RAM. That prompted
> >>>> me to try "touch x" in the directory to create a new 0-length file.
> >>>> The issue then goes away. Then I can "rm x" and the issue comes back.
> >>>> By contrast, I can write megabytes from /dev/random into one of the
> >>>> files without affecting anything; the issue stays the same.
> >>>>
> >>>> I then tried it with all empty files using the same filenames. The
> >>>> issue still occurred. Add or remove one file and the issue goes away.
> >>>> I then renamed one of the files to zz.js. Issue still occurs.
> >>>> Renamed it to zzz.js. Problem still occurs. Kept going until I got to
> >>>> zzzzzz.js and it worked.
> >>>>
> >>>> Finally, I got it to the point where running this in an empty mounted
> >>>> directory will create the issue:
> >>>>
> >>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
> >>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
> >>>> done; touch y0-xxxxxx.xx
> >>>>
> >>>> and this will not:
> >>>>
> >>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
> >>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
> >>>> done; touch y0-xxxxxxx.xx
> >>>>
> >>>> (The difference being one extra x in the last filename.)
> >>>>
> >>>> It works in the other direction as well. This causes the issue:
> >>>>
> >>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
> >>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
> >>>> done; touch y0-xxx.xx
> >>>>
> >>>> This does not:
> >>>>
> >>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
> >>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
> >>>> done; touch y0-xx.xx
> >>>>
> >>>> There's a four-character window involving the length of the filenames
> >>>> where 62 files in a directory causes this issue. There's a little more
> >>>> to it than that; it doesn't look like you can just create 61
> >>>> two-letter filenames and then one really long one and get the issue.
> >>>>
> >>>> So I haven't found the specifics yet, but perhaps due to pure chance
> >>>> this directory structure is exactly right to provoke an incredibly
> >>>> obscure edge case?
> >>>
> >>> Well it's likely that this is a problem with READDIR, so file content
> >>> is not going to be an issue. The file name lengths are the problem.
> >>>
> >>> Also, I'm wondering what the FreeBSD client's directory readdir
> >>> arguments are (how much does it request, what are the maximum limits it
> >>> negotiates, and so on). Rick?
> >> As you'll see in the packet trace:
> >> Sequence: cache this: No
> >> Putfh: directory fh
> >> Readdir:
> >>     cookie: 0
> >>     cookie_verf: 0
> >>     dircount: 8706
> >>     maxcount: 8706
> >>     attr: type, RDattr_error, fileid, mounted_on_fileid
> >> Getattr: same attributes as requested for a previous GETATTR, mainly
> >>     to keep the directory's attribute cache up to date.
> >>
> >> The session negotiates a max request/reply size of just over 1Mbyte and a
> >> maximum of something like 20 ops. (Can't recall, but definitely more than 4.)
> >>
> >> If you are wondering where the 8706 comes from, it was an estimate of how
> >> much would be needed to fill an 8K buffer with the XDR translated to UFS
> >> dirents by adding 512 to 8K.
> >>
> >> I have not yet had a chance to see if I can reproduce the problem with
> >> J. David's reproducer. I will try that soon, and if I can reproduce it,
> >> I will poke at it to try and figure out what is going on.
> > Just fyi, I have reproduced it. Once you use J. David's little shell script to
> > create the files in the directory, the Readdir RPC gets the junk reply to
> > GETATTR (the count of words for the attribute bitmap in the reply is 0
> > instead of 2).
> > You can unmount/remount it and still get the failure, assuming you do not
> > mess with the directory contents.
> >
> > Good work finding the reproducer, J. David!
> >
> > I will start to poke around to see if I can figure out what the knfsd server is
> > doing.
> >
> > Chuck, I suspect any fairly recent FreeBSD client will be sufficient to
> > reproduce this, just in case you are inspired to cross over to the dark
> > side and install FreeBSD somewhere.
>
> I see the same malformed GETATTR result in the attachments.
>
> Linux doesn't trip on this issue because its NFS client doesn't ever
> append a GETATTR operation after a READDIR.

The Windows ms-nfs41-client driver also sends a GETATTR after READDIR,
and trips over bogus return values on a regular basis. Solaris 11.4
nfsd and nfs4j do not exhibit such problems.

Another garbage value that the client gets from Linux nfsd is
FATTR4_WORD0_CHANGE, which sometimes comes back with absurdly high
values.

Maybe add some WARN_ONCE() calls to Linux nfsd that fire when
unexpectedly crazy values are sent over the wire?

Ced
-- 
Cedric Blancher <cedric.blancher@xxxxxxxxx>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur
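
P.S. For anyone who wants to retry this with another client, J. David's one-liner quoted above can be unrolled into a small function. This is only a sketch: the name make_repro_files and the /data/repro path in the usage comment are made up for illustration, and it assumes the target directory already exists and is empty (the original used "rm *.xx" first). Per the quoted report, a suffix of three to six x characters triggered the issue, while two or seven did not.

```shell
#!/bin/sh
# Hypothetical helper (not from the original thread): create the 62 empty
# files of the reproducer in directory $1. $2 is the run of 'x' characters
# in the last filename, e.g. "xxxxxx" gives y0-xxxxxx.xx.
make_repro_files() {
    dir=$1
    suffix=$2

    # 8 x 7 = 56 files: a1.xx ... h7.xx
    for a in a b c d e f g h; do
        for b in 1 2 3 4 5 6 7; do
            touch "$dir/$a$b.xx"
        done
    done

    # 5 more files: x1-xx.xx ... x5-xx.xx
    for a in 1 2 3 4 5; do
        touch "$dir/x$a-xx.xx"
    done

    # The 62nd file, whose name length decides whether the issue shows up.
    touch "$dir/y0-$suffix.xx"
}

# Usage (assumes /data/repro is an empty directory on the NFS mount):
#   make_repro_files /data/repro xxxxxx && du /data/repro
```

Keeping the suffix as a parameter makes it easy to walk the filename length up and down without retyping the loops.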