On Fri, 2022-02-25 at 15:23 -0500, Benjamin Coddington wrote: > On 25 Feb 2022, at 10:34, Benjamin Coddington wrote: > > Ok, so I'm reading that further proof is required, and I'm happy to > > do > > the > > work. Thanks for the replies here and elsewhere. > > Here's an example of this problem on a tmpfs export using v8 of your > patchset with the fix to set the change_attr in > nfs_readdir_page_init_array(). > > I'm using tmpfs, because it reliably orders cookies in reverse order > of > creation (or perhaps sorted by name). > > The program drives both the client-side and server-side - so on this > one > system, /exports/tmpfs is: > tmpfs /exports/tmpfs tmpfs rw,seclabel,relatime,size=102400k 0 0 > > and /mnt/localhost is: > localhost:/exports/tmpfs /mnt/localhost/tmpfs nfs4 > rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,prot > o=tcp,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=non > e,addr=127.0.0.1 > 0 0 > > The program creates 256 files on the server, walks through them once > on > the > client, deletes the last 127 on the server, drops the first page from > the > pagecache, and walks through them again on the client. > > The second listing produces 124 duplicate entries. > > I just have to say again: this behavior is _new_ (but not new to me), > and it > is absolutely going to crop up on our customer's systems that are > walking > through millions of directory entries on loaded servers under memory > pressure. The directory listings as a whole become very likely to be > nonsense at random times. I realize they are not /supposed/ to be > coherent, > but what we're getting here is going to be far far less coherent, and > its > going to be a mess. > > There are other scenarios that are worse when the cookies aren't > ordered, > you can end up with EOF, or get into repeating patterns. > > Please compare this with v3, and before this patchset, and tell me if > I'm > not justified playing chicken little. > > Here's what I do to run this: > > mount -t tmpfs -osize=100M tmpfs /exports/tmpfs/ > exportfs -ofsid=0 *:/exports > exportfs -ofsid=1 *:/exports/tmpfs > mount -t nfs -ov4.1,sec=sys localhost:/exports /mnt/localhost > ./getdents2 > > Compare "Listing 1" with "Listing 2". > > I would also do a "rm -f /export/tmpfs/*" between each run. > > Thanks again for your time and work. > > Ben > > #define _GNU_SOURCE > #include <stdio.h> > #include <unistd.h> > #include <fcntl.h> > #include <sched.h> > #include <sys/types.h> > #include <sys/stat.h> > #include <sys/syscall.h> > #include <string.h> > > #define NFSDIR "/mnt/localhost/tmpfs" > #define LOCDIR "/exports/tmpfs" > #define BUF_SIZE 4096 > > int main(int argc, char **argv) > { > int i, dir_fd, bpos, total = 0; > size_t nread; > struct linux_dirent { > long d_ino; > off_t d_off; > unsigned short d_reclen; > char d_name[]; > }; > struct linux_dirent *d; > char buf[BUF_SIZE]; > > /* create files: */ > for (i = 0; i < 256; i++) { > sprintf(buf, LOCDIR "/file_%03d", i); > close(open(buf, O_CREAT, 666)); > } > > dir_fd = open(NFSDIR, > O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC); > if (dir_fd < 0) { > perror("cannot open dir"); > return 1; > } > > while (1) { > nread = syscall(SYS_getdents, dir_fd, buf, BUF_SIZE); > if (nread == 0 || nread == -1) > break; > for (bpos = 0; bpos < nread;) { > d = (struct linux_dirent *) (buf + bpos); > printf("%s\n", d->d_name); > total++; > bpos += d->d_reclen; > } > } > printf("Listing 1: %d total dirents\n", total); > > /* rewind */ > lseek(dir_fd, 0, SEEK_SET); > > /* drop the first page */ > posix_fadvise(dir_fd, 0, 4096, POSIX_FADV_DONTNEED); > > /* delete the last 127 files: */ > for (i = 127; i < 256; i++) { > sprintf(buf, LOCDIR "/file_%03d", i); > unlink(buf); > } > > total = 0; > while (1) { > nread = syscall(SYS_getdents, dir_fd, buf, BUF_SIZE); > if (nread == 0 || nread == -1) > break; > for (bpos = 0; bpos < nread;) { > d = (struct linux_dirent *) (buf + bpos); > printf("%s\n", d->d_name); > total++; > bpos += d->d_reclen; > } > } > printf("Listing 2: %d total dirents\n", total); > > close(dir_fd); > return 0; > } tmpfs is broken on the server. It doesn't provide stable cookies, and knfsd doesn't use the verifier to tell you that the cookie assignment has changed. Re-export of tmpfs has never worked reliably. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx