Re: [PATCH v7 05/21] NFS: Store the change attribute in the directory page cache

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2022-02-25 at 15:23 -0500, Benjamin Coddington wrote:
> On 25 Feb 2022, at 10:34, Benjamin Coddington wrote:
> > Ok, so I'm reading that further proof is required, and I'm happy to
> > do 
> > the
> > work.  Thanks for the replies here and elsewhere.
> 
> Here's an example of this problem on a tmpfs export using v8 of your
> patchset with the fix to set the change_attr in
> nfs_readdir_page_init_array().
> 
> I'm using tmpfs, because it reliably orders cookies in reverse order
> of
> creation (or perhaps sorted by name).
> 
> The program drives both the client-side and server-side - so on this
> one
> system, /exports/tmpfs is:
> tmpfs /exports/tmpfs tmpfs rw,seclabel,relatime,size=102400k 0 0
> 
> and /mnt/localhost is:
> localhost:/exports/tmpfs /mnt/localhost/tmpfs nfs4 
> rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,prot
> o=tcp,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=non
> e,addr=127.0.0.1 
> 0 0
> 
> The program creates 256 files on the server, walks through them once
> on 
> the
> client, deletes the last 127 on the server, drops the first page from
> the
> pagecache, and walks through them again on the client.
> 
> The second listing produces 124 duplicate entries.
> 
> I just have to say again: this behavior is _new_ (but not new to me),
> and it
> is absolutely going to crop up on our customer's systems that are 
> walking
> through millions of directory entries on loaded servers under memory
> pressure.  The directory listings as a whole become very likely to be
> nonsense at random times.  I realize they are not /supposed/ to be 
> coherent,
> but what we're getting here is going to be far far less coherent, and
> its
> going to be a mess.
> 
> There are other scenarios that are worse when the cookies aren't 
> ordered,
> you can end up with EOF, or get into repeating patterns.
> 
> Please compare this with v3, and before this patchset, and tell me if
> I'm
> not justified playing chicken little.
> 
> Here's what I do to run this:
> 
> mount -t tmpfs -osize=100M tmpfs /exports/tmpfs/
> exportfs -ofsid=0 *:/exports
> exportfs -ofsid=1 *:/exports/tmpfs
> mount -t nfs -ov4.1,sec=sys localhost:/exports /mnt/localhost
> ./getdents2
> 
> Compare "Listing 1" with "Listing 2".
> 
> I would also do a "rm -f /export/tmpfs/*" between each run.
> 
> Thanks again for your time and work.
> 
> Ben
> 
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <unistd.h>
> #include <fcntl.h>
> #include <sched.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <sys/syscall.h>
> #include <string.h>
> 
> #define NFSDIR "/mnt/localhost/tmpfs"
> #define LOCDIR "/exports/tmpfs"
> #define BUF_SIZE 4096
> 
> int main(int argc, char **argv)
> {
>         int i, dir_fd, bpos, total = 0;
>      size_t nread;
>         struct linux_dirent {
>                 long           d_ino;
>                 off_t          d_off;
>                 unsigned short d_reclen;
>                 char           d_name[];
>         };
>      struct linux_dirent *d;
>         char buf[BUF_SIZE];
> 
>      /* create files: */
>      for (i = 0; i < 256; i++) {
>          sprintf(buf, LOCDIR "/file_%03d", i);
>          close(open(buf, O_CREAT, 666));
>      }
> 
>         dir_fd = open(NFSDIR,
> O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC);
>         if (dir_fd < 0) {
>                 perror("cannot open dir");
>                 return 1;
>         }
> 
>         while (1) {
>                 nread = syscall(SYS_getdents, dir_fd, buf, BUF_SIZE);
>                 if (nread == 0 || nread == -1)
>                         break;
>                 for (bpos = 0; bpos < nread;) {
>              d = (struct linux_dirent *) (buf + bpos);
>              printf("%s\n", d->d_name);
>              total++;
>              bpos += d->d_reclen;
>          }
>      }
>      printf("Listing 1: %d total dirents\n", total);
> 
>      /* rewind */
>      lseek(dir_fd, 0, SEEK_SET);
> 
>      /* drop the first page */
>      posix_fadvise(dir_fd, 0, 4096, POSIX_FADV_DONTNEED);
> 
>      /* delete the last 127 files: */
>      for (i = 127; i < 256; i++) {
>          sprintf(buf, LOCDIR "/file_%03d", i);
>          unlink(buf);
>      }
> 
>      total = 0;
>         while (1) {
>                 nread = syscall(SYS_getdents, dir_fd, buf, BUF_SIZE);
>                 if (nread == 0 || nread == -1)
>                         break;
>                 for (bpos = 0; bpos < nread;) {
>              d = (struct linux_dirent *) (buf + bpos);
>              printf("%s\n", d->d_name);
>              total++;
>              bpos += d->d_reclen;
>          }
>      }
>      printf("Listing 2: %d total dirents\n", total);
> 
>         close(dir_fd);
>         return 0;
> }


tmpfs is broken on the server. It doesn't provide stable cookies, and
knfsd doesn't use the verifier to tell you that the cookie assignment
has changed.


Re-export of tmpfs has never worked reliably.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux