Re: [PATCH v7 05/21] NFS: Store the change attribute in the directory page cache

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25 Feb 2022, at 10:34, Benjamin Coddington wrote:
Ok, so I'm reading that further proof is required, and I'm happy to do the
work.  Thanks for the replies here and elsewhere.

Here's an example of this problem on a tmpfs export using v8 of your
patchset with the fix to set the change_attr in
nfs_readdir_page_init_array().

I'm using tmpfs, because it reliably orders cookies in reverse order of
creation (or perhaps sorted by name).

The program drives both the client-side and server-side - so on this one
system, /exports/tmpfs is:
tmpfs /exports/tmpfs tmpfs rw,seclabel,relatime,size=102400k 0 0

and /mnt/localhost is:
localhost:/exports/tmpfs /mnt/localhost/tmpfs nfs4 rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=none,addr=127.0.0.1 0 0

The program creates 256 files on the server, walks through them once on the client, deletes the last 127 on the server, drops the first page from the
pagecache, and walks through them again on the client.

The second listing produces 124 duplicate entries.

I just have to say again: this behavior is _new_ (but not new to me), and it is absolutely going to crop up on our customer's systems that are walking
through millions of directory entries on loaded servers under memory
pressure.  The directory listings as a whole become very likely to be
nonsense at random times. I realize they are not /supposed/ to be coherent, but what we're getting here is going to be far far less coherent, and its
going to be a mess.

There are other scenarios that are worse when the cookies aren't ordered,
you can end up with EOF, or get into repeating patterns.

Please compare this with v3, and before this patchset, and tell me if I'm
not justified playing chicken little.

Here's what I do to run this:

mount -t tmpfs -osize=100M tmpfs /exports/tmpfs/
exportfs -ofsid=0 *:/exports
exportfs -ofsid=1 *:/exports/tmpfs
mount -t nfs -ov4.1,sec=sys localhost:/exports /mnt/localhost
./getdents2

Compare "Listing 1" with "Listing 2".

I would also do a "rm -f /export/tmpfs/*" between each run.

Thanks again for your time and work.

Ben

#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <string.h>

#define NFSDIR "/mnt/localhost/tmpfs"
#define LOCDIR "/exports/tmpfs"
#define BUF_SIZE 4096

int main(int argc, char **argv)
{
	int i, dir_fd, bpos, total = 0;
    size_t nread;
	struct linux_dirent {
		long           d_ino;
		off_t          d_off;
		unsigned short d_reclen;
		char           d_name[];
	};
    struct linux_dirent *d;
	char buf[BUF_SIZE];

    /* create files: */
    for (i = 0; i < 256; i++) {
        sprintf(buf, LOCDIR "/file_%03d", i);
        close(open(buf, O_CREAT, 666));
    }

	dir_fd = open(NFSDIR, O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC);
	if (dir_fd < 0) {
		perror("cannot open dir");
		return 1;
	}

	while (1) {
		nread = syscall(SYS_getdents, dir_fd, buf, BUF_SIZE);
		if (nread == 0 || nread == -1)
			break;
		for (bpos = 0; bpos < nread;) {
            d = (struct linux_dirent *) (buf + bpos);
            printf("%s\n", d->d_name);
            total++;
            bpos += d->d_reclen;
        }
    }
    printf("Listing 1: %d total dirents\n", total);

    /* rewind */
    lseek(dir_fd, 0, SEEK_SET);

    /* drop the first page */
    posix_fadvise(dir_fd, 0, 4096, POSIX_FADV_DONTNEED);

    /* delete the last 127 files: */
    for (i = 127; i < 256; i++) {
        sprintf(buf, LOCDIR "/file_%03d", i);
        unlink(buf);
    }

    total = 0;
	while (1) {
		nread = syscall(SYS_getdents, dir_fd, buf, BUF_SIZE);
		if (nread == 0 || nread == -1)
			break;
		for (bpos = 0; bpos < nread;) {
            d = (struct linux_dirent *) (buf + bpos);
            printf("%s\n", d->d_name);
            total++;
            bpos += d->d_reclen;
        }
    }
    printf("Listing 2: %d total dirents\n", total);

	close(dir_fd);
	return 0;
}





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux