Re: [PATCH v7 3/3] shmem: stable directory offsets

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 30 Jun 2023 at 13:49:03 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
>
> The current cursor-based directory offset mechanism doesn't work
> when a tmpfs filesystem is exported via NFS. This is because NFS
> clients do not open directories. Each server-side READDIR operation
> has to open the directory, read it, then close it. The cursor state
> for that directory, being associated strictly with the opened
> struct file, is thus discarded after each NFS READDIR operation.
>
> Directory offsets are cached not only by NFS clients, but also by
> user space libraries on those clients. Essentially there is no way
> to invalidate those caches when directory offsets have changed on
> an NFS server after the offset-to-dentry mapping changes. Thus the
> whole application stack depends on unchanging directory offsets.
>
> The solution we've come up with is to make the directory offset for
> each file in a tmpfs filesystem stable for the life of the directory
> entry it represents.
>
> shmem_readdir() and shmem_dir_llseek() now use an xarray to map each
> directory offset (an loff_t integer) to the memory address of a
> struct dentry.

I believe this patch is responsible for a tmpfs behaviour change when
a directory is modified while being read.  The following test program

#include <dirent.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
	const char *tmp = "/tmp";
	if (argc >= 2)
		tmp = argv[1];

	char *dir_path;
	if (asprintf(&dir_path, "%s/foo.XXXXXX", tmp) < 0)
		err(EXIT_FAILURE, "asprintf()");

	if (!mkdtemp(dir_path))
		err(EXIT_FAILURE, "mkdtemp(%s)", dir_path);

	char *file_path;
	if (asprintf(&file_path, "%s/bar", dir_path) < 0)
		err(EXIT_FAILURE, "asprintf()");

	if (creat(file_path, 0644) < 0)
		err(EXIT_FAILURE, "creat(%s)", file_path);

	DIR *dir = opendir(dir_path);
	if (!dir)
		err(EXIT_FAILURE, "opendir(%s)", dir_path);

	struct dirent *de;
	while ((de = readdir(dir))) {
		printf("readdir(): %s/%s\n", dir_path, de->d_name);
		if (de->d_name[0] == '.')
			continue;

		if (unlink(file_path) != 0)
			err(EXIT_FAILURE, "unlink(%s)", file_path);

		if (creat(file_path, 0644) < 0)
			err(EXIT_FAILURE, "creat(%s)", file_path);
	}

	return EXIT_SUCCESS;
}

when run on Linux 6.5, doesn't print the new directory entry:

tavianator@graphene $ uname -a
Linux graphene 6.5.9-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 26 Oct 2023 00:52:20 +0000 x86_64 GNU/Linux
tavianator@graphene $ gcc -Wall foo.c -o foo
tavianator@graphene $ ./foo
readdir(): /tmp/foo.wgmdmm/.
readdir(): /tmp/foo.wgmdmm/..
readdir(): /tmp/foo.wgmdmm/bar

But on Linux 6.6, readdir() never stops:

tavianator@tachyon $ uname -a
Linux tachyon 6.6.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 08 Nov 2023 16:05:38 +0000 x86_64 GNU/Linux
tavianator@tachyon $ gcc foo.c -o foo
tavianator@tachyon $ ./foo
readdir(): /tmp/foo.XnIRqj/.
readdir(): /tmp/foo.XnIRqj/..
readdir(): /tmp/foo.XnIRqj/bar
readdir(): /tmp/foo.XnIRqj/bar
readdir(): /tmp/foo.XnIRqj/bar
readdir(): /tmp/foo.XnIRqj/bar
readdir(): /tmp/foo.XnIRqj/bar
readdir(): /tmp/foo.XnIRqj/bar
readdir(): /tmp/foo.XnIRqj/bar
readdir(): /tmp/foo.XnIRqj/bar
...
foo: creat(/tmp/foo.TTL6Fg/bar): Too many open files

POSIX says[1]

> If a file is removed from or added to the directory after the most recent
> call to opendir() or rewinddir(), whether a subsequent call to readdir()
> returns an entry for that file is unspecified.

so this isn't necessarily a *bug*, but I just wanted to point out the
behaviour change.  I only noticed it because it broke one of my tests in
bfs[2] (in a non-default build configuration).

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html
[2]: https://github.com/tavianator/bfs/blob/main/tests/gnu/ignore_readdir_race_notdir.sh




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux