Re: Regression in handling power cuts since 3a1e819b4e80 ("ovl: store file handle of lower inode on copy up")

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 22 Oct 2018 at 10:57, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> On Mon, Oct 22, 2018 at 11:26 AM Richard Weinberger <richard@xxxxxx> wrote:
> >
> > Am Montag, 22. Oktober 2018, 09:14:08 CEST schrieb Rafał Miłecki:
> > > On Fri, 19 Oct 2018 at 14:31, Rafał Miłecki <zajec5@xxxxxxxxx> wrote:
> > > > Since OpenWrt switch from kernel 4.9 to 4.14 users started randomly
> > > > reporting file system corruptions. OpenWrt uses overlay(fs) with
> > > > squashfs as lowerdir and ubifs as upperdir. Russell managed to isolate
> > > > & describe test case for reproducing corruption when doing a power cut
> > > > after first boot.
> > > >
> > > > (...)
> > > >
> > > > Can I ask you to check if there is something possibly wrong with the
> > > > above ovl commit? Or does it expose some problem with the ubifs? Or
> > > > maybe the whole UBI?
> > > >
> > > > FWIW testing above commit (and one before it) always results in single
> > > > error in the kernel log:
> > > > [   14.250184] UBIFS error (ubi0:1 pid 637): ubifs_add_orphan: orphaned twice
> > > >
> > > > That UBIFS error doesn't occur with 4.12.14. Unfortunately it's
> > > > impossible to cleanly revert 3a1e819b4e80 from the top of 4.12.14.
> > >
> > > Let me provide a summary of all relevant commits & tests:
> > >
> > > By "Corruption" I mean file system corruption after power cut
> >
> > Well, is the filesystem not consistent anymore?
> > From what Russel explained to me, I thought the main problem is that no write back happens.
> > IOW the inode is present, has correct length, but no content is there (all zeros).
> >
> > Just like the typical case where userspace does not fsync.
> > But in your case sooner or later write back should have happened because the writeback timer
> > fires at some point.
> >
>
> For the records overlayfs does:
> - open(O_TMPFILE)
> - setxattr() [with 3a1e819b4e80]
> - write to tmpfile
> - fsync tmpfile
> - link tmpfile
>
> I suggest that you try the same from user space on ubifs.

Are you 100% sure about it? I tried writing C app behaving as you
described above and I could not reproduce the problem.

Then I took a close look at ovl_copy_up_locked() and it seems above
info isn't accurate. It seems to me that setxattr() happens between
fsync and link. I modified my C app to follow that order (open, write,
fsync, setxattr, link) and I can reproduce the problem now!

Steps to reproduce the problem:
1) compile tmptest.c
2) tmptest /overlay/upper/foo.txt user.bar baz
3) wait 5 seconds (so ubifs writes to flash)
4) power cut
5) boot again and check content of /overlay/upper/foo.txt
6) in my case content appears to be 00 00 00 00

Is this correct for ovl to call vfs_setxattr() *after* calling vfs_fsync()?

-- 
Rafał
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <linux/limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>
#include <unistd.h>

static const char foo[] = "foo\n";

int main(int argc, char **argv) {
	char path[PATH_MAX];
	char *dir;
	int err;
	int fd;

	if (argc < 2) {
		fprintf(stderr, "You have to pass file as argument\n");
		return -ENOENT;
	}

	dir = strdup(argv[1]);
	if (!dir) {
		return -ENOMEM;
	}
	dir = dirname(dir);
	fd = open(dir, O_TMPFILE | O_RDWR);
	if (fd < 0) {
		err = -errno;
		fprintf(stderr, "Failed to open O_TMPFILE: %d\n", err);
		return err;
	}

#if 0
	if (argc >= 4) {
		if (fsetxattr(fd, argv[2], argv[3], strlen(argv[3]), 0) != 0) {
			err = -errno;
			fprintf(stderr, "Failed to fsetxattr(): %d\n", err);
			return err;
		}
		printf("Wrote xattr \"%s\" value \"%s\"\n", argv[2], argv[3]);
	}
#endif

	if (write(fd, foo, sizeof(foo)) != sizeof(foo)) {
		err = -errno;
		fprintf(stderr, "Failed to write(): %d\n", err);
		return err;
	}

	if (fsync(fd) < 0) {
		err = -errno;
		fprintf(stderr, "Failed to fsync(): %d\n", err);
		return err;
	}

#if 1
	if (argc >= 4) {
		if (fsetxattr(fd, argv[2], argv[3], strlen(argv[3]), 0) != 0) {
			err = -errno;
			fprintf(stderr, "Failed to fsetxattr(): %d\n", err);
			return err;
		}
		printf("Wrote xattr \"%s\" value \"%s\"\n", argv[2], argv[3]);
	}
#endif

	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
	if (linkat(-1, path, AT_FDCWD, argv[1], AT_SYMLINK_FOLLOW) != 0) {
		err = -errno;
		fprintf(stderr, "Failed to linkat(): %d\n", err);
		return err;
	}

	printf("Success!\n");

	return 0;
}

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux