Re: [PATCH] fstests: generic, fsync fuzz tester with fsstress

Filipe Manana <fdmanana@xxxxxxxxxx> · Thu, 16 May 2019 10:54:57 +0100

On Thu, May 16, 2019 at 10:30 AM Theodore Ts'o <tytso@xxxxxxx> wrote:
>
> On Wed, May 15, 2019 at 04:02:21PM +0100, fdmanana@xxxxxxxxxx wrote:
> > From: Filipe Manana <fdmanana@xxxxxxxx>
> >
> > Run fsstress, fsync every file and directory, simulate a power failure and
> > then verify that all files and directories exist, with the same data and
> > metadata they had before the power failure.
> >
> > This test has already found 2 bugs in btrfs: the mtime and ctime of
> > directories were not preserved after replaying the log/journal, and a
> > directory's attributes (such as UID and GID) were lost after replaying
> > the log. The patches that fix the btrfs issues are titled:
> >
> >   "Btrfs: fix wrong ctime and mtime of a directory after log replay"
> >   "Btrfs: fix fsync not persisting changed attributes of a directory"
> >
> > Running this test 1000 times:
> >
> > - on ext4 it has hit about a dozen journal checksum errors (on a 5.0
> >   kernel), each causing a failure to mount the filesystem after the
> >   power failure simulated with dmflakey, and producing the following
> >   error in dmesg/syslog:
> >
> >     [Mon May 13 12:51:37 2019] JBD2: journal checksum error
> >     [Mon May 13 12:51:37 2019] EXT4-fs (dm-0): error loading journal
>
> I'm curious what configuration you used when you ran the test.  I

Default configuration, MKFS_OPTIONS="" and MOUNT_OPTIONS="", 5.0 kernel.

I have logs with all the fsstress seed values kept around.

From one of the failures, the .full file:

Discarding device blocks: done
Creating filesystem with 5242880 4k blocks and 1310720 inodes
Filesystem UUID: 4bb2559c-12ea-45fa-810e-00c513b00dee
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

Running fsstress with arguments: -p 4 -n 100 -d /home/fdmanana/btrfs-tests/scratch_1/test -f mknod=0 -f symlink=0
seed = 1558078129
_check_generic_filesystem: filesystem on /dev/sdc is inconsistent
*** fsck.ext4 output ***
fsck from util-linux 2.29.2
e2fsck 1.43.4 (31-Jan-2017)
Journal superblock is corrupt.
Fix? no

fsck.ext4: The journal superblock is corrupt while checking journal for /dev/sdc
e2fsck: Cannot proceed with file system check

/dev/sdc: ********** WARNING: Filesystem still has errors **********

*** end fsck.ext4 output
*** mount output ***
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=788996k,mode=755)
/dev/sda1 on / type ext4 (rw,relatime,discard,errors=remount-ro)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=40,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=1624)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
mqueue on /dev/mqueue type mqueue (rw,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=788992k,mode=700,uid=1000,gid=1000)
tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime)
*** end mount output

I haven't tried ext4 with only 1 process (instead of 4), but I can
check whether it also happens without concurrency.
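
A rough sketch of how such a single-process rerun could be set up from
the logged arguments and seed. The helper name fsstress_cmd and the
binary path ltp/fsstress are assumptions made for illustration and are
not taken from the test itself; -s is fsstress's seed option. With
several processes the interleaving of operations can differ between
runs, so reusing a seed may not reproduce the exact same sequence.

```python
import shlex


def fsstress_cmd(test_dir, seed, nprocs=4, nops=100, binary="ltp/fsstress"):
    """Build an fsstress argument list mirroring the logged run.

    'binary' is an assumed location; adjust it to the local fstests tree.
    """
    return [
        binary,
        "-p", str(nprocs),   # number of concurrent processes
        "-n", str(nops),     # operations per process
        "-d", test_dir,      # directory to operate in
        "-f", "mknod=0",     # disable mknod, as in the logged run
        "-f", "symlink=0",   # disable symlink, as in the logged run
        "-s", str(seed),     # reuse the seed recorded in the .full file
    ]


# Same seed and options as the failing run above, but with 1 process.
cmd = fsstress_cmd(
    "/home/fdmanana/btrfs-tests/scratch_1/test", 1558078129, nprocs=1
)
print(shlex.join(cmd))
```

Passing the returned list to subprocess.run() avoids any shell quoting
issues; the printed form is only meant for pasting into a terminal.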

> tried to reproduce it, and had no luck:
>
> TESTRUNID: tytso-20190516042341
> KERNEL:    kernel 5.1.0-rc3-xfstests-00034-g0c72924ef346 #999 SMP Wed May 15 00:56:08 EDT 2019 x86_64
> CMDLINE:   -c 4k -C 1000 generic/547
> CPUS:      2
> MEM:       7680
>
> ext4/4k: 1000 tests, 1855 seconds
> Totals: 1000 tests, 0 skipped, 0 failures, 0 errors, 1855s
>
> FSTESTPRJ: gce-xfstests
> FSTESTVER: blktests baccddc (Wed, 13 Mar 2019 00:06:50 -0700)
> FSTESTVER: fio  fio-3.2 (Fri, 3 Nov 2017 15:23:49 -0600)
> FSTESTVER: fsverity bdebc45 (Wed, 5 Sep 2018 21:32:22 -0700)
> FSTESTVER: ima-evm-utils 0267fa1 (Mon, 3 Dec 2018 06:11:35 -0500)
> FSTESTVER: nvme-cli v1.7-35-g669d759 (Tue, 12 Mar 2019 11:22:16 -0600)
> FSTESTVER: quota  62661bd (Tue, 2 Apr 2019 17:04:37 +0200)
> FSTESTVER: stress-ng 7d0353cf (Sun, 20 Jan 2019 03:30:03 +0000)
> FSTESTVER: syzkaller bab43553 (Fri, 15 Mar 2019 09:08:49 +0100)
> FSTESTVER: xfsprogs v5.0.0 (Fri, 3 May 2019 12:14:36 -0500)
> FSTESTVER: xfstests-bld 9582562 (Sun, 12 May 2019 00:38:51 -0400)
> FSTESTVER: xfstests linux-v3.8-2390-g64233614 (Thu, 16 May 2019 00:12:52 -0400)
> FSTESTCFG: 4k
> FSTESTSET: generic/547
> FSTESTOPT: count 1000 aex
> GCE ID:    8592267165157073108