On Tue, Oct 17, 2017 at 3:43 AM, Vijay Chidambaram <vvijay03@xxxxxxxxx> wrote:
> Hi Ted,
>
> This was what we were suspecting as well. If whether or not the bug is
> exposed depends only on the number of transactions in the journal before
> the test workload starts, that's something we could easily capture in
> CrashMonkey. It would be one more parameter to tweak, and we would
> potentially have to get multiple traces of the test workload with the
> journal being X percent full.
>
> It does expand our already-large search space, but our first order of
> business is making sure CrashMonkey can reproduce every crash-consistency
> bug reported in recent times (mostly by Amir :) ). So for now we were just
> analyzing the bug and trying to understand it, but it looks like the
> post-recovery image is not very useful for this.
>
> On Mon, Oct 16, 2017 at 7:09 PM Theodore Ts'o <tytso@xxxxxxx> wrote:
>>
>> On Tue, Oct 17, 2017 at 12:11:40AM +0300, Amir Goldstein wrote:
>> >
>> > The disk image SHOULD reflect the state of the disk after the power was
>> > cut in the middle of a mounted fs. Then power came back on, the
>> > filesystem was mounted, the journal was recovered, and the filesystem
>> > was cleanly unmounted. At this stage, I don't expect there to be
>> > anything interesting in the journal.
>>
>> I suspect what Ashlie was hoping for was a file system image *before*
>> the file system was remounted and the journal replayed (and then
>> truncated). That would allow for an analysis of the image right after the
>> simulated power cut, so it could be seen what was in the journal.
>>
>> The only way to get that is to modify the test so that it aborts
>> before the file system is remounted. I did some investigations where
>> I ran commands (such as 'debugfs -R "logdump -ac" /dev/vdc') before the
>> file system was remounted to gather debugging information. That's how
>> I tracked down the problem. Unfortunately I never bothered to grab a
>> full file system snapshot, so I can't give Ashlie what she's hoping
>> for.
>>

I can provide the desired file system snapshot after the crash, but I will do
more than that. I will provide an I/O recording of the test and explain how
to replay the recording to any given point before the crash.

To produce the recording, I copied this hunk from test 456:

fsxops=$tmp.fsxops
cat <<EOF > $fsxops
write 0x137dd 0xdc69 0x0
fallocate 0xb531 0xb5ad 0x21446
collapse_range 0x1c000 0x4000 0x21446
write 0x3e5ec 0x1a14 0x21446
zero_range 0x20fac 0x6d9c 0x40000 keep_size
mapwrite 0x216ad 0x274f 0x40000
EOF
run_check $here/ltp/fsx -d --replay-ops $fsxops $SCRATCH_MNT/testfile

into test 455 before the line NUM_FILES=4, and changed that line to
NUM_FILES=0 to suppress the randomized fsx runs.

Now test 455 fails and leaves us a recording of all I/O on the $LOGWRITES_DEV
device. As we can see in 455.full, the consistency check fails at the
"pre umount" checkpoint, which test 455 has marked in the log device with the
mark "last":

checking pre umount ("_log_writes_replay_log last" in the test)
_check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent

To export the recording from my log device I ran:

# ./src/log-writes/replay-log --log $LOGWRITES_DEV --find --end-mark last
17098@153852

Meaning that the "last" mark is record #17098 at sector offset 153852 on the
log device, so I can export the content of the log up to and including this
offset:

# dd if=$LOGWRITES_DEV of=ext4-crash.log bs=512 count=153853

The exported log is 76M, compressed to 34K and attached to this message.
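For the record, the export step can also be scripted. This is only a rough
sketch, assuming the entry@sector output format shown above:

end=$(./src/log-writes/replay-log --log $LOGWRITES_DEV --find --end-mark last)
end_sector=${end#*@}   # the part after '@', i.e. 153852 in my case
dd if=$LOGWRITES_DEV of=ext4-crash.log bs=512 count=$((end_sector + 1))
bzip2 ext4-crash.log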
To replay this log on your device and reproduce the fsck error, you need a
$SCRATCH_DEV that is at least 10G. Then run:

# bunzip2 ext4-crash.log.bz2
# ./src/log-writes/replay-log --log ext4-crash.log --replay $SCRATCH_DEV --end-mark last

Now you can investigate the state of the file system right after the crash.

More importantly, you can use the --check [<number>|flush|fua] command line
option of replay-log to run an arbitrary check utility on the replay device
at any number of checkpoints before the crash. You can use --limit and
--start-entry to fast forward to a given point in the recording and replay
with consistency checks from that point on, and so forth.

Finally, see that replaying the recording to the "last" mark reproduces the
problem:

# mount $SCRATCH_DEV $SCRATCH_MNT
# umount $SCRATCH_MNT
# e2fsck -nf $SCRATCH_DEV

Hope this helps. I guess I could have saved Ted some work had I provided this
guide and log sooner, but I thought the bug might be trivial enough to
understand with just the dm-flakey reproducer - I was wrong about that.

Cheers,
Amir.
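P.S. To make the --limit/--start-entry workflow above more concrete, here is
a rough sketch; the entry number 17000 is only an example, and the
intermediate inspection with debugfs is read-only, so the replay can be
continued from the same entry afterwards:

# ./src/log-writes/replay-log --log ext4-crash.log --replay $SCRATCH_DEV --limit 17000
# debugfs -R "logdump -ac" $SCRATCH_DEV
# ./src/log-writes/replay-log --log ext4-crash.log --replay $SCRATCH_DEV --start-entry 17000 --end-mark last
# mount $SCRATCH_DEV $SCRATCH_MNT
# umount $SCRATCH_MNT
# e2fsck -nf $SCRATCH_DEV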
Attachment:
ext4-crash.log.bz2
Description: BZip2 compressed data