Re: [PATCH] ext4: fix interaction between i_size, fallocate, and delalloc after a crash

Amir Goldstein <amir73il@xxxxxxxxx> · Tue, 17 Oct 2017 00:11:40 +0300

On Mon, Oct 16, 2017 at 10:32 PM, Ashlie Martinez <ashmrtn@xxxxxxxxxx> wrote:
> Amir,
>
> I know this is a bit late, but I've spent some time working through
> the disk image that you provided (so that I could determine how/if I
> could modify CrashMonkey to catch errors like this) and I don't think
> I understand what state the disk image reflects.

The disk image SHOULD reflect the on-disk state after power was cut
while the fs was mounted. Power then came back on, the filesystem was
mounted, the journal was recovered, and the filesystem was cleanly
unmounted. At this stage, I don't expect there to be anything interesting
in the journal.

> After digging around
> the journal of the disk image you provided, I found that the first 10
> journal blocks are used, with the journal superblock being placed in
> the very first block of the journal. The journal superblock says that
> the first journal transaction ID that should be in the journal is
> transaction ID 4. However, dumping the other journal blocks, I found
> that the next block is a descriptor block for transaction ID 2. The
> rest of the journal blocks are data blocks for that transaction plus a
> transaction commit block. This seems a little odd considering that the
> journal refers to a 4th transaction, which I have not been able to
> find (I quickly dumped the first 50 blocks in debugfs and found the
> rest to contain only zeros).
>

I did not spend time analyzing the image, so I'll take your word for it,
but I can't help you understand your findings.
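
For anyone who wants to repeat the inspection described above, here is a
minimal sketch of decoding jbd2 journal blocks that were dumped to raw
bytes. It assumes the standard jbd2 on-disk layout (big-endian 12-byte
block header, then the superblock fields); the function names are
hypothetical and this has not been run against the image in question.

```python
import struct

JBD2_MAGIC = 0xC03B3998
BLOCK_TYPES = {
    1: "descriptor",
    2: "commit",
    3: "superblock v1",
    4: "superblock v2",
    5: "revoke",
}

def decode_header(block: bytes):
    """Return (type_name, sequence), or None if the block has no jbd2 magic.

    Journaled data blocks are raw copies without a header, so they
    normally decode to None here.
    """
    if len(block) < 12:
        return None
    magic, blocktype, sequence = struct.unpack(">III", block[:12])
    if magic != JBD2_MAGIC:
        return None
    return BLOCK_TYPES.get(blocktype, f"unknown({blocktype})"), sequence

def decode_superblock(block: bytes):
    """Return the fields that follow the header in the journal superblock."""
    blocksize, maxlen, first, sequence, start = struct.unpack(">IIIII", block[12:32])
    return {
        "s_blocksize": blocksize,
        "s_maxlen": maxlen,
        "s_first": first,        # first usable block of the log area
        "s_sequence": sequence,  # first transaction ID expected in the log
        "s_start": start,        # block number where the log starts
    }
```

Comparing s_sequence from the superblock against the sequence numbers in
the descriptor and commit blocks is the same check done by hand above.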

> With this in mind, I looked back at the xfstests code for controlling
> the dm_flakey device. What I realized is the `nolockfs` flag is
> provided both when it switches from the real device to the flakey
> device that drops writes and when it switches from the flakey device
> back to the real device. I know there is a call to umount once the
> flakey device that drops writes is inserted, but do you think it is
> possible that the flakey device is swapped back to the real device
> before all the writes forced out by umount have made it to the flakey
> device?

I believe the umount call should block until all writes have been flushed
out to the flakey device.

> Unfortunately I still don't have a local machine that is
> capable of reproducing your test results and I have not made any gce
> test appliance images to test this yet, so I'm not sure if this is a
> valid theory.
>

Ted explained that the bug is related to very specific timing of the
flusher thread vs. the fallocate thread.
I was under the impression that CrashMonkey can only reorder writes
between recorded FLUSH requests, so I am not really sure how you intend to
modify CrashMonkey to catch this bug.
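
To make concrete what I mean by reordering only between FLUSH requests,
here is a toy model. It is a sketch of my understanding, not
CrashMonkey's actual code, and it assumes a recorded trace is just a list
of write labels with "FLUSH" markers; all names are made up.

```python
from itertools import permutations

FLUSH = "FLUSH"

def split_epochs(trace):
    """Split a recorded trace into lists of writes separated by FLUSH."""
    epochs, current = [], []
    for op in trace:
        if op == FLUSH:
            epochs.append(current)
            current = []
        else:
            current.append(op)
    if current:
        epochs.append(current)
    return epochs

def crash_states(trace):
    """Yield each distinct sequence of writes that may have reached disk.

    Writes from epochs before the crash point are all present, in order.
    Within the epoch where the crash happens, any prefix of any
    permutation of its writes may be present.
    """
    seen = set()
    durable = []
    for epoch in split_epochs(trace):
        for perm in permutations(epoch):
            for cut in range(len(perm) + 1):
                state = tuple(durable) + perm[:cut]
                if state not in seen:
                    seen.add(state)
                    yield state
        durable.extend(epoch)
```

With the trace ["A", "B", FLUSH, "C"], this model can produce a state
containing only B, but never one containing C without both A and B. Every
state is built from writes that were already recorded, so a write that
was never issued because of thread timing cannot show up in any of them.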

Cheers,
Amir.