Crash consistency bug in ext4 - interaction between delalloc and fzero

Hi,

We've encountered what seems to be a crash consistency bug in
ext4 (kernel 4.15), caused by the interaction between a delayed
allocation (delalloc) write and an unaligned fallocate (zero range).
Say we create a disk image with known data and quick format it.
1. Now write 65K of data to a new file
2. Zero out part of the above file using fallocate with
FALLOC_FL_ZERO_RANGE over the byte range (60K+128) to
(60K+128+4096), which is not block aligned
3. fsync the above file
4. Crash
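
The steps above can be sketched as a small Python script. This is only
an illustration, not the CrashMonkey test itself: the function name,
the path argument, the fill byte, and the use of ctypes to reach
fallocate(2) are assumptions, and it assumes a 64-bit Linux system
whose libc exports fallocate. Whether the write in step 1 is actually
delayed-allocated depends on how the filesystem is mounted.

```python
import ctypes
import os

KIB = 1024
FILE_SIZE = 65 * KIB            # step 1: 65K of user data
ZERO_OFF = 60 * KIB + 128       # step 2: unaligned start of the zeroed range
ZERO_LEN = 4096                 # step 2: length of the zeroed range
FALLOC_FL_ZERO_RANGE = 0x10     # value from <linux/falloc.h>


def run_workload(path, fill=b"\xab"):
    """Write 65K, zero an unaligned 4K range, then fsync (steps 1 to 3)."""
    # Assumption: off_t is 64 bits wide (64-bit Linux).
    libc = ctypes.CDLL(None, use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    libc.fallocate.restype = ctypes.c_int

    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o644)
    try:
        os.write(fd, fill * FILE_SIZE)      # buffered write of known data
        if libc.fallocate(fd, FALLOC_FL_ZERO_RANGE, ZERO_OFF, ZERO_LEN) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        os.fsync(fd)                        # step 3
    finally:
        os.close(fd)
```

Running run_workload() on a file inside a freshly formatted ext4 mount
gives the state just before the crash in step 4; the crash itself has
to be simulated at the block layer.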

If we crash after the fsync, and allow reordering of the block IOs
between two flush/FUA commands using CrashMonkey[1], then we can end
up zeroing the file range from (64K+128) to 65K, which should be
untouched by the fallocate command. We expect this region to contain
the user data written in step 1 above.
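
The byte ranges involved can be checked with a short sketch. It builds
the file contents we expect after the fsync and reports any region of
a recovered file that reads as zero but should still hold user data.
The helper names and the 0xAB fill byte are hypothetical and chosen
only for illustration.

```python
KIB = 1024
FILE_SIZE = 65 * KIB
ZERO_START = 60 * KIB + 128
ZERO_END = ZERO_START + 4096    # = 64K + 128


def expected_image(fill=0xAB):
    """File contents expected after the workload: data with one zeroed range."""
    data = bytearray([fill]) * FILE_SIZE
    data[ZERO_START:ZERO_END] = bytes(ZERO_END - ZERO_START)
    return bytes(data)


def unexpected_zero_ranges(recovered, fill=0xAB):
    """Return (start, end) ranges that are zero but should hold user data."""
    expected = expected_image(fill)
    ranges, start = [], None
    for i, want in enumerate(expected):
        got = recovered[i] if i < len(recovered) else None
        lost = want == fill and got == 0
        if lost and start is None:
            start = i
        elif not lost and start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, len(expected)))
    return ranges
```

For a recovered file in the state described above, this returns
[(65664, 66560)], that is, (64K+128) to 65K; for a correct file it
returns an empty list.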

This workload was inspired by xfstest/generic_042, which tests for
stale data exposure using aligned fallocate commands. It's worth
noting that f2fs and btrfs pass our test cleanly - irrespective of the
order of bios, user data is intact in these filesystems.

To reproduce this bug using CrashMonkey, simply run:
./c_harness -f /dev/sda -d /dev/cow_ram0 -t ext4 -e 10240 -s 1000 -v \
    tests/generic_042/generic_042_fzero_unaligned.so

and take a look at the <timestamp>-generic_042_fzero_unaligned.log
created in the build directory. This file has the list of block IOs
issued during the workload and the permutation of bios that leads to
this bug. You can also verify using blktrace that CrashMonkey only
reorders bios between two barrier operations (so such a crash state
could be encountered due to reordering of blocks in the storage
stack). Note that tools like dm-log-writes cannot capture this bug
because it arises only when blocks are reordered between barrier
operations.
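
The reordering constraint described above can be illustrated with a
simplified model. This is not CrashMonkey's implementation: the
(name, is_barrier) tuples and both helper functions are hypothetical,
and the sketch covers only the rule that bios may be permuted inside
the span between two barriers, never across one.

```python
from itertools import permutations, product


def split_epochs(bios):
    """Group bios into epochs; each barrier (flush/FUA) closes an epoch."""
    epochs, writes = [], []
    for name, is_barrier in bios:
        if is_barrier:
            epochs.append((writes, name))
            writes = []
        else:
            writes.append(name)
    if writes:
        epochs.append((writes, None))
    return epochs


def reorderings(bios):
    """Yield every ordering that permutes bios only inside their own epoch."""
    choices = []
    for writes, barrier in split_epochs(bios):
        tail = [barrier] if barrier is not None else []
        choices.append([list(p) + tail for p in permutations(writes)])
    for combo in product(*choices):
        yield [name for epoch in combo for name in epoch]
```

For example, with two writes before a flush and one after it, only the
first two writes can swap places.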

This seems to be a bug, as it zeroes out user data that is not
supposed to be zeroed by the fallocate command.
Let me know if I am missing some detail here.

[1] https://github.com/utsaslab/crashmonkey.git

Thanks,
Jayashree Mohan


