Re: breaking ext4 to test recovery

On 2011-03-29, at 3:50 AM, Eric Sandeen wrote:
> On 3/28/11 9:45 PM, Daniel Taylor wrote:
>> I would like to be able to break our ext4 file system
>> (specifically corrupt the journal) to be sure that we
>> can automatically notice the problem and attempt an
>> autonomous fix.
>> 
>> dumpe2fs tells me the inode, but not, that I can see, the
>> blocks where the journal exists (for "dd"ing junk to it).
>> 
>> Is there any debug tool that would let me deliberately
>> break the file system (at least, trash the journal)?
>> 
>> If not, is there a hint for figuring out the block(s) of
>> the journal so I can stomp it?
>> 
>> The kernel is in an embedded machine, so it's a little old
>> 2.6.32.11 and e2fsprogs/libs 1.41.12-2 (Lenny)
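
On the "which blocks" question, a minimal sketch, assuming an internal journal on the inode that dumpe2fs reports (normally inode 8). debugfs can print that inode's block map, and dd with conv=notrunc can then overwrite chosen blocks. The helper name stomp_blocks, the device /dev/sdX1 and the 4096-byte default block size are placeholders, not anything from this thread; check the real block size with dumpe2fs -h first.

```shell
#!/bin/sh
# Sketch only. stomp_blocks is a hypothetical helper: it overwrites COUNT
# blocks of TARGET with random data, starting at START_BLOCK, without
# truncating the target.
# usage: stomp_blocks TARGET START_BLOCK COUNT [BLOCK_SIZE]
stomp_blocks() {
    dd if=/dev/urandom of="$1" bs="${4:-4096}" seek="$2" count="$3" \
       conv=notrunc 2>/dev/null
}

# Finding the journal blocks (run as root on an unmounted test device;
# /dev/sdX1 is a placeholder and its contents will be damaged):
#   debugfs -R "stat <8>" /dev/sdX1     # block/extent list of the journal inode
#   debugfs -R "bmap <8> 0" /dev/sdX1   # physical block of journal logical block 0
#   stomp_blocks /dev/sdX1 PHYSICAL_BLOCK 1 BLOCK_SIZE
```

Logical block 0 of the journal holds the journal superblock, so stomping it tests a different failure than stomping a later block that holds logged transactions.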
> 
> But are you trying to test in-kernel recovery, or e2fsck, after
> you corrupt the journal?  Or both?
> 
> I assume you'd start with a filesystem with a dirty log,
> corrupt that log, and then what, fsck it, or try to mount it?
> 
> How are you generating your fs w/ dirty log?
> 
> (xfs has an ioctl to abruptly "stop" the fs as if it had crashed,
> that would be very useful in extN as well).

We have a kernel patch, "dev_read_only", that we use with Lustre to disable writes to a block device while the device is in use.  This allows simulating crashes at arbitrary points in the code or in test scripts.  It is based on the test harness that Andrew Morton used for ext3 recovery testing back when ext3 was being ported to the 2.4 kernel.

http://git.whamcloud.com/?p=fs/lustre-release.git;a=blob_plain;f=lustre/kernel_patches/patches/dev_read_only-2.6.32-rhel6.patch;hb=HEAD

The best part of this patch is that it works with any block device, can simulate power failure without any need for automated power control, and lets the block device be re-activated safely once it is unused (all buffers and references dropped).

> Another thing which could use lots more testing in the wild is
> simple journal recovery; nothing is corrupted, but the drive got
> unplugged or the system lost power while the fs was under load;
> see if a mount; umount; fsck and/or if a fsck; mount; umount; fsck finds
> errors.
> 
> (the former will test in-kernel log recovery, the latter will test
> log recovery in e2fsck).
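
The two sequences above can be sketched as a pair of shell functions. This is only an illustration under stated assumptions: DEV, MNT and RUN are made-up placeholder variables, the function names are hypothetical, and the e2fsck flags (-fy for the replaying pass, -fn for the read-only verification pass) are one reasonable choice rather than anything prescribed in this thread. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: DEV and MNT are hypothetical placeholders.
DEV=${DEV-/dev/sdX1}
MNT=${MNT-/mnt/test}
# RUN defaults to "echo" (dry run). Set RUN= (empty) to really execute, as root.
RUN=${RUN-echo}

# Sequence A: mount; umount; fsck.
# The kernel replays the journal at mount time, then e2fsck verifies the result.
kernel_replay_check() {
    $RUN mount -t ext4 "$DEV" "$MNT"
    $RUN umount "$MNT"
    $RUN e2fsck -fn "$DEV"
}

# Sequence B: fsck; mount; umount; fsck.
# e2fsck replays the journal first, then a mount cycle and a second check.
e2fsck_replay_check() {
    $RUN e2fsck -fy "$DEV"
    $RUN mount -t ext4 "$DEV" "$MNT"
    $RUN umount "$MNT"
    $RUN e2fsck -fn "$DEV"
}
```

When run for real, each function returns the exit status of its final e2fsck -fn: 0 means no errors were found, and 4 means errors were found and left uncorrected.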

Cheers, Andreas




