Re: want-replacement got stuck?

NeilBrown <neilb@xxxxxxx> · Thu, 22 Nov 2012 13:15:45 +1100

On 21 Nov 2012 11:33:00 -0500 "George Spelvin" <linux@xxxxxxxxxxx> wrote:

> Just to follow up to that earlier complaint, ext4 is now noticing some errors:
> 
> Nov 21 06:21:53 science kernel: EXT4-fs error (device md5): ext4_find_entry:1234: inode #5881516: comm rsync: checksumming directory block 0
> Nov 21 07:57:03 science kernel: EXT4-fs error (device md5): ext4_validate_block_bitmap:353: comm flush-9:5: bg 4206: bad block bitmap checksum
> Nov 21 08:41:37 science kernel: EXT4-fs error (device md5): ext4_validate_block_bitmap:353: comm flush-9:5: bg 3960: bad block bitmap checksum
> Nov 21 08:45:18 science kernel: EXT4-fs error (device md5): ext4_validate_block_bitmap:353: comm flush-9:5: bg 4737: bad block bitmap checksum
> Nov 21 08:50:16 science kernel: EXT4-fs error (device md5): ext4_mb_generate_buddy:741: group 4206, 5621 clusters in bitmap, 6888 in gd
> Nov 21 08:50:16 science kernel: JBD2: Spotted dirty metadata buffer (dev = md5, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
> Nov 21 15:50:29 science kernel: EXT4-fs error (device md5): ext4_validate_block_bitmap:353: comm python: bg 4138: bad block bitmap checksum
> Nov 21 16:21:00 science kernel: UDP: bad checksum. From 187.194.52.187:65535 to 71.41.210.146:6881 ulen 70
> 
> I also experienced transient corruption of the last few K of my incoming mailbox.  (I.e. the last
> couple of messages were overwritten with some other text file.  This morning, it's fine.)
> 
> Something is definitely wonky here...  I'm leaving it in the "stuck" state for a while
> in case there's useful debugging info to be extracted, but I'm getting very alarmed by these
> messages and want to reboot soon.

Yes.... this is a real worry.  Fortunately I know what is causing it.

The code for writing to a RAID10 naively assumes that if the 'main' device in
a slot is faulty, then there isn't any replacement device to write to either.

This is normally the case, as a faulty device will be promptly removed - or at
least it should be.  As you've already discovered, sometimes it isn't prompt.

But even if it were, there could be races so that the main device fails just
as we look at it, and then the replacement couldn't possibly have been moved
down yet.

Meanwhile you have a corrupted filesystem.  Sorry.
The nature of the corruption is that, since the replacement finished, no writes
have gone to slot 3 at all.  So if md ever decides to read from slot 3 it
will get stale data.
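
To make the failure mode concrete, here is a toy model of a single slot.  It
is only a sketch and not the actual raid10 code: the struct and function names
are invented for illustration.  The "buggy" variant gives up on the whole slot
as soon as the main device is faulty, so a healthy replacement never sees the
write; the "checked" variant looks at each device on its own.

```c
#include <stdbool.h>

/* Hypothetical model of one array slot; none of these names come from md. */
struct dev {
    bool present;   /* a device is attached in this position */
    bool faulty;    /* the device has been marked faulty */
    int  data;      /* last value written, standing in for block contents */
};

struct slot {
    struct dev main;         /* the original device in the slot */
    struct dev replacement;  /* the device being swapped in */
};

static bool usable(const struct dev *d)
{
    return d->present && !d->faulty;
}

/*
 * Naive write: a faulty main device is taken to mean there is nothing
 * to write to in this slot, so the replacement is skipped too.
 * Returns the number of devices written.
 */
static int write_slot_buggy(struct slot *s, int value)
{
    int writes = 0;

    if (!usable(&s->main))
        return 0;           /* replacement is never considered */

    s->main.data = value;
    writes++;
    if (usable(&s->replacement)) {
        s->replacement.data = value;
        writes++;
    }
    return writes;
}

/*
 * Checked write: main and replacement are tested independently, so a
 * healthy replacement is written even when main has failed.
 */
static int write_slot_checked(struct slot *s, int value)
{
    int writes = 0;

    if (usable(&s->main)) {
        s->main.data = value;
        writes++;
    }
    if (usable(&s->replacement)) {
        s->replacement.data = value;
        writes++;
    }
    return writes;
}
```

With main faulty and the replacement healthy, write_slot_buggy() writes to
nothing and the replacement keeps its old contents, which is the stale data
a later read from that slot would return.  write_slot_checked() updates the
replacement in the same situation.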

I suggest you fail sdd2, reboot, make sure only sda2, sdb2 and sde2 are in the
array, run fsck, and then if it seems happy enough, add sdc2 and/or sdd2 back
in so they rebuild completely.

Thanks for helping to make md better by risking your data :-)

NeilBrown