Re: raid5 (re)-add recovery data corruption

On Sat, 21 Jun 2014 00:31:39 -0500 Bill <billstuff2001@xxxxxxxxxxxxx> wrote:

> Hi Neil,
> 
> I'm running a test on 3.14.8 and seeing data corruption after a recovery.
> I have this array:
> 
>      md5 : active raid5 sdc1[2] sdb1[1] sda1[0] sde1[4] sdd1[3]
>            16777216 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
>            bitmap: 0/1 pages [0KB], 2048KB chunk
> 
> with an xfs filesystem on it:
>      /dev/md5 on /hdtv/data5 type xfs (rw,noatime,barrier,swalloc,allocsize=256m,logbsize=256k,largeio)
> 
> and I do this in a loop:
> 
> 1. start writing 1/4 GB files to the filesystem
> 2. fail a disk. wait a bit
> 3. remove it. wait a bit
> 4. add the disk back into the array
> 5. wait for the array to sync and the file writes to finish
> 6. checksum the files.
> 7. wait a bit and do it all again
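> 
> (Roughly, this loop is a shell script along the lines below - the paths,
> sleeps and reference file are illustrative, not my exact script:)
> 
>      #!/bin/sh
>      # illustrative reconstruction of the fail / re-add stress loop above
>      SRC=/path/to/reference-256M-file       # hypothetical file with a known checksum
>      while true; do
>          for i in 1 2 3; do
>              cp "$SRC" /hdtv/data5/test$i &         # step 1: write 1/4 GB files
>          done
>          mdadm /dev/md5 -f /dev/sdc1; sleep 10      # steps 2-3: fail a disk, wait,
>          mdadm /dev/md5 -r /dev/sdc1; sleep 10      #            remove it, wait
>          mdadm /dev/md5 -a /dev/sdc1                # step 4: add it back
>          while grep -q recovery /proc/mdstat; do    # step 5: wait for the sync
>              sleep 5
>          done
>          wait                                       #         ... and for the writers
>          for i in 1 2 3; do                         # step 6: checksum the files
>              cmp "$SRC" /hdtv/data5/test$i || echo "QC FAILED: test$i"
>          done
>          sleep 30                                   # step 7: wait a bit, repeat
>      done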
> 
> The checksum QC will eventually fail, usually after a few hours.
> 
> My last test failed after 4 hours:
> 
>      18:51:48 - mdadm /dev/md5 -f /dev/sdc1
>      18:51:58 - mdadm /dev/md5 -r /dev/sdc1
>      18:52:06 - start writing 3 files
>      18:52:08 - mdadm /dev/md5 -a /dev/sdc1
>      18:52:18 - array recovery done
>      18:52:23 - writes finished. QC failed for one of three files.
> 
> dmesg shows no errors and the disks are operating normally.
> 
> If I "check" /dev/md5 it shows mismatch_cnt = 896
> If I dump the raw data on sd[abcde]1 underneath the bad file, it shows
> sd[abde]1 are correct, and sdc1 has some chunks of old data from a 
> previous file.
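> 
> (The "check" and the raw dump were roughly of this form - the dump offset
> below is a placeholder; in practice it comes from xfs_bmap output on the
> bad file plus the 64k chunk layout:)
> 
>      echo check > /sys/block/md5/md/sync_action
>      cat /sys/block/md5/md/mismatch_cnt        # reported 896
>      xfs_bmap -v /hdtv/data5/badfile           # locate the file's extents on md5
>      dd if=/dev/sdc1 bs=64k skip=<chunk-offset> count=1 | hexdump -C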
> 
> If I fail sdc1, --zero-superblock it, and add it, it then syncs and the 
> QC is correct.
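> 
> (i.e. roughly:
> 
>      mdadm /dev/md5 -f /dev/sdc1
>      mdadm /dev/md5 -r /dev/sdc1
>      mdadm --zero-superblock /dev/sdc1
>      mdadm /dev/md5 -a /dev/sdc1
> 
> which forces a full rebuild of sdc1 rather than a bitmap-based catch-up.)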
> 
> So somehow it seems like md is losing track of some changes which need to
> be written to sdc1 during the recovery. But only rarely - in this case it
> failed after 175 cycles.
> 
> Do you have any idea what could be happening here?

No.  As you say, it looks like md is not setting a bit in the bitmap
correctly, or ignoring one that is set, or maybe clearing one that shouldn't
be cleared.  The last is the most likely, I would guess.
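For what it's worth, if you can catch the array while it is still in the bad
state, the on-disk bitmap of a member can be dumped with --examine-bitmap,
e.g. something like:

     mdadm -X /dev/sdc1     # shows the bitmap chunk size and how many bits are dirty

though whether that shows anything useful will depend on timing.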

Are you able to run your test on a slightly older kernel to see how long
the bug has been around?
A full 'git bisect' would be wonderful, but also a lot of work and I don't
really expect it.  Any extra data point would help though.
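If you do attempt it, the usual shape is something like this (the "good"
kernel below is only a guess at a starting point):

     git bisect start
     git bisect bad v3.14
     git bisect good v3.12      # assumption: the newest kernel that survives the loop
     # build and boot each kernel bisect picks, run the fail/re-add loop long
     # enough to be confident, then mark the result:
     git bisect good            # or: git bisect bad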

Maybe I'll see if I can reproduce it myself....

NeilBrown

