--- On Sun, 24/1/10, Goswin von Brederlow <goswin-v-b@xxxxxx> wrote:

> From: Goswin von Brederlow <goswin-v-b@xxxxxx>
> Subject: Re: Fw: Why does one get mismatches?
> To: Jon@xxxxxxxxxxxxxxx
> Cc: "Goswin von Brederlow" <goswin-v-b@xxxxxx>, linux-raid@xxxxxxxxxxxxxxx
> Date: Sunday, 24 January, 2010, 23:13
>
> Jon Hardcastle <jd_hardcastle@xxxxxxxxx> writes:
>
> > --- On Fri, 22/1/10, Goswin von Brederlow <goswin-v-b@xxxxxx> wrote:
> >
> >> From: Goswin von Brederlow <goswin-v-b@xxxxxx>
> >> Subject: Re: Fw: Why does one get mismatches?
> >> To: Jon@xxxxxxxxxxxxxxx
> >> Cc: linux-raid@xxxxxxxxxxxxxxx
> >> Date: Friday, 22 January, 2010, 18:13
> >>
> >> Jon Hardcastle <jd_hardcastle@xxxxxxxxx> writes:
> >>
> >> > --- On Tue, 19/1/10, Jon Hardcastle <jd_hardcastle@xxxxxxxxx> wrote:
> >> >
> >> >> From: Jon Hardcastle <jd_hardcastle@xxxxxxxxx>
> >> >> Subject: Why does one get mismatches?
> >> >> To: linux-raid@xxxxxxxxxxxxxxx
> >> >> Date: Tuesday, 19 January, 2010, 10:04
> >> >>
> >> >> Hi,
> >> >>
> >> >> I kicked off a check/repair cycle on my machine after I
> >> >> moved the physical ordering of my drives around and I am now
> >> >> on my second check/repair cycle and it has kept finding
> >> >> mismatches.
> >> >>
> >> >> Is it correct that the mismatch value after a repair was
> >> >> needed should equal the value present after a check? What if
> >> >> it doesn't? What does it mean if another check STILL reveals
> >> >> mismatches?
> >> >>
> >> >> I had something similar after I reshaped from raid 5 to 6; I
> >> >> had to run check/repair/check/repair several times before I
> >> >> got my 0.
> >> >
> >> > Guys,
> >> >
> >> > Anyone got any suggestions here? I am now on my ~5th
> >> > check/repair and after a reboot the first check is still
> >> > returning 8.
> >> >
> >> > All I have done is move the drives around.
> >> > It is the same controllers/cables/etc.
> >> >
> >> > I really don't like the seemingly random nature of what
> >> > can/does/has caused the mismatches.
> >>
> >> There is some unknown corruption going on with raid1 that causes
> >> mismatches but it is believed that it will never occur on any used
> >> block. Swapping is a likely cause.
> >>
> >> Any swap device on the raid? Try turning that off.
> >> If that doesn't help try umounting filesystems or
> >> remounting RO.
> >>
> >> MfG
> >> Goswin
> >
> > Hello, my usual savior Goswin!
> >
> > The deal is it is a 7 drive raid 6 array. It has LVM on it and is
> > not used for swapping. I have umounted all LVs and still got
> > mismatches. I ran smartctl --test=long on all drives - nothing. I
> > have now dismantled the array and am 3/4 of the way through
> > 'badblocks -svn' on each of the component drives. I have a hunch
> > that it may be a dodgy SATA cable but have no evidence. No errors
> > in the log, nothing in dmesg.
> >
> > Is there any way to get more information? I am starting to think
> > this has happened more since I changed from raid 5 to 6, which I
> > did < 1 month ago.
> >
> > The only lead I have is that whilst doing the badblocks run 1 drive
> > ran at ~10-15MB/s whereas the rest are going at ~30. I have another
> > identical model drive coming up, so I will see if that one is slow
> > too. But the lack of logging info is not helpful and worrying, and
> > the prospect of silent corruption is a big worry!
>
> You did run a repair pass and not just repeated check passes, right?
> Check itself only counts the mismatches but does not correct them.
> If the raid is unused (vgchange -a n) and you do first repair and then
> check then that definitely should not find any mismatches.
>
> MfG
> Goswin

Hello!

Yes, I have a simple script that first does a check, then if there are mismatches it does a repair. I have then been manually rerunning a check and I keep getting mismatches.
It goes like this: 232, 8, 24, 8, 8, 16, 16, 24, 24, 8, 16, 24. But I have also done this manually and run several repairs in a row (assuming that will return 0 if no work is to be done).

Now the array is completely dismantled and I am running badblocks on the drives, but I am on the last 2 of the 7 drives and I still have no leads. No bad blocks, no offline uncorrectable, no pending sectors, no dmesg errors, no nothing. I have absolutely no leads whatsoever.

The only thing I have left to try is a full memory test and to disconnect and reseat the additional SATA controllers, oh, and buy 7 new SATA cables in case 1 is bad. But it would be REALLY helpful to know on what drive the mismatches have occurred.

Any help here would be gratefully received! I might even try converting the array back to raid 5, as I remember I had mismatches immediately after I converted from 5 to 6.
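
For anyone wanting to reproduce the cycle, the check-then-repair script mentioned above could look roughly like the sketch below. This is not the actual script, only a minimal illustration. It assumes the md sysfs attributes sync_action and mismatch_cnt, assumes that sync_action reads "idle" once a pass has finished, and uses function names (run_pass, check_then_repair) that are made up for this example.

```python
import time
from pathlib import Path
from typing import Callable, List, Tuple


def read_attr(md_dir: Path, name: str) -> str:
    """Read one md sysfs attribute and strip the trailing newline."""
    return (md_dir / name).read_text().strip()


def run_pass(md_dir: Path, action: str, poll_seconds: float = 30.0) -> int:
    """Start a 'check' or 'repair' pass, wait for it, return mismatch_cnt.

    Assumption: sync_action reports the running pass as soon as it has been
    requested and goes back to 'idle' when the pass is done.
    """
    if action not in ("check", "repair"):
        raise ValueError("action must be 'check' or 'repair'")
    (md_dir / "sync_action").write_text(action + "\n")
    while read_attr(md_dir, "sync_action") != "idle":
        time.sleep(poll_seconds)
    return int(read_attr(md_dir, "mismatch_cnt"))


def check_then_repair(
    md_dir: Path,
    run: Callable[[Path, str], int] = run_pass,
) -> List[Tuple[str, int]]:
    """Check first; only if mismatches are found, repair and check again.

    Returns the (action, mismatch_cnt) pairs in the order they were run.
    """
    history = [("check", run(md_dir, "check"))]
    if history[0][1] > 0:
        history.append(("repair", run(md_dir, "repair")))
        history.append(("check", run(md_dir, "check")))
    return history
```

It would be called as root with something like check_then_repair(Path("/sys/block/md0/md")), where md0 is only a placeholder for the array name. The run parameter exists so the sequencing can be exercised without touching a real array.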