Re: Massive RAID-1 desync

On Fri, 24 Apr 2015 22:47:49 +0200 (CEST) cau2jeaf1honoq@xxxxxxxxxxx wrote:

> Something is happening here. I don't know what, but I'm having
> fun trying to guess.
> 
> The root file system (ext3) is on a 4 x 30 GB RAID-1 array. A
> couple hours after boot, the kernel detected something wrong in
> the file system and decided to remount it read-only.
> 
> Comparing the component partitions finds many differences with a
> very uneven distribution:
> 
> - sda1 and sdb1 are identical except for 4 bytes in the last
>   70 kB,

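Perfectly normal.  The md metadata (the superblock) is at the end of each
device: it starts at least 64K from the end, on a 64K-aligned offset.  Each
device's copy records details specific to that device, so a few bytes differ.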

> 
> - sdd1 is identical to sda1 and sdb1 except for about 67,000
>   differences in the last 70 kB.

Following the metadata is between 60K and 124K of unused space.  Nothing
keeps that space in sync, so it could easily be completely different on
different devices.
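
For reference, the arithmetic behind those numbers can be sketched as follows.
This is only an illustration and assumes the 0.90 superblock format (which is
what "at the end, 64K aligned" describes) with a 4K superblock; the function
names are made up for the example.

```python
MD_RESERVED_BYTES = 64 * 1024  # reserved region at the end (0.90 format, assumed)
MD_SB_BYTES = 4096             # size of the superblock itself (0.90 format, assumed)

def sb_offset(device_bytes: int) -> int:
    """Byte offset of the superblock: round the device size down to a
    multiple of 64K, then step back one more 64K."""
    return (device_bytes & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES

def tail_layout(device_bytes: int):
    """Return (superblock offset, distance from superblock start to the end
    of the device, unused bytes after the superblock)."""
    off = sb_offset(device_bytes)
    from_end = device_bytes - off
    unused_after = from_end - MD_SB_BYTES
    return off, from_end, unused_after
```

If the partition size is an exact multiple of 64K, the superblock starts 64K
from the end and 60K of unused space follows it.  If the size is one byte
short of the next multiple, the unused space is just under 124K.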


> 
> - sdc1 is grossly out of sync with about 300 million differences
>   with the others, all of them in the first 450 MB or so.

sdc1 is sick.
Maybe it has hardware problems.  Maybe some hacker broke into your machine
and wrote garbage to it.  Or maybe you triggered a bug that no one else has
ever come across (unlikely, but possible).
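
If you want to see exactly where sdc1 diverges, something like the sketch
below would count differing bytes per region.  It is only an illustration:
the function name and the 1 MB bucket size are arbitrary choices for the
example, and it should only ever be pointed at devices or images opened
read-only.

```python
def diff_histogram(path_a, path_b, bucket=1024 * 1024):
    """Compare two files (or block devices) bucket by bucket.

    Returns {bucket_index: number_of_differing_bytes}; buckets with no
    differences are omitted.
    """
    counts = {}
    index = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(bucket)
            b = fb.read(bucket)
            if not a and not b:
                break  # both inputs exhausted
            if a != b:
                # Count mismatched positions; a length mismatch at the
                # tail counts as differing bytes too.
                n = sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))
                counts[index] = n
            index += 1
    return counts
```

Calling diff_histogram("/dev/sda1", "/dev/sdc1") would then show whether the
differences really are confined to the first 450 MB or so, and whether they
are contiguous or scattered.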

> 
> I'm not sure what to make of this. The knee-jerk thought would
> be "/dev/sdc1 is the odd man out so sdc must be faulty". But
> that disk participates in other arrays without problems, I don't
> see anything obviously bad in its SMART data and the kernel
> messages just before the remount were actually about sda.

And what were those messages about sda?

> 
> To be honest, I don't have a clear idea of how things got where 
> they are. Since writing to a RAID-1 array writes the same data
> to all devices, how can you have so many differences?

Cosmic rays?  EMP?

I actually think that the most likely explanation is that someone was
careless and wrote something to sdc that they didn't mean to.  But I'm
probably wrong.  I like guessing too.

NeilBrown

