Re: mismatch_cnt again

Piergiorgio Sartor wrote:
Hi,

But unless your drive firmware is broken, the drive will only ever give
the correct data or an error. SMART has a counter for blocks that have
gone bad and will be fixed pending a write to them:
Current_Pending_Sector.
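
As an illustration of checking that counter, here is a minimal sketch that pulls the raw Current_Pending_Sector value out of an attribute table in the column layout printed by `smartctl -A`. The function name `pending_sectors` and the sample text are made up for this example, and the column layout is assumed to be the usual ten-column one; other drives or tool versions may format it differently.

```python
def pending_sectors(smartctl_output: str):
    """Return the raw Current_Pending_Sector value, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Assumed row layout:
        # ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == "Current_Pending_Sector":
            return int(fields[9])
    return None


# Hypothetical sample text, not taken from a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
"""

print(pending_sectors(SAMPLE))
```

A value of zero here only says the drive knows of no unreadable sectors; as the rest of the thread argues, it says nothing about mismatches that originate elsewhere.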

The only way the drive should be able to give you bad data is if
multiple bits toggle in such a way that the ECC still fits.

Not really. I have disks which are *perfect* in the SMART sense,
and nevertheless I had a mismatch count.
This was a SW problem, I think now fixed, in the RAID-10 code.

IIRC there is still an error in the raid-1 code, in that data is written to multiple drives without preventing modification of the memory between writes. As I understand Neil's explanation, this happens (a) when memory is being changed rapidly and frequently via memory-mapped files, or (b) when writing via O_DIRECT, or (c) when raid-1 is being used for swap. I'm not totally sure why the last one, but I have always seen mismatches on swap in a system which is actually swapping. What is more troubling is that if I do a hibernate, which writes to swap, and then force a boot from other media to a Live-CD, a check of the swap array occasionally shows a mismatch. That doesn't give me a secure feeling. I have never had an issue in practice, though; I was just curious.
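
The effect described above can be shown with a toy model. This is not md code, only a simulation under the assumption that the same memory buffer is copied to each mirror in turn and that its owner may change it in between; `mirrored_write`, `count_mismatched_sectors` and `scribble` are names invented for the sketch.

```python
def mirrored_write(buffer: bytearray, mutate_between=None):
    """Copy `buffer` to two simulated mirrors, one after the other."""
    mirror_a = bytes(buffer)        # first member sees the buffer as it is now
    if mutate_between is not None:
        mutate_between(buffer)      # owner changes the page while the write is in flight
    mirror_b = bytes(buffer)        # second member sees the later state
    return mirror_a, mirror_b


def count_mismatched_sectors(a: bytes, b: bytes, sector: int = 512) -> int:
    """Count sector-sized slices that differ between the two copies."""
    return sum(a[i:i + sector] != b[i:i + sector] for i in range(0, len(a), sector))


def scribble(buf: bytearray) -> None:
    buf[1000] = 0xFF                # a single byte changes between the two writes


page = bytearray(4096)
a, b = mirrored_write(page, scribble)
print(count_mismatched_sectors(a, b))
```

Neither simulated write fails, and both copies are "valid" snapshots of the page, which is why no error is reported at any layer even though the mirrors now differ.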

This means that, yes, there could be mismatches, without
any warning, from sources other than the disks.
And these could be anywhere in the system.
I already mentioned, some time ago, a cabling problem which
led to a similar result: wrong data on different disks,
without any warning or error from the HW layer.

That is why it is important to know *where* the mismatch
occurs and, if possible, in which component device.
If it is in an empty part of the FS, no problem; if it
belongs to a specific file, then it would be possible
to restore/recreate it.
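
To make the "where" concrete, here is a rough sketch that compares two mirror copies chunk by chunk and reports the byte offsets that differ. It assumes both copies are readable as plain binary streams covering the same data area and that nothing is writing to them during the comparison; real md members also carry metadata and a data offset, which this sketch ignores. `mismatched_offsets` is a hypothetical helper, and the example uses in-memory buffers instead of devices.

```python
import io


def mismatched_offsets(dev_a, dev_b, chunk: int = 4096):
    """Return the starting byte offsets of chunks that differ between two streams."""
    offsets = []
    pos = 0
    while True:
        a = dev_a.read(chunk)
        b = dev_b.read(chunk)
        if not a and not b:         # both streams exhausted
            break
        if a != b:
            offsets.append(pos)
        pos += chunk
    return offsets


# Two in-memory "members" that differ in one byte.
copy_a = bytes(16384)
copy_b = bytearray(copy_a)
copy_b[5000] = 1

print(mismatched_offsets(io.BytesIO(copy_a), io.BytesIO(bytes(copy_b))))
```

The offsets are relative to the start of the streams being compared, so they would still need translating to an offset in the array and then to a filesystem block before they can be matched to a file.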

Of course, a tool will be needed telling which file is
using a certain block of the device.

There are tools which claim to do that, and others which list the blocks used by a given file, which is not nearly as useful, but easier to do.
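
For what it's worth, the easier direction can be inverted if you are willing to collect the block lists for every file first. The sketch below assumes you already have, per file, a list of (start_block, length) extents from some such tool; the paths and numbers are invented, extents are assumed not to overlap, and `build_block_owner_index` and `owner_of` are hypothetical names.

```python
def build_block_owner_index(extents_by_file):
    """Flatten {path: [(start_block, length), ...]} into sorted (start, end, path) ranges."""
    index = []
    for path, extents in extents_by_file.items():
        for start, length in extents:
            index.append((start, start + length, path))   # end is exclusive
    index.sort()
    return index


def owner_of(index, block):
    """Return the path whose extent contains `block`, or None if no file uses it."""
    for start, end, path in index:
        if start <= block < end:
            return path
    return None


# Hypothetical extent data, for illustration only.
extents = {
    "/data/a.bin": [(100, 8), (300, 4)],
    "/data/b.log": [(200, 16)],
}
index = build_block_owner_index(extents)

print(owner_of(index, 205))   # block inside b.log's extent
print(owner_of(index, 150))   # block not used by any listed file
```

A result of None corresponds to the "empty part of the FS" case mentioned earlier, at least as far as the collected file list is complete.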

--
Bill Davidsen <davidsen@xxxxxxx>
 "We can't solve today's problems by using the same thinking we
  used in creating them." - Einstein
