Re: Fwd: mismatches in raid1, clarification please

On Tue Jun 11, 2013 at 04:39:56PM +0100, Andrew Brooks wrote:

> On 11 June 2013 16:17, Robin Hill <robin@xxxxxxxxxxxxxxx> wrote:
> >> If there has been no read error then how does it know which block is good?
> >>
> > It doesn't. I'm not sure what the algorithm is to pick which disk to use
> > as the reference value
> 
> I wonder if someone on this list will know or whether I should ask on
> the kernel list.
> 
Neil will know, and others may well be able to make more sense of the
kernel code than I can.

> > but one disk is picked and that data is then written to the other mirrors.
> > It's not ideal
> 
> No kidding, it's potentially catastrophic ;-)
> 
> The manual says "It is conceivable for a similar situation to occur on
> non-swap files, though it is less likely."
> but doesn't elaborate on the possible causes.
> 
My understanding is that the in-memory data is not locked, so it can be
modified in between the writes to the separate devices. You therefore get
different data written to each device. Normally the new data will then be
rewritten to any devices which were written before the change but, if the
data is deleted first, this doesn't happen. This is pretty common for swap
data, and other applications could also manage data/memory in the same
way and trigger the same situation. It's not important though, as the
data left on disk is no longer referenced within the filesystem.
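
As a rough illustration of that sequence, here is a toy model in Python.
It is only a sketch of my understanding, not the md code: the mirrors are
plain dicts, and the names mirrored_write and count_mismatches are made up
for the example.

```python
def mirrored_write(buffer, mirrors, block, mutate_after=None):
    """Write an unlocked in-memory buffer to each mirror in turn.

    If mutate_after is the index of a mirror, the buffer is modified
    right after that mirror has been written, which models the owner
    of the page changing it while the write is still in flight.
    """
    for i, mirror in enumerate(mirrors):
        mirror[block] = bytes(buffer)   # snapshot of the buffer as it is *now*
        if i == mutate_after:
            buffer[0] ^= 0xFF           # page changes between the two writes


def count_mismatches(mirrors):
    """Count blocks whose contents are not identical on every mirror."""
    blocks = set().union(*(m.keys() for m in mirrors))
    return sum(1 for b in blocks if len({m.get(b) for m in mirrors}) > 1)


mirrors = [{}, {}]
page = bytearray(b"swap page")

# The page is modified after the first mirror was written: copies differ.
mirrored_write(page, mirrors, block=7, mutate_after=0)
print(count_mismatches(mirrors))

# The normal case: the changed page is written again and the copies agree.
# If the page had been freed instead, this second write would never happen.
mirrored_write(page, mirrors, block=7)
print(count_mismatches(mirrors))
```

The mismatch only survives when the second write is skipped, which is the
case where nothing refers to that block any more.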

> Also "Thus the mismatch_cnt value can not be interpreted very reliably on RAID1"
> but if you see it on a non-swap partition then surely it's a very
> serious problem?
> 
It shouldn't be, no. If the data was written correctly to disk then the
mismatches should be irrelevant. If the data was mangled between the CPU
and the disk surface then it's a different matter. There's no way for
the md subsystem to know what the hardware layout is, or which copy of
the data is more likely to be correct, so the repair process merely
ensures that the array is in sync. The added complexity needed to deal
with what should be exceedingly rare circumstances doesn't seem to be
considered worth it.
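
To make the "merely ensures that the array is in sync" point concrete,
here is a small Python sketch. It is a toy model, not the kernel's
algorithm: as I said, I don't know how md picks the reference disk, so
the reference index here is purely an assumption, and the repair function
is a made-up name.

```python
def repair(mirrors, reference=0):
    """Copy every block of the reference mirror over the other mirrors.

    Assumes all mirrors hold the same set of block numbers. Returns the
    number of blocks that differed. Nothing here checks which copy was
    actually correct.
    """
    ref = mirrors[reference]
    fixed = 0
    for block, data in ref.items():
        others = [m for i, m in enumerate(mirrors) if i != reference]
        if any(m.get(block) != data for m in others):
            fixed += 1
        for m in others:
            m[block] = data             # overwrite, whatever was there
    return fixed


# Mirror 0 holds a block that was mangled on its way to disk,
# mirror 1 holds the data as it was meant to be written.
mirrors = [{0: b"mangled!!"}, {0: b"good data"}]
fixed = repair(mirrors, reference=0)

print(fixed)                            # blocks that were out of sync
print(mirrors[0] == mirrors[1])         # the array is in sync afterwards
print(mirrors[1][0])                    # but the good copy was overwritten
```

After the repair the copies agree, so a later check would report no
mismatches, but whether the surviving data is the right data depends
entirely on which copy happened to be the reference.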

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@xxxxxxxxxxxxxxx> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |
