Re: Extremely High mismatch_cnt on RAID1 system

On Oct 7, 2014, at 8:14 AM, Ethan Wilson <ethan.wilson@xxxxxxxxxxxxx> wrote:

> On 04/10/2014 15:46, Dennis Grant wrote:
>> Hello all.
>> 
>> ...
>> 
>> Even after multiple checks, repairs, and rebuilds, the arrays on the
>> bigger drives (/ and /home) are showing insanely high mismatch_cnt
>> values. This has me concerned.
>> 
> 
> Dennis,
> since nobody more knowledgeable replied, I will try.
> 
> Some mismatches on raid1 have always been there, and nobody has ever investigated in depth what causes them, or whether they occur in unallocated filesystem space or in real live data. It seems that when LVM sits between raid1 and the filesystem they no longer happen, but again nobody is really sure why.
> 
> Recently some changes in the raid1 resync algorithm introduced some bugs that could possibly generate additional mismatches, but if you haven't had resyncs then I am not so sure if such bugs and their fixes are relevant. However the fixes are here:
> https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.14.20
> search for "raid".
> 
> You might want to upgrade to kernel 3.14.20, which is probably not what your Ubuntu LTS has currently, then repair the arrays, then see if they grow again.
> Note that you need to do repair and not check:
> echo repair > /sys/block/md0/md/sync_action
> At the next "check", mismatch_cnt should be 0. It will not be 0 right after the "repair" itself, because at that point it counts the mismatches that were repaired.
> 
> I'd say that mismatches in general are pretty worrisome: they shouldn't happen, and they are likely to indicate corruption. So if what I said doesn't work, e.g. the mismatches grow again, report it on the list again and somebody might be able to help track the problem down.
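
The repair-then-check sequence quoted above could be scripted roughly as follows. This is only a sketch: the function names (md_sync_and_wait, repair_then_check) and the polling interval are made up for illustration, md0 is just the example device from the quoted command, and it assumes that sync_action reads back "idle" once the requested operation has finished.

```shell
#!/bin/sh
# Sketch only: helper names and the polling interval are illustrative.

# Write an action ("repair" or "check") to sync_action, then poll until
# the array reports "idle" again.
# $1 = sysfs md directory, e.g. /sys/block/md0/md
# $2 = action to request
# $3 = seconds between polls (default 10)
md_sync_and_wait() {
    md_dir=$1
    action=$2
    interval=${3:-10}
    echo "$action" > "$md_dir/sync_action" || return 1
    while [ "$(cat "$md_dir/sync_action")" != "idle" ]; do
        sleep "$interval"
    done
}

# Repair first, then run a separate check, and print the mismatch_cnt
# left behind by that check (expected to be 0 on a healthy array).
repair_then_check() {
    md_dir=$1
    interval=${2:-10}
    md_sync_and_wait "$md_dir" repair "$interval" || return 1
    md_sync_and_wait "$md_dir" check "$interval" || return 1
    cat "$md_dir/mismatch_cnt"
}
```

As root, something like `repair_then_check /sys/block/md0/md` would then print the count from the final check rather than the number of mismatches that the repair fixed.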

The mismatch count can be incremented during operations other than check and repair, and I believe its behavior also varies between RAID personalities.  However, if you read 'last_sync_action' and see that the last operation was a "check", you are probably safe to assume that mismatch_cnt holds the number of mismatches found by that check.
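
As a rough illustration of that rule, a small shell helper could refuse to report the count unless the last operation was a check. The function name trusted_mismatch_cnt and its return codes are invented for this sketch, and it assumes a kernel new enough to expose last_sync_action next to mismatch_cnt in the array's sysfs md directory.

```shell
#!/bin/sh
# Sketch only: function name and return codes are illustrative.
# Print mismatch_cnt only when the last sync operation was a "check".
# $1 = sysfs md directory, e.g. /sys/block/md0/md (md0 is an example)
trusted_mismatch_cnt() {
    md_dir=$1
    if [ ! -r "$md_dir/last_sync_action" ]; then
        # Older kernels do not provide this attribute at all.
        echo "unknown: no last_sync_action in $md_dir" >&2
        return 2
    fi
    action=$(cat "$md_dir/last_sync_action")
    count=$(cat "$md_dir/mismatch_cnt")
    if [ "$action" = "check" ]; then
        echo "$count"
        return 0
    fi
    # After a repair, resync, etc. the number means something different.
    echo "ambiguous: last_sync_action=$action, mismatch_cnt=$count" >&2
    return 1
}
```

For example, `trusted_mismatch_cnt /sys/block/md0/md` prints the count on stdout after a check, and otherwise prints a warning on stderr and returns non-zero.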

Note the following commit:
commit c4a39551451666229b4ea5e8aae8ca0131d00665
Author: Jonathan Brassow <jbrassow@xxxxxxxxxx>
Date:   Tue Jun 25 01:23:59 2013 -0500

    MD: Remember the last sync operation that was performed

    MD:  Remember the last sync operation that was performed

    This patch adds a field to the mddev structure to track the last
    sync operation that was performed.  This is especially useful when
    it comes to what is recorded in mismatch_cnt in sysfs.  If the
    last operation was "data-check", then it reports the number of
    descrepancies found by the user-initiated check.  If it was a
    "repair" operation, then it is reporting the number of
    descrepancies repaired.  etc.

    Signed-off-by: Jonathan Brassow <jbrassow@xxxxxxxxxx>
    Signed-off-by: NeilBrown <neilb@xxxxxxx>

Relatedly, LVM makes use of the MD RAID personalities to provide its RAID capabilities.  It does this by accessing MD through a thin device-mapper target called "dm-raid" (not to be confused with the similarly named userspace application).  The commit mentioned above also changes the dm-raid module so that it reports '0' mismatches unless the 'last_sync_action' was a "check".  So, for dm-raid (and by extension LVM) the ambiguity in mismatch_cnt is gone, but the user must still be careful when interpreting the number for MD.

 brassow