Re: raid1 issue after disk failure: both disks of the array are still active

On Sun Sep 16, 2012 at 12:06:48 +0200, Niccolò Belli wrote:

> On 15/09/2012 21:41, Robin Hill wrote:
> > If md hasn't failed the drive then either:
> >    - md didn't get a read error
> >    - md got a success message when re-writing the block
> >    - there's a bug in md and it's not handled the error at all
> 
> It seems to be case one. While manually verifying the checksums with
> 
> for i in $(seq 50); do
>   dd if=/dev/sda1 of=sda${i} bs=100000 count=50 skip=$((($i-1)*50+10)) > /dev/null 2> /dev/null
>   dd if=/dev/sdb1 of=sdb${i} bs=100000 count=50 skip=$((($i-1)*50+10)) > /dev/null 2> /dev/null
>   md5sum sda${i}; md5sum sdb${i}; echo
> done
> 
> I get this in syslog:
> 
> Sep 15 23:50:09 asterisk kernel: [273828.407914] scsi_verify_blk_ioctl: 30 callbacks suppressed
> Sep 15 23:50:09 asterisk kernel: [273828.407920] dd: sending ioctl 80306d02 to a partition!
> Sep 15 23:50:09 asterisk kernel: [273828.407925] dd: sending ioctl 80306d02 to a partition!
> Sep 15 23:50:10 asterisk kernel: [273829.422247] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> Sep 15 23:50:10 asterisk kernel: [273829.424071] ata3.00: BMDMA stat 0x44
> Sep 15 23:50:10 asterisk kernel: [273829.425855] ata3.00: failed command: READ DMA
> Sep 15 23:50:10 asterisk kernel: [273829.427625] ata3.00: cmd c8/00:00:68:17:00/00:00:00:00:00/e0 tag 0 dma 131072 in
> Sep 15 23:50:10 asterisk kernel: [273829.427627]          res 51/40:00:90:17:00/40:00:00:00:00/e0 Emask 0x9 (media error)
> Sep 15 23:50:10 asterisk kernel: [273829.431184] ata3.00: status: { DRDY ERR }
> Sep 15 23:50:10 asterisk kernel: [273829.432992] ata3.00: error: { UNC }
> Sep 15 23:50:11 asterisk kernel: [273830.404203] ata3.00: configured for UDMA/133
> Sep 15 23:50:11 asterisk kernel: [273830.404217] ata3: EH complete
> 
> 
> 
> but this is the output of the command:
> 
> 
> b7d4e3c3bb461a1aa6619c22ef11d072  sda1
> b7d4e3c3bb461a1aa6619c22ef11d072  sdb1
>
<- snip sets of identical checksums ->
>
> 94f883b45084b72cd9269a4821b2d509  sda50
> 94f883b45084b72cd9269a4821b2d509  sdb50
> 
Okay, so it looks like the drive is managing to return the correct data
eventually (or it's returning some default value which has also been
written to the other mirror now).
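
For repeat checks, the same comparison can be done without leaving a
hundred chunk files behind. The sketch below is only an illustration:
compare_chunks is a hypothetical helper, not something from the original
post. It keeps the 100000-byte block size and 50-block chunks of the
quoted loop and assumes dd, md5sum and cut are available.

```shell
# compare_chunks DEV_A DEV_B [CHUNKS] [SKIP_BLOCKS]
# Hypothetical helper: hashes matching 50-block chunks (bs=100000) of two
# devices or files and prints one "match" or "MISMATCH" line per chunk.
compare_chunks() {
    dev_a=$1
    dev_b=$2
    chunks=${3:-50}     # number of chunks to compare
    skip0=${4:-10}      # initial offset in 100000-byte blocks
    i=1
    while [ "$i" -le "$chunks" ]; do
        skip=$(( (i - 1) * 50 + skip0 ))
        # Stream each chunk straight into md5sum instead of a temp file.
        sum_a=$(dd if="$dev_a" bs=100000 count=50 skip="$skip" 2>/dev/null | md5sum | cut -d' ' -f1)
        sum_b=$(dd if="$dev_b" bs=100000 count=50 skip="$skip" 2>/dev/null | md5sum | cut -d' ' -f1)
        if [ "$sum_a" = "$sum_b" ]; then
            echo "chunk $i: match $sum_a"
        else
            echo "chunk $i: MISMATCH $sum_a $sum_b"
        fi
        i=$((i + 1))
    done
}
```

For example, `compare_chunks /dev/sda1 /dev/sdb1 50 10` would cover the
same region as the quoted loop.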

> *BUT* if I start reading from the start of the partition (+0 instead of
> +10 in skip=) I get a mismatch, on both md0 and md1 (which is supposed
> to be OK)!
> 
> root@asterisk:~# i=1; dd if=/dev/sda1 of=sda${i} bs=100000 count=50 skip=$((($i-1)*50+0)) > /dev/null 2> /dev/null; dd if=/dev/sdb1 of=sdb${i} bs=100000 count=50 skip=$((($i-1)*50+0)) > /dev/null 2> /dev/null; md5sum sda${i}; md5sum sdb${i}
> 9f9f11ffeb0aed0abc8097417b293f41  sda1
> 394efde218ad700774bfcb3c43255529  sdb1
> root@asterisk:~# i=1; dd if=/dev/sda2 of=sda${i} bs=100000 count=50 skip=$((($i-1)*50+0)) > /dev/null 2> /dev/null; dd if=/dev/sdb2 of=sdb${i} bs=100000 count=50 skip=$((($i-1)*50+0)) > /dev/null 2> /dev/null; md5sum sda${i}; md5sum sdb${i}
> 8cb0b6fa2bf7f0f88a2a2a91598429d4  sda1
> 732c42e14b8e78930d08cdb4f1c49a40  sdb1
> 
> Shouldn't raid1 match even at the very beginning of the partition?
> 
No. With the 1.1 and 1.2 metadata formats the md superblock sits near
the start of the partition (at offset 0 for 1.1, 4 KiB in for 1.2), and
it is slightly different on the two devices. Only the data area that
follows it is mirrored.
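
To compare only the mirrored data, one option is to skip past the
metadata using the offset that mdadm reports. A minimal sketch, assuming
1.x metadata (where `mdadm --examine` prints a "Data Offset" line in
512-byte sectors); data_offset_sectors is a hypothetical helper name, and
the device names are the ones from this thread.

```shell
# data_offset_sectors: reads `mdadm --examine` output on stdin and prints
# the "Data Offset" value in 512-byte sectors. Prints nothing if the line
# is absent (for example with 0.90 metadata).
data_offset_sectors() {
    awk -F: '/Data Offset/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Possible usage (needs root and real array members, so left commented out):
#   off_a=$(mdadm --examine /dev/sda1 | data_offset_sectors)
#   off_b=$(mdadm --examine /dev/sdb1 | data_offset_sectors)
#   dd if=/dev/sda1 bs=512 skip="$off_a" count=10000 2>/dev/null | md5sum
#   dd if=/dev/sdb1 bs=512 skip="$off_b" count=10000 2>/dev/null | md5sum
```

Each member is read from its own data offset, since the two values are
not guaranteed to be equal.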

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@xxxxxxxxxxxxxxx> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |
