RAID1 scrub ignoring read errors?

Hello,

today I was alerted by mdadm via email that a disk on one of my servers had failed.

On the machine, I see /dev/sda1 as faulty:

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       17        1      active sync   /dev/sdb1

       0       8        1        -      faulty   /dev/sda1
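
(That device table is from "mdadm --detail"; for reference, roughly how I look at the array state, with the md device name taken from the dmesg lines below:)

    # array-level view, the source of the table above
    mdadm --detail /dev/md0
    # the kernel's compact view of all arrays, shows the degraded state too
    cat /proc/mdstat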

and in dmesg:

    ata1.00: configured for UDMA/133
    sd 0:0:0:0: [sda] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    sd 0:0:0:0: [sda] tag#18 Sense Key : Illegal Request [current] [descriptor] 
    sd 0:0:0:0: [sda] tag#18 Add. Sense: Logical block address out of range
    sd 0:0:0:0: [sda] tag#18 CDB: Write(16) 8a 00 00 00 00 00 00 06 40 10 00 00 00 08 00 00
    blk_update_request: I/O error, dev sda, sector 409616
    md: super_written gets error=-5
    md/raid1:md0: Disk failure on sda1, disabling device.
    md/raid1:md0: Operation continuing on 1 devices.

Note this is a Write(16) error.
However, scrolling up in dmesg, I see lots of Read(16) errors for *both* /dev/sda and /dev/sdb:

For sdb, at [7723679.793801]:

    ata3.00: exception Emask 0x0 SAct 0x7c SErr 0x0 action 0x0
    ata3.00: irq_stat 0x40000008
    ata3.00: failed command: READ FPDMA QUEUED
    ata3.00: cmd 60/00:10:00:6e:e4/0a:00:00:00:00/40 tag 2 ncq 1310720 in
             res 41/40:00:30:73:e4/00:00:00:00:00/40 Emask 0x409 (media error) <F>
    ata3.00: status: { DRDY ERR }
    ata3.00: error: { UNC }
    ata3.00: configured for UDMA/133
    sd 2:0:0:0: [sdb] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    sd 2:0:0:0: [sdb] tag#2 Sense Key : Medium Error [current] [descriptor] 
    sd 2:0:0:0: [sdb] tag#2 Add. Sense: Unrecovered read error - auto reallocate failed
    sd 2:0:0:0: [sdb] tag#2 CDB: Read(16) 88 00 00 00 00 00 00 e4 6e 00 00 00 0a 00 00 00
    blk_update_request: I/O error, dev sdb, sector 14971696
    ata3: EH complete

For sda, at [7723688.533758]:

    ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
    ata1.00: irq_stat 0x40000008
    ata1.00: failed command: READ FPDMA QUEUED
    ata1.00: cmd 60/80:18:80:d4:e5/00:00:00:00:00/40 tag 3 ncq 65536 in
             res 41/40:00:b8:d4:e5/00:00:00:00:00/40 Emask 0x409 (media error) <F>
    ata1.00: status: { DRDY ERR }
    ata1.00: error: { UNC }
    ata1.00: configured for UDMA/133
    sd 0:0:0:0: [sda] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    sd 0:0:0:0: [sda] tag#3 Sense Key : Medium Error [current] [descriptor] 
    sd 0:0:0:0: [sda] tag#3 Add. Sense: Unrecovered read error - auto reallocate failed
    sd 0:0:0:0: [sda] tag#3 CDB: Read(16) 88 00 00 00 00 00 00 e5 d4 80 00 00 00 80 00 00
    blk_update_request: I/O error, dev sda, sector 15062200
    ata1: EH complete
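
To double-check which sectors these errors refer to, I decoded the LBAs by hand (a quick sketch; I'm assuming the big-endian LBA layout of Read(16)/Write(16) CDBs and that the ATA "res" line prints the LBA bytes low:mid:high):

    # Write(16) on sda: CDB bytes 2-9 are the LBA, bytes 10-13 the transfer length.
    printf '%d\n' 0x064010    # 409616   -> matches "dev sda, sector 409616"
    printf '%d\n' 0x08        # 8 sectors (4 KiB with 512-byte sectors), presumably the
                              # superblock write that "md: super_written gets error=-5"
                              # refers to

    # Read(16) on sdb: the command starts at LBA 0xe46e00 (sector 14970368), but the
    # failing sector is in its "res" line: 30:73:e4 -> LBA 0xe47330.
    printf '%d\n' 0xe47330    # 14971696 -> matches "dev sdb, sector 14971696"

    # Read(16) on sda: failing sector from its "res" line: b8:d4:e5 -> LBA 0xe5d4b8.
    printf '%d\n' 0xe5d4b8    # 15062200 -> matches "dev sda, sector 15062200"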

Why is it that only sda1 is marked as faulty when both sda and sdb had unrecovered read errors earlier?
Does md consider only write failures real failures?
How does the logic work?

Also note that there are multiple scrubs in the dmesg ("md: data-check of RAID array md0").
The first three encountered read errors but nevertheless finished with "md: md0: data-check done.". Only the last scrub, during which the write error occurred, resulted in md treating it as a failure.
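
(In case it matters: as far as I understand the md sysfs interface, these scrubs are the "check" sync_action, and md keeps a per-device counter of read errors it has corrected, which I would expect to reflect the errors above. A sketch, with my device names assumed:)

    # trigger a scrub / data-check by hand
    echo check > /sys/block/md0/md/sync_action

    # read errors md has corrected on each member so far
    cat /sys/block/md0/md/dev-sda1/errors
    cat /sys/block/md0/md/dev-sdb1/errors

    # blocks found to differ between the mirrors during the last check
    cat /sys/block/md0/md/mismatch_cnt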

You can find the full dmesg at https://gist.github.com/nh2/db886f3afbbb4b186aa5088ca2782c06.

This leaves me in the inconvenient situation that both devices of a RAID1 show apparent media errors, yet mdadm only emailed me 91 days (judging from the dmesg timestamps) after the first read failure occurred.
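
(For context, the email comes from mdadm's monitor mode; as far as I can tell it only reacts to array-level events such as Fail or DegradedArray, not to read errors that md corrects internally. My setup is roughly the stock one; the address below is just a placeholder:)

    # /etc/mdadm/mdadm.conf
    MAILADDR root

    # monitor daemon, as started by the distro's mdadm service
    mdadm --monitor --scan --daemonise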

Any insights would be appreciated.

Niklas


