One of the servers I've been setting up, which has an md RAID0 for temporary storage, has just had a disk error.

root@storage2:~# ls -l /disk/scratch/scratch/path/to/file
ls: cannot access /disk/scratch/scratch/path/to/file/file.4000.new.1521.rsi: Remote I/O error
ls: cannot access /disk/scratch/scratch/path/to/file/file.4000.new.1522.rsi: Remote I/O error
ls: cannot access /disk/scratch/scratch/path/to/file/file.4000.new.1523.rsi: Remote I/O error
...

dmesg shows:

[ 1232.406491] mpt2sas1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
[ 1232.406497] mpt2sas1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
[ 1232.406512] sd 5:0:0:0: [sdr] Unhandled sense code
[ 1232.406514] sd 5:0:0:0: [sdr] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
[ 1232.406518] sd 5:0:0:0: [sdr] Sense Key : Medium Error [current]
[ 1232.406522] Info fld=0x30000588
[ 1232.406524] sd 5:0:0:0: [sdr] Add. Sense: Unrecovered read error
[ 1232.406528] sd 5:0:0:0: [sdr] CDB: Read(10): 28 00 30 00 05 80 00 00 10 00
[ 1232.406537] end_request: critical target error, dev sdr, sector 805307776

OK, so that's fairly obviously a failed drive. The problem is, how to detect and report this?

At the md RAID level, `cat /proc/mdstat` and `mdadm --detail` show nothing amiss.

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : active raid0 sdk[8] sdf[4] sdb[0] sdj[9] sdc[1] sde[2] sdd[3] sdi[6] sdg[5] sdh[7] sdv[20] sdw[21] sdl[11] sdu[19] sdt[18] sdn[13] sds[17] sdq[14] sdm[10] sdx[22] sdr[16] sdo[12] sdp[15] sdy[23]
      70326362112 blocks super 1.2 512k chunks

unused devices: <none>

root@storage2:~# mdadm --detail /dev/md/scratch
/dev/md/scratch:
        Version : 1.2
  Creation Time : Mon Apr 23 16:53:59 2012
     Raid Level : raid0
     Array Size : 70326362112 (67068.45 GiB 72014.19 GB)
   Raid Devices : 24
  Total Devices : 24
    Persistence : Superblock is persistent

    Update Time : Mon Apr 23 16:53:59 2012
          State : clean
 Active Devices : 24
Working Devices : 24
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

           Name : storage2:scratch  (local to host storage2)
           UUID : e5d2dce6:91d1d3b9:ae08f838:5e12132a
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc
       2       8       64        2      active sync   /dev/sde
       3       8       48        3      active sync   /dev/sdd
       4       8       80        4      active sync   /dev/sdf
       5       8       96        5      active sync   /dev/sdg
       6       8      128        6      active sync   /dev/sdi
       7       8      112        7      active sync   /dev/sdh
       8       8      160        8      active sync   /dev/sdk
       9       8      144        9      active sync   /dev/sdj
      10       8      192       10      active sync   /dev/sdm
      11       8      176       11      active sync   /dev/sdl
      12       8      224       12      active sync   /dev/sdo
      13       8      208       13      active sync   /dev/sdn
      14      65        0       14      active sync   /dev/sdq
      15       8      240       15      active sync   /dev/sdp
      16      65       16       16      active sync   /dev/sdr
      17      65       32       17      active sync   /dev/sds
      18      65       48       18      active sync   /dev/sdt
      19      65       64       19      active sync   /dev/sdu
      20      65       80       20      active sync   /dev/sdv
      21      65       96       21      active sync   /dev/sdw
      22      65      112       22      active sync   /dev/sdx
      23      65      128       23      active sync   /dev/sdy

So first question is this: what does it take for a drive to be marked as "failed" by md RAID? Is there some threshold I can set?

Second question: what's a better way of monitoring this proactively, rather than just waiting for applications to fail and then digging into dmesg?
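(Going back to the dmesg output for a moment, the numbers do at least hang together: the Read(10) CDB 28 00 30 00 05 80 00 00 10 00 asks for 0x10 = 16 sectors starting at LBA 0x30000580, which is the sector 805307776 that end_request reports, and the sense Info field 0x30000588 should be the actual failing LBA within that range. Checking the arithmetic:

$ printf '%d\n' 0x30000580    # start LBA, bytes 2-5 of the Read(10) CDB
805307776
$ printf '%d\n' 0x30000588    # failing LBA, from the sense Info field
805307784

So, as far as I can tell, this is an unreadable sector on sdr at or around LBA 805307784.)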
Recently I installed an excellent set of snmp plugins and MIBs for exposing both md-raid and smartctl information via SNMP, which I got from:

http://www.mad-hacking.net/software/index.xml
http://downloads.mad-hacking.net/software/

Here's the md RAID output (which really is just reformatting of info from mdadm --detail):

root@storage2:~# snmptable -c XXXXXXXX -v 2c storage2 MD-RAID-MIB::mdRaidTable
SNMP table: MD-RAID-MIB::mdRaidTable

mdRaidArrayIndex mdRaidArrayDev mdRaidArrayVersion mdRaidArrayUUID mdRaidArrayLevel mdRaidArrayLayout mdRaidArrayChunkSize mdRaidArraySize mdRaidArrayDeviceSize mdRaidArrayHealthOK mdRaidArrayHasFailedComponents mdRaidArrayHasAvailableSpares mdRaidArrayTotalComponents mdRaidArrayActiveComponents mdRaidArrayWorkingComponents mdRaidArrayFailedComponents mdRaidArraySpareComponents
1 /dev/md/scratch 1.2 e5d2dce6:91d1d3b9:ae08f838:5e12132a raid0 N/A 512K 70326362112 N/A true false false 24 24 24 0 0

And here's the output for SMART (which combines smartctl -i, -H and -A):

root@storage2:~# snmptable -c XXXXXXXX -v 2c storage2 SMARTCTL-MIB::smartCtlTable
SNMP table: SMARTCTL-MIB::smartCtlTable

smartCtlDeviceIndex smartCtlDeviceDev smartCtlDeviceModelFamily smartCtlDeviceDeviceModel smartCtlDeviceSerialNumber smartCtlDeviceUserCapacity smartCtlDeviceATAVersion smartCtlDeviceHealthOK smartCtlDeviceTemperatureCelsius smartCtlDeviceReallocatedSectorCt smartCtlDeviceCurrentPendingSector smartCtlDeviceOfflineUncorrectable smartCtlDeviceUDMACRCErrorCount smartCtlDeviceReadErrorRate smartCtlDeviceSeekErrorRate smartCtlDeviceHardwareECCRecovered
1 /dev/sda ST1000DM003-9YN162 Z1D0BQHF 1,000,204,886,016 bytes [1.00 TB] 8 true 28 0 0 0 0 105 30 ?
2 /dev/sdb ST3000DM001-9YN166 S1F01Z36 3,000,592,982,016 bytes [3.00 TB] 8 true 28 0 0 0 0 105 31 ?
3 /dev/sdc ST3000DM001-9YN166 S1F01932 3,000,592,982,016 bytes [3.00 TB] 8 true 24 0 0 0 0 103 31 ?
4 /dev/sdd ST3000DM001-9YN166 S1F04Y7G 3,000,592,982,016 bytes [3.00 TB] 8 true 26 0 0 0 0 104 31 ?
5 /dev/sde ST3000DM001-9YN166 S1F00KF2 3,000,592,982,016 bytes [3.00 TB] 8 true 25 0 0 0 0 104 31 ?
6 /dev/sdf ST3000DM001-9YN166 S1F01C0D 3,000,592,982,016 bytes [3.00 TB] 8 true 27 0 0 0 0 103 31 ?
7 /dev/sdg ST3000DM001-9YN166 S1F01DFM 3,000,592,982,016 bytes [3.00 TB] 8 true 25 0 0 0 0 104 31 ?
8 /dev/sdh ST3000DM001-9YN166 S1F054EP 3,000,592,982,016 bytes [3.00 TB] 8 true 27 0 0 0 0 105 31 ?
9 /dev/sdi ST3000DM001-9YN166 S1F05304 3,000,592,982,016 bytes [3.00 TB] 8 true 25 0 0 0 0 105 31 ?
10 /dev/sdj ST3000DM001-9YN166 S1F015X5 3,000,592,982,016 bytes [3.00 TB] 8 true 25 0 0 0 0 105 31 ?
11 /dev/sdk ST3000DM001-9YN166 S1F046FB 3,000,592,982,016 bytes [3.00 TB] 8 true 27 0 0 0 0 103 31 ?
12 /dev/sdl ST3000DM001-9YN166 S1F024DW 3,000,592,982,016 bytes [3.00 TB] 8 true 26 0 0 0 0 103 31 ?
13 /dev/sdm ST3000DM001-9YN166 S1F04DKQ 3,000,592,982,016 bytes [3.00 TB] 8 true 25 0 0 0 0 104 31 ?
14 /dev/sdn ST3000DM001-9YN166 S1F014NH 3,000,592,982,016 bytes [3.00 TB] 8 true 25 0 0 0 0 104 31 ?
15 /dev/sdo ST3000DM001-9YN166 S1F049KM 3,000,592,982,016 bytes [3.00 TB] 8 true 26 0 0 0 0 105 31 ?
16 /dev/sdp ST3000DM001-9YN166 S1F01D5A 3,000,592,982,016 bytes [3.00 TB] 8 true 26 0 0 0 0 103 31 ?
17 /dev/sdq ST3000DM001-9YN166 S1F00L20 3,000,592,982,016 bytes [3.00 TB] 8 true 24 0 0 0 0 103 31 ?
18 /dev/sdr ST3000DM001-9YN166 S1F07PN8 3,000,592,982,016 bytes [3.00 TB] 8 true 28 0 8 8 0 81 31 ?
19 /dev/sds ST3000DM001-9YN166 S1F03PS8 3,000,592,982,016 bytes [3.00 TB] 8 true 25 0 0 0 0 104 31 ?
20 /dev/sdt ST3000DM001-9YN166 S1F04SM4 3,000,592,982,016 bytes [3.00 TB] 8 true 25 0 0 0 0 103 31 ?
21 /dev/sdu ST3000DM001-9YN166 S1F00MCQ 3,000,592,982,016 bytes [3.00 TB] 8 true 27 0 0 0 0 105 31 ?
22 /dev/sdv ST3000DM001-9YN166 S1F020YG 3,000,592,982,016 bytes [3.00 TB] 8 true 28 0 0 0 0 104 31 ?
23 /dev/sdw ST3000DM001-9YN166 S1F03NXP 3,000,592,982,016 bytes [3.00 TB] 8 true 26 0 0 0 0 103 31 ?
24 /dev/sdx ST3000DM001-9YN166 S1F054Y7 3,000,592,982,016 bytes [3.00 TB] 8 true 26 0 0 0 0 104 31 ?
25 /dev/sdy ST3000DM001-9YN166 S1F04A0Y 3,000,592,982,016 bytes [3.00 TB] 8 true 27 0 40 40 0 105 31 ?

All drives report smartCtlDeviceHealthOK = true, which derives from the "PASSED" result of the overall-health self-assessment test from smartctl -H:

smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.0.0-16-server] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

The only anomaly I can see here is that sdr is reporting 8 current-pending / offline-uncorrectable sectors - and sdy is reporting 40!

So based on this information, I am going to return sdr and sdy to the manufacturer for replacement. But is there any better way that I can be notified quickly of I/O errors and/or retries, for example counters being maintained in the kernel?

Thanks,
Brian.
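P.S. To make that last question a bit more concrete, here are two rough, untested sketches of the sort of thing I have in mind.

First, instead of eyeballing the whole snmptable output, poll just the interesting SMARTCTL-MIB columns from cron and complain about anything non-zero (this assumes the monitoring host has the same mad-hacking.net MIBs installed; XXXXXXXX is the community string, redacted as above):

#!/bin/bash
# Untested sketch: pair each drive with its pending-sector count from the
# SMARTCTL-MIB table and warn about anything non-zero.
HOST=storage2
COMMUNITY=XXXXXXXX
paste \
    <(snmpwalk -v 2c -c "$COMMUNITY" -Oqv "$HOST" SMARTCTL-MIB::smartCtlDeviceDev) \
    <(snmpwalk -v 2c -c "$COMMUNITY" -Oqv "$HOST" SMARTCTL-MIB::smartCtlDeviceCurrentPendingSector) |
while read -r dev pending; do
    [ "$pending" -gt 0 ] 2>/dev/null &&
        echo "WARNING: $dev has $pending pending sectors"
done

Second, the kind of kernel counter I was asking about: the SCSI layer keeps a per-device ioerr_cnt attribute in sysfs (reported as hex), which could presumably be snapshotted periodically and compared against a baseline:

#!/bin/bash
# Untested sketch: dump each disk's SCSI I/O error counter in decimal.
for f in /sys/block/sd*/device/ioerr_cnt; do
    dev=${f#/sys/block/}; dev=${dev%%/*}
    printf '%s %d\n' "$dev" "$(cat "$f")"    # ioerr_cnt is a hex string like 0x2
done

I don't know offhand whether ioerr_cnt is actually bumped for medium errors like the one above, though - hence the question.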