On Thu Jan 31, 2013 at 10:46:17 -0700, Chris Murphy wrote:

>
> On Jan 31, 2013, at 6:15 AM, Christoph Nelles <evilazrael@xxxxxxxxxxxxx> wrote:
>
> > All drives are available again. And the second failed device reports
> > UREs. I will run badblocks on that device before continuing.
> > I attached the kernel logs of the first error and of the second error. I
> > hope I filtered them reasonably.
>
> This looks like a write error, resulting in md immediately booting the
> drive. There's little point in using this drive again.
>
> Jan 28 00:23:36 router kernel: Write(16): 8a 00 00 00 00 01 36 b2 55 50 00 00 00 30 00 00
> Jan 28 00:23:36 router kernel: end_request: I/O error, dev sdg, sector 5212624208
>
It's definitely a write error, yes. If there's nothing further back in
the log (e.g. a read error that's caused a rewrite to take place) then
this would definitely count against the drive, but it could just be a
transient error (or a controller problem). If there is a read error
further back then I'd blame it on timeout issues, with the drive still
trying to complete the read operation while the kernel has timed out
and is trying to send a write.

> What does smartctl -a return for this drive?
>
> >
> > Exactly. I am running badblocks on that device. SMART reports one
> > "Pending Sector Count" :(
>
> I'm unclear on the efficacy of badblocks for testing. I'd use smartctl
> -t long and then -a to see if there are sector problems and at what
> LBA; and for removing bad blocks (force a remap) I'd use either dd
> zeros with e.g. bs=1M, or I'd use ATA Secure Erase which is faster.
>
I don't usually bother with read tests - as you say, they're not
terribly useful. If the data's useful then just use ddrescue to get what
you can, otherwise just write-test it. I usually do a full destructive
badblocks test (I've found cases where zeros write fine but other
patterns fail), followed by a long SMART test.

> If you use the badblocks map when formatting a drive, e.g. using
> mkfs.ext4 -c, then it would allow you to use this disk but not in
> RAID. On top of raid, md gets the write error before the file system
> does, and boots the drive out of the array. Or on read error attempts
> to correct it. And even as a standalone drive do you really want to
> use a drive that can't remap future bad sectors?
>
Not a chance I'd use it if it's actually failing to remap bad sectors,
no. I've only had that with one drive so far though (out of several
hundred); most get failed out after getting more than a handful of
remapped sectors.

Cheers,
    Robin
-- 
     ___
    ( ' }     |       Robin Hill        <robin@xxxxxxxxxxxxxxx> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |
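
For reference, the Write(16) line in the kernel log quoted above can be decoded by hand to check that the failed command and the reported sector agree. The following is a minimal sketch, not part of the original exchange. It assumes the standard WRITE(16) layout: opcode 0x8a in byte 0, a big-endian LBA in bytes 2-9 and a big-endian transfer length in bytes 10-13. The function name decode_write16 is made up for illustration.

```python
# Decode the SCSI WRITE(16) command descriptor block (CDB) from the log.
# Assumed layout: byte 0 = opcode (0x8a), bytes 2-9 = LBA (big-endian),
# bytes 10-13 = transfer length in logical blocks (big-endian).
CDB_HEX = "8a 00 00 00 00 01 36 b2 55 50 00 00 00 30 00 00"


def decode_write16(cdb_hex):
    """Return (lba, block_count) for a WRITE(16) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is ignored
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")
    block_count = int.from_bytes(cdb[10:14], "big")
    return lba, block_count


lba, block_count = decode_write16(CDB_HEX)
print(lba, block_count)  # 5212624208 48
```

The decoded LBA equals the sector in the end_request line (5212624208), so the I/O error was reported at the first block of this 48-block write.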