Re: faulty disk testing

Tejun Heo <htejun@xxxxxxxxx> · Tue, 05 Sep 2006 16:56:34 +0200

Ric Wheeler wrote:
>> One of the problems is that currently libata EH can take some minutes
>> recovering from an error condition.  With partial request retry from
>> sd, a batch of consecutive bad sectors can make recovery take a
>> really long time.  This needs fixing.
>
> So far, the new-init build has been running the recovery in the lab for
> about 40 minutes ;-)

Ouch.  That's long.  BTW, from the log you posted:

sd 1:0:0:0: SCSI error: return code = 0x08000002
sdb: Current: sense key: Medium Error
    Additional sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sdb, sector 272900
Buffer I/O error on device sdb3, logical block 208640
Buffer I/O error on device sdb3, logical block 208641
Buffer I/O error on device sdb3, logical block 208642
Buffer I/O error on device sdb3, logical block 208643
Buffer I/O error on device sdb3, logical block 208644
Buffer I/O error on device sdb3, logical block 208645
Buffer I/O error on device sdb3, logical block 208646
Buffer I/O error on device sdb3, logical block 208647

This is sd failing the request and the error completion propagating 
through fs/buffer and thus back to its user - probably md.  It's a bit 
weird that md doesn't drop the device at this point.  I think it could 
be that special metadata path thing you mentioned.
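
For reference, the "return code = 0x08000002" line can be split into its
component bytes.  This is only a rough sketch: it assumes the usual SCSI
midlayer packing of the result word (driver byte, host byte, message byte,
status byte, from high to low), and decode_scsi_result() is a made-up helper
name, not a kernel function.

```python
# Assumed layout of the 32-bit SCSI result word (high to low):
#   driver byte | host byte | message byte | status byte
DRIVER_SENSE = 0x08      # driver byte: sense data is available
DID_OK = 0x00            # host byte: no host/transport-level error
CHECK_CONDITION = 0x02   # SAM status byte: device reports an error via sense


def decode_scsi_result(result):
    """Split a result word into its four bytes (hypothetical helper)."""
    return {
        "driver": (result >> 24) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "status": result & 0xFF,
    }


fields = decode_scsi_result(0x08000002)
print(fields)
```

Read that way, the driver byte is DRIVER_SENSE and the status byte is CHECK
CONDITION with a clean host byte, which fits the Medium Error sense data in
the log: the device itself reported the failure, not the transport.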

--
tejun