Jeff Garzik wrote:
Theodore Tso wrote:
Can someone with knowledge of current disk drive behavior confirm that
for all drives that support bad block sparing, if an attempt to write
to a particular spot on disk results in an error due to bad media at
that spot, the disk drive will automatically rewrite the sector to a
sector in its spare pool, and automatically redirect that sector to
the new location? I believe this should always be true, so presumably
with all modern disk drives a write error should mean something very
serious has happened.
This is what will /probably/ happen. The drive should indeed find a
spare sector and remap it, if the write attempt encounters a bad spot on
the media.
However, with a large enough write, a large enough bad spot on the media,
and firmware programmed to never take more than X seconds to complete an
enterprise customer's I/O, the write might just fail.
IMO, somewhere in the kernel, when we receive a read-op or write-op
media error, we should immediately try to plaster that area with small
writes. Sure, if it's a read-op you lost data, but this method will
maximize the chance that you can refresh/reuse the logical sectors in
question.
Jeff
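
A minimal user-space sketch of the "plaster that area with small writes"
idea above. This is not kernel code. ToyDrive, remap_budget, spares and
plaster are hypothetical names, and the rule that the drive fails any
command needing more than remap_budget remaps is an assumption standing in
for the "never take more than X seconds" firmware limit.

```python
import errno

SECTOR = 512  # bytes per logical sector (assumed)


class ToyDrive:
    """Toy model of a drive with a spare pool (hypothetical, for illustration)."""

    def __init__(self, bad, remap_budget=1, spares=16):
        self.bad = set(bad)               # LBAs with bad media, not yet remapped
        self.remap_budget = remap_budget  # max remaps the firmware does per command
        self.spares = spares              # spare sectors left in the pool
        self.data = {}                    # LBA -> sector contents

    def write(self, lba, payload):
        # payload length is assumed to be a multiple of SECTOR
        count = len(payload) // SECTOR
        hit = [s for s in range(lba, lba + count) if s in self.bad]
        # Assumed failure rule: too many remaps for one command, or no spares left.
        if len(hit) > self.remap_budget or len(hit) > self.spares:
            raise OSError(errno.EIO, "media error")
        for s in hit:                     # remap each bad sector to a spare
            self.bad.discard(s)
            self.spares -= 1
        for i in range(count):
            self.data[lba + i] = payload[i * SECTOR:(i + 1) * SECTOR]


def plaster(drive, lba, payload, chunk_sectors=1):
    """Rewrite a failed range as small writes; return the LBAs that still fail."""
    failed = []
    step = chunk_sectors * SECTOR
    for off in range(0, len(payload), step):
        chunk = payload[off:off + step]
        chunk_lba = lba + off // SECTOR
        try:
            drive.write(chunk_lba, chunk)
        except OSError:
            failed.extend(range(chunk_lba, chunk_lba + len(chunk) // SECTOR))
    return failed
```

In this model an 8-sector write over three bad sectors fails as a single
command, while the same range written one sector at a time lets every bad
sector be remapped. After a failed read the original data is gone, so the
payload passed to plaster would have to be filler such as zeros.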
One interesting counterexample is a write smaller than a full page, say 512
bytes out of 4k.
If we need to do a read-modify-write and it just so happens that 1 of the 7
sectors we need to read is flaky, will this "look" like a write failure?
ric
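
A rough sketch of the read-modify-write case above, assuming a 4k page made
of eight 512-byte sectors and a sector-aligned partial write.
sectors_to_read, rmw_write, read_sector and write_page are hypothetical
names, and nothing here reflects how any particular kernel path reports the
error.

```python
SECTOR = 512   # bytes per sector (assumed)
PAGE = 4096    # bytes per page (assumed)


def sectors_to_read(write_offset, write_len, page_size=PAGE, sector=SECTOR):
    """Sector indices in the page that must be read before a partial write."""
    first = write_offset // sector
    last = (write_offset + write_len - 1) // sector
    return [s for s in range(page_size // sector) if not first <= s <= last]


def rmw_write(read_sector, write_page, write_offset, data):
    """Naive read-modify-write of one page.

    read_sector(index) returns SECTOR bytes or raises OSError;
    write_page(page_bytes) writes the whole page back.
    """
    page = bytearray(PAGE)
    for s in sectors_to_read(write_offset, len(data)):
        # A flaky sector raises here, before anything has been written.
        page[s * SECTOR:(s + 1) * SECTOR] = read_sector(s)
    page[write_offset:write_offset + len(data)] = data
    write_page(bytes(page))
```

For a 512-byte write at offset 512, sectors_to_read returns the seven other
sectors of the page. If read_sector raises for any of them, the caller of
rmw_write sees an error from what it issued as a write, even though no write
ever reached the media in this sketch.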