On Mon, Jan 31, 2011 at 2:11 PM, Jim Schutt <jaschut@xxxxxxxxxx> wrote:
> Most drives have some spare sectors that they internally remap over
> bad sectors. Such a bad sector can be "healed" by rewriting that
> LBA, because the drive will internally remap it. You can use
> smartctl on drives that support it to learn how many sectors have
> had read errors, of both the recoverable and unrecoverable variety,
> I think, and also how many spare sectors are remaining.
>
> So one of the things I'd really like Ceph scrub to do when it is
> reading every one of its objects is rewrite the ones on which it
> gets read errors, by fetching one of its other copies. Maybe it
> already does this?

Yeah, we should definitely check the return codes when we do scrub
operations on the filestore. If we get an error, we should fetch that
object from another replica. This is something we're committed to
doing in theory, but I don't think any system tests exist yet that
measure our progress towards getting it right. Handling disk errors is
tough because the normal use cases don't really exercise those paths.

> If Ceph scrub works this way, then another
> thing I really want to do is learn to tell my disks to not try
> so hard to recover a sector, as I know I have at least one other
> copy I can use to repair it, and because that minimizes the time
> that osd is stalled.

Do you set these kinds of timeouts through smartctl or hdparm?

> If such a bad sector is used for btrfs metadata, well, AFAIK
> btrfs duplicates its metadata by default (see mkfs.btrfs -m single).
> Presumably if it gets a metadata read failure it will rewrite the
> offending sectors using its other copy? If not, we should try
> to convince the btrfs devs to do so.

Yeah, definitely.

Colin
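
P.S. To make the scrub idea above concrete, here is a rough sketch of the
behavior being discussed: check the result of each read during scrub, and
on an I/O error rewrite the object from another copy so the drive can remap
the bad sector. This is not Ceph code. The names scrub_object and DictStore
and the read()/write() interface are made up for illustration, and the
in-memory store only simulates a bad sector.

```python
import errno


def scrub_object(name, local, replicas):
    """Read `name` locally; on EIO, rewrite it from the first readable replica.

    Hypothetical helper, not part of Ceph. Returns "ok", "repaired",
    or "unrecoverable".
    """
    try:
        local.read(name)
        return "ok"
    except OSError as e:
        # Only treat real I/O errors as candidates for repair.
        if e.errno != errno.EIO:
            raise
    for replica in replicas:
        try:
            data = replica.read(name)
        except OSError:
            continue  # this copy is unreadable too, try the next one
        # Rewriting the object is what gives the drive a chance to remap.
        local.write(name, data)
        return "repaired"
    return "unrecoverable"


class DictStore:
    """Toy in-memory store; names in `bad` fail with EIO until rewritten."""

    def __init__(self, objects, bad=()):
        self.objects = dict(objects)
        self.bad = set(bad)

    def read(self, name):
        if name in self.bad:
            raise OSError(errno.EIO, "simulated unrecoverable read error")
        return self.objects[name]

    def write(self, name, data):
        self.objects[name] = data
        self.bad.discard(name)  # a rewrite "heals" the simulated sector
```

A store like DictStore with a configurable set of failing objects is also
roughly the kind of fault injection a system test for this would need, since
normal workloads never hit the error path.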