inconsistencies from read errors during scrub

Dan van der Ster <dan@xxxxxxxxxxxxxx> · Thu, 21 Apr 2016 13:23:34 +0200

Hi cephalopods,

In our couple of years of operating a large Ceph cluster, every single
inconsistency I can recall was caused by a failed read during
deep-scrub. In other words, deep scrub reads an object, the read fails
with dmesg reporting "Sense Key : Medium Error [current]", "Add.
Sense: Unrecovered read error", "blk_update_request: critical medium
error", but the ceph-osd keeps on running and serving up data.
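
A minimal sketch of how one might pick these messages out of kernel log
output. The patterns are taken only from the fragments quoted above; the
exact wording varies by kernel version and driver, and the function name
find_medium_errors is hypothetical.

```python
import re

# Patterns built from the dmesg fragments quoted in this message.
# Assumption: real kernel output may word or space these differently.
MEDIUM_ERROR_PATTERNS = [
    re.compile(r"Sense Key\s*:\s*Medium Error"),
    re.compile(r"Unrecovered read error"),
    re.compile(r"critical medium error"),
]


def find_medium_errors(lines):
    """Return the log lines that match any medium-error pattern."""
    return [
        line
        for line in lines
        if any(pattern.search(line) for pattern in MEDIUM_ERROR_PATTERNS)
    ]
```

Feeding it the lines of dmesg output (for example, read from a pipe or a
saved log file) gives the subset worth correlating with inconsistent PGs.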

Repairing the PG would be the wrong fix for these inconsistencies: in
every case a subsequent SMART long test has shown that the drive is
indeed failing.

Instead, the correct solution is to stop the OSD, let Ceph backfill,
then deep-scrub the affected PG.
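
The procedure above can be sketched as an ordered list of commands. This
only builds and prints the plan; it runs nothing. Assumptions not in the
text above: the OSD is managed by systemd (ceph-osd@N units), and the
OSD is marked out explicitly so backfill starts without waiting for the
monitors to mark it out. The function name and the example IDs are
hypothetical.

```python
import shlex


def replacement_plan(osd_id, pg_id):
    """Return the commands, in order, for stop / backfill / deep-scrub."""
    return [
        # Assumption: mark the OSD out so backfill starts right away.
        ["ceph", "osd", "out", str(osd_id)],
        # Assumption: systemd-managed OSD; other init systems differ.
        ["systemctl", "stop", f"ceph-osd@{osd_id}"],
        # Check cluster status; repeat until backfill has finished.
        ["ceph", "-s"],
        # Once the PG is active+clean again, deep-scrub it.
        ["ceph", "pg", "deep-scrub", str(pg_id)],
    ]


def format_plan(plan):
    """Render the plan as shell-quoted lines for review by an operator."""
    return "\n".join(shlex.join(cmd) for cmd in plan)
```

For example, print(format_plan(replacement_plan(12, "1.2f"))) lists the
four commands for a hypothetical osd.12 and PG 1.2f, to be reviewed and
run by hand.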

So I'm curious, why doesn't the OSD exit FAILED when a read fails
during deep scrub (or any time a read fails)? Failed writes certainly
cause the OSD to exit -- why not reads?

Best Regards,
Dan van der Ster
CERN IT
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com