On Thu, Apr 21, 2016 at 1:23 PM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Hi cephalopods,
>
> In our couple of years operating a large Ceph cluster, every single
> inconsistency I can recall was caused by a failed read during
> deep-scrub. In other words, deep-scrub reads an object, the read fails
> with dmesg reporting "Sense Key : Medium Error [current]", "Add.
> Sense: Unrecovered read error", and "blk_update_request: critical medium
> error", but the ceph-osd keeps on running and serving up data.

I forgot to mention that the OSD does notice the read error. In jewel it prints:

    <objectname>:head got -5 on read, read_error

So why no assert?

Cheers, Dan

> The incorrect solution to these inconsistencies would be to repair the
> PG -- in every case a subsequent SMART long test shows that the drive
> is indeed failing.
>
> Instead, the correct solution is to stop the OSD, let Ceph backfill,
> then deep-scrub the affected PG.
>
> So I'm curious: why doesn't the OSD exit FAILED when a read fails
> during deep-scrub (or any time a read fails)? Failed writes certainly
> cause the OSD to exit -- why not reads?
>
> Best Regards,
> Dan van der Ster
> CERN IT
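
For anyone following along: the "-5" in that jewel log line is -EIO, the kernel's
I/O error from the failed disk read surfacing through the ceph-osd read path. A
quick standard-library check (Python, nothing Ceph-specific assumed):

    import errno
    import os

    # The "-5" in the OSD log is -(errno.EIO): the medium error that
    # dmesg reports, propagated up to ceph-osd as a failed read.
    assert errno.EIO == 5
    print(errno.errorcode[errno.EIO])   # -> 'EIO'
    print(os.strerror(errno.EIO))       # -> 'Input/output error'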
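
And for reference, a sketch of the remediation Dan describes, as a small Python
wrapper around the stock ceph CLI. It assumes a systemd-managed host; osd_id and
pgid below are placeholders for the failing OSD and the affected PG:

    #!/usr/bin/env python3
    import subprocess

    osd_id = "12"     # placeholder: the OSD on the failing drive
    pgid = "2.1ab"    # placeholder: the PG that went inconsistent

    def run(*cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Mark the OSD out so Ceph backfills its PGs from healthy replicas.
    run("ceph", "osd", "out", osd_id)

    # 2. Stop the daemon rather than running 'ceph pg repair' -- the
    #    drive itself is failing (a SMART long test will confirm it).
    run("systemctl", "stop", "ceph-osd@{}".format(osd_id))

    # 3. After backfill completes (wait until 'ceph -s' is healthy again),
    #    deep-scrub the affected PG to verify the surviving copies.
    run("ceph", "pg", "deep-scrub", pgid)

The ordering is the point: take the bad OSD out first so recovery reads come
from healthy replicas, and only deep-scrub once backfill has finished.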