Hi all,
I have been receiving alerts for:
Possible data damage: 1 pg inconsistent
almost daily for a few weeks now. When I check:
rados list-inconsistent-obj $PG --format=json-pretty
I always see a read_error. When I run a deep scrub on the PG, I see:
head candidate had a read error
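
In case it helps anyone compare notes, here is a rough sketch of how the list-inconsistent-obj JSON could be tallied per OSD, to see whether the read errors keep landing on the same drives. It assumes the output has a top-level "inconsistents" list whose entries carry a "shards" list with "osd" and "errors" fields. The function name and the sample data are made up for illustration and are not real cluster output.

```python
import json
from collections import Counter


def read_errors_per_osd(report_json):
    """Count shards flagged with read_error, grouped by OSD id."""
    report = json.loads(report_json)
    counts = Counter()
    for obj in report.get("inconsistents", []):
        for shard in obj.get("shards", []):
            if "read_error" in shard.get("errors", []):
                counts[shard["osd"]] += 1
    return counts


# Made-up sample in the assumed layout (hypothetical OSD ids and object names).
SAMPLE = json.dumps({
    "epoch": 1,
    "inconsistents": [
        {
            "object": {"name": "obj-a"},
            "union_shard_errors": ["read_error"],
            "shards": [
                {"osd": 7, "errors": ["read_error"]},
                {"osd": 12, "errors": []},
                {"osd": 33, "errors": []},
            ],
        },
        {
            "object": {"name": "obj-b"},
            "union_shard_errors": ["read_error"],
            "shards": [
                {"osd": 7, "errors": ["read_error"]},
                {"osd": 12, "errors": []},
                {"osd": 40, "errors": ["read_error"]},
            ],
        },
    ],
})

if __name__ == "__main__":
    # List OSDs with the most read errors first.
    for osd, n in read_errors_per_osd(SAMPLE).most_common():
        print(f"osd.{osd}: {n} read_error shard(s)")
```

In practice the saved output of the rados command above would be passed in place of SAMPLE. An OSD that shows up repeatedly across PGs would be the first drive to look at.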
When I check dmesg on the OSD node, I see:
blk_update_request: critical medium error, dev sdX, sector 123
smartctl also shows a few uncorrected read errors.
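
To check whether the medium errors cluster on a handful of disks, the kernel messages could be counted per device along these lines. This is only a sketch: the regular expression is written against the message format quoted above, and the function name, device names, and sample log lines are invented for illustration.

```python
import re
from collections import Counter

# Matches the kernel message quoted above; captures device name and sector.
MEDIUM_ERROR = re.compile(r"critical medium error, dev ([^,\s]+), sector (\d+)")


def medium_errors_per_device(log_text):
    """Count 'critical medium error' lines per block device."""
    counts = Counter()
    for line in log_text.splitlines():
        match = MEDIUM_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines (hypothetical devices and sectors), not real dmesg output.
SAMPLE_LOG = """\
[100.000001] blk_update_request: critical medium error, dev sdb, sector 1000
[100.000002] sd 0:0:1:0: [sdb] tag#0 Sense Key : Medium Error [current]
[200.000001] blk_update_request: critical medium error, dev sdb, sector 2000
[300.000001] blk_update_request: critical medium error, dev sdf, sector 3000
"""

if __name__ == "__main__":
    # Devices with the most medium errors are listed first.
    for dev, n in medium_errors_per_device(SAMPLE_LOG).most_common():
        print(f"{dev}: {n} medium error(s)")
```

The per-device counts could then be compared against the smartctl output for the same drives.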
Info:
Ceph: ceph version 12.2.4-30.el7cp
OSD: Toshiba 1.8TB SAS 10K
120 OSDs total
Has anyone else seen these alerts almost daily? Could the errors be caused by deep scrubbing too aggressively?
I realize these errors point to potentially failing drives, but I can't replace a drive every day.
Thanks,
Frank
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com