Hi,

# ceph health detail
HEALTH_ERR 3 scrub errors; Possible data damage: 1 pg inconsistent
OSD_SCRUB_ERRORS 3 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
    pg 2.2bb is active+clean+inconsistent, acting [36,12,80]

# ceph pg repair 2.2bb
instructing pg 2.2bb on osd.36 to repair

But:

2019-03-07 13:23:38.636881 [ERR] Health check update: Possible data damage: 1 pg inconsistent, 1 pg repair (PG_DAMAGED)
2019-03-07 13:20:38.373431 [ERR] 2.2bb deep-scrub 3 errors
2019-03-07 13:20:38.373426 [ERR] 2.2bb deep-scrub 0 missing, 1 inconsistent objects
2019-03-07 13:20:43.486860 [ERR] Health check update: 3 scrub errors (OSD_SCRUB_ERRORS)
2019-03-07 13:19:17.741350 [ERR] deep-scrub 2.2bb 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.000000000001c299:4f986 : is an unexpected clone
2019-03-07 13:19:17.523042 [ERR] 2.2bb shard 36 soid 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.000000000001c299:4f986 : data_digest 0xffffffff != data_digest 0xfc6b9538 from shard 12, size 0 != size 4194304 from auth oi 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.000000000001c299:4f986(482757'14986708 client.112595650.0:344888465 dirty|omap_digest s 4194304 uv 14974021 od ffffffff alloc_hint [0 0 0]), size 0 != size 4194304 from shard 12
2019-03-07 13:19:17.523038 [ERR] 2.2bb shard 36 soid 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.000000000001c299:4f986 : candidate size 0 info size 4194304 mismatch
2019-03-07 13:16:48.542673 [ERR] 2.2bb repair 2 errors, 1 fixed
2019-03-07 13:16:48.542656 [ERR] 2.2bb repair 1 missing, 0 inconsistent objects
2019-03-07 13:16:53.774956 [ERR] Health check update: Possible data damage: 1 pg inconsistent (PG_DAMAGED)
2019-03-07 13:16:53.774916 [ERR] Health check update: 2 scrub errors (OSD_SCRUB_ERRORS)
2019-03-07 13:15:16.986872 [ERR] repair 2.2bb 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.000000000001c299:4f986 : is an unexpected clone
2019-03-07 13:15:16.986817 [ERR] 2.2bb shard 36 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.000000000001c299:4f986 : missing
2019-03-07 13:12:18.517442 [ERR] Health check update: Possible data damage: 1 pg inconsistent, 1 pg repair (PG_DAMAGED)

I also tried deep-scrub and scrub, with the same results.

I also set noscrub and nodeep-scrub, then kicked the currently active
scrubs one at a time using 'ceph osd down <id>'. After the last scrub
was kicked, the forced scrub ran immediately, followed by
'ceph pg repair'. No luck.

Finally, I tried the manual approach:

- stop osd.36
- flush-journal
- rm rbd\udata.dfd5e2235befd0.000000000001c299__4f986_CBDE52BB__2
- start osd.36
- ceph pg repair 2.2bb

Also no luck...

On osd.36, rbd\udata.dfd5e2235befd0.000000000001c299__4f986_CBDE52BB__2
is empty (size 0). On osd.80 it is 4.0M. osd.2 is bluestore (I can't
find the file there).

Ceph is 12.2.10, and I'm currently migrating all my OSDs to bluestore.

Is there anything else I can do?

# rados list-inconsistent-obj 2.2bb | jq
{
  "epoch": 484655,
  "inconsistents": [
    {
      "object": {
        "name": "rbd_data.dfd5e2235befd0.000000000001c299",
        "nspace": "",
        "locator": "",
        "snap": 326022,
        "version": 14974021
      },
      "errors": [
        "data_digest_mismatch",
        "size_mismatch"
      ],
      "union_shard_errors": [
        "size_mismatch_info",
        "obj_size_info_mismatch"
      ],
      "selected_object_info": {
        "oid": {
          "oid": "rbd_data.dfd5e2235befd0.000000000001c299",
          "key": "",
          "snapid": 326022,
          "hash": 3420345019,
          "max": 0,
          "pool": 2,
          "namespace": ""
        },
        "version": "482757'14986708",
        "prior_version": "482697'14980304",
        "last_reqid": "client.112595650.0:344888465",
        "user_version": 14974021,
        "size": 4194304,
        "mtime": "2019-03-02 22:30:23.812849",
        "local_mtime": "2019-03-02 22:30:23.813281",
        "lost": 0,
        "flags": [
          "dirty",
          "omap_digest"
        ],
        "legacy_snaps": [],
        "truncate_seq": 0,
        "truncate_size": 0,
        "data_digest": "0xffffffff",
        "omap_digest": "0xffffffff",
        "expected_object_size": 0,
        "expected_write_size": 0,
        "alloc_hint_flags": 0,
        "manifest": {
          "type": 0,
          "redirect_target": {
            "oid": "",
            "key": "",
            "snapid": 0,
            "hash": 0,
            "max": 0,
            "pool": -9223372036854776000,
            "namespace": ""
          }
        },
        "watchers": {}
      },
      "shards": [
        {
          "osd": 12,
          "primary": false,
          "errors": [],
          "size": 4194304,
          "omap_digest": "0xffffffff",
          "data_digest": "0xfc6b9538"
        },
        {
          "osd": 36,
          "primary": true,
          "errors": [
            "size_mismatch_info",
            "obj_size_info_mismatch"
          ],
          "size": 0,
          "omap_digest": "0xffffffff",
          "data_digest": "0xffffffff",
          "object_info": {
            "oid": {
              "oid": "rbd_data.dfd5e2235befd0.000000000001c299",
              "key": "",
              "snapid": 326022,
              "hash": 3420345019,
              "max": 0,
              "pool": 2,
              "namespace": ""
            },
            "version": "482757'14986708",
            "prior_version": "482697'14980304",
            "last_reqid": "client.112595650.0:344888465",
            "user_version": 14974021,
            "size": 4194304,
            "mtime": "2019-03-02 22:30:23.812849",
            "local_mtime": "2019-03-02 22:30:23.813281",
            "lost": 0,
            "flags": [
              "dirty",
              "omap_digest"
            ],
            "legacy_snaps": [],
            "truncate_seq": 0,
            "truncate_size": 0,
            "data_digest": "0xffffffff",
            "omap_digest": "0xffffffff",
            "expected_object_size": 0,
            "expected_write_size": 0,
            "alloc_hint_flags": 0,
            "manifest": {
              "type": 0,
              "redirect_target": {
                "oid": "",
                "key": "",
                "snapid": 0,
                "hash": 0,
                "max": 0,
                "pool": -9223372036854776000,
                "namespace": ""
              }
            },
            "watchers": {}
          }
        },
        {
          "osd": 80,
          "primary": false,
          "errors": [],
          "size": 4194304,
          "omap_digest": "0xffffffff",
          "data_digest": "0xfc6b9538"
        }
      ]
    }
  ]
}

--
Herbert