Hi,
you don't need to stop the OSDs; just query the inconsistent object.
Here's a recent example (from an older cluster, though):
---snip---
health: HEALTH_ERR
1 scrub errors
Possible data damage: 1 pg inconsistent
admin:~ # ceph health detail
HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent
OSD_SCRUB_ERRORS 1 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
pg 7.17a is active+clean+inconsistent, acting [15,2,58,33,28,69]
admin:~ # rados -p cephfs_data list-inconsistent-obj 7.17a | jq
[...]
"shards": [
{
"osd": 2,
"primary": false,
"errors": [],
"size": 2780496,
"omap_digest": "0xffffffff",
"data_digest": "0x11e1764c"
},
{
"osd": 15,
"primary": true,
"errors": [],
"size": 2780496,
"omap_digest": "0xffffffff",
"data_digest": "0x11e1764c"
},
{
"osd": 28,
"primary": false,
"errors": [],
"size": 2780496,
"omap_digest": "0xffffffff",
"data_digest": "0x11e1764c"
},
{
"osd": 33,
"primary": false,
"errors": [
"read_error"
],
"size": 2780496
},
{
"osd": 58,
"primary": false,
"errors": [],
"size": 2780496,
"omap_digest": "0xffffffff",
"data_digest": "0x11e1764c"
},
{
"osd": 69,
"primary": false,
"errors": [],
"size": 2780496,
"omap_digest": "0xffffffff",
"data_digest": "0x11e1764c"
---snip---
Five of the six shards had identical omap_digest and data_digest values,
and only osd.33 reported a read_error, so it was safe to run
'ceph pg repair 7.17a'.
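For reference, the same workflow condensed into a few commands (a minimal
sketch; the pool name cephfs_data and PG id 7.17a are taken from the example
above, and the jq filter is just one way to single out the shards that
report errors):

# 1. Find the inconsistent PG
ceph health detail

# 2. Inspect the object and show only the shards reporting errors
rados -p cephfs_data list-inconsistent-obj 7.17a | \
  jq '.inconsistents[].shards[] | select(.errors != [])'

# 3. If the remaining copies agree (matching data_digest/omap_digest),
#    let Ceph repair the PG from a healthy copy
ceph pg repair 7.17a

# 4. Optionally verify afterwards with a deep scrub
ceph pg deep-scrub 7.17a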
Regards,
Eugen
Quoting E Taka <0etaka0@xxxxxxxxx>:
(17.2.4, 3 replicated, Container install)
Hello,
since much of the information found on the Web or in books is outdated, I want
to ask which procedure is recommended for repairing a damaged PG with status
active+clean+inconsistent on Ceph Quincy.
IMHO, the best process for a pool with 3 replicas would be to check whether
two of the replicas are identical and replace the third, differing one.
If I understand it correctly, ceph-objectstore-tool could be used for this
approach, but unfortunately it is difficult even to start, especially in a
Docker environment. (The OSD has to be marked "down", and the Ubuntu package
ceph-osd, which includes ceph-objectstore-tool, starts server processes
that confuse the dockerized environment.)
Is “ceph pg repair” safe to use, and is there a risk in enabling
osd_scrub_auto_repair and osd_repair_during_recovery?
Thanks!
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx