Hi again,
After two weeks I've got another inconsistent PG in the same cluster. The OSDs are different from those of the first PG, and again the object cannot be fetched with "rados get":
# rados list-inconsistent-obj 26.821 --format=json-pretty
{
    "epoch": 178472,
    "inconsistents": [
        {
            "object": {
                "name": "default.122888368.52__shadow_.3ubGZwLcz0oQ55-LTb7PCOTwKkv-nQf_7",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 118920
            },
            "errors": [],
            "union_shard_errors": [
                "data_digest_mismatch_oi"
            ],
            "selected_object_info": "26:8411bae4:::default.122888368.52__shadow_.3ubGZwLcz0oQ55-LTb7PCOTwKkv-nQf_7:head(126495'118920 client.142609570.0:41412640 dirty|data_digest|omap_digest s 4194304 uv 118920 dd cd142aaa od ffffffff alloc_hint [0 0])",
            "shards": [
                {
                    "osd": 20,
                    "errors": [
                        "data_digest_mismatch_oi"
                    ],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x6b102e59"
                },
                {
                    "osd": 44,
                    "errors": [
                        "data_digest_mismatch_oi"
                    ],
                    "size": 4194304,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x6b102e59"
                }
            ]
        }
    ]
}
# rados -p .rgw.buckets get default.122888368.52__shadow_.3ubGZwLcz0oQ55-LTb7PCOTwKkv-nQf_7 test_2pg.file
error getting .rgw.buckets/default.122888368.52__shadow_.3ubGZwLcz0oQ55-LTb7PCOTwKkv-nQf_7: (5) Input/output error
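
To make the mismatch in the output above explicit: both replicas report the same data_digest (0x6b102e59), but it differs from the "dd" value stored in the object info (cd142aaa), so neither shard matches what the object info expects. Below is a minimal Python sketch that extracts this comparison from the JSON. The helper name digest_report and the SAMPLE dict are made up for illustration and are not part of Ceph; the sketch assumes the Jewel output format shown here, where "selected_object_info" is a plain string containing "dd <hex> od <hex>".

```python
import re

# Trimmed copy of the output above (Jewel format, where
# "selected_object_info" is a plain string, not a nested JSON object).
SAMPLE = {
    "epoch": 178472,
    "inconsistents": [
        {
            "object": {
                "name": "default.122888368.52__shadow_.3ubGZwLcz0oQ55-LTb7PCOTwKkv-nQf_7"
            },
            "selected_object_info": (
                "26:8411bae4:::default.122888368.52__shadow_"
                ".3ubGZwLcz0oQ55-LTb7PCOTwKkv-nQf_7:head(126495'118920 "
                "client.142609570.0:41412640 dirty|data_digest|omap_digest "
                "s 4194304 uv 118920 dd cd142aaa od ffffffff alloc_hint [0 0])"
            ),
            "shards": [
                {"osd": 20, "data_digest": "0x6b102e59"},
                {"osd": 44, "data_digest": "0x6b102e59"},
            ],
        }
    ],
}


def digest_report(report):
    """Return (name, osd, shard_digest, recorded_digest, matches) per shard.

    Hypothetical helper: compares each shard's data_digest with the "dd"
    value found in the selected_object_info string.
    """
    rows = []
    for item in report["inconsistents"]:
        name = item["object"]["name"]
        # Assumption: the object info text contains "dd <hex> od <hex>".
        match = re.search(r"\bdd ([0-9a-f]+) od ", item["selected_object_info"])
        recorded = int(match.group(1), 16) if match else None
        for shard in item["shards"]:
            on_disk = int(shard["data_digest"], 16)
            rows.append((name, shard["osd"], on_disk, recorded, on_disk == recorded))
    return rows


if __name__ == "__main__":
    for name, osd, on_disk, recorded, ok in digest_report(SAMPLE):
        recorded_text = "unknown" if recorded is None else f"{recorded:#010x}"
        status = "ok" if ok else "MISMATCH"
        print(f"osd.{osd}: shard {on_disk:#010x} vs object info {recorded_text} -> {status}")
```

The same function should accept the full output if it is loaded with json.load, since it only reads the fields shown in SAMPLE.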
I'm still struggling to work out how to fix it. Any ideas, guys?
Thank you
On Tue, Jul 24, 2018 at 10:27 AM, Arvydas Opulskis <zebediejus@xxxxxxxxx> wrote:
Hello, Cephers,

after trying different repair approaches I am out of ideas on how to repair an inconsistent PG. I hope someone's sharp eye will notice what I overlooked.

Some info about the cluster:
Centos 7.4
Jewel 10.2.10
Pool size 2 (yes, I know it's a very bad choice)
Pool with inconsistent PG: .rgw.buckets

After a routine deep-scrub I found PG 26.c3f in inconsistent status. While running the "ceph pg repair 26.c3f" command and monitoring the "ceph -w" log, I noticed these errors:

2018-07-24 08:28:06.517042 osd.36 [ERR] 26.c3f shard 30: soid 26:fc32a1f1:::default.142609570.87_20180206.093111%2frepositories%2fnuget-local%2fApplication%2fCompany.Application.Api%2fCompany.Application.Api.1.1.1.nupkg.artifactory-metadata%2fproperties.xml:head data_digest 0x540e4f8b != data_digest 0x49a34c1f from auth oi 26:e261561a:::default.168602061.10_team-xxx.xxx-jobs.H6.HADOOP.data-segmentation.application.131.xxx-jvm.cpu.load%2f2018-05-05T03%3a51%3a39+00%3a00.sha1:head(167828'216051 client.179334015.0:1847715760 dirty|data_digest|omap_digest s 40 uv 216051 dd 49a34c1f od ffffffff alloc_hint [0 0])
2018-07-24 08:28:06.517118 osd.36 [ERR] 26.c3f shard 36: soid 26:fc32a1f1:::default.142609570.87_20180206.093111%2frepositories%2fnuget-local%2fApplication%2fCompany.Application.Api%2fCompany.Application.Api.1.1.1.nupkg.artifactory-metadata%2fproperties.xml:head data_digest 0x540e4f8b != data_digest 0x49a34c1f from auth oi 26:e261561a:::default.168602061.10_team-xxx.xxx-jobs.H6.HADOOP.data-segmentation.application.131.xxx-jvm.cpu.load%2f2018-05-05T03%3a51%3a39+00%3a00.sha1:head(167828'216051 client.179334015.0:1847715760 dirty|data_digest|omap_digest s 40 uv 216051 dd 49a34c1f od ffffffff alloc_hint [0 0])
2018-07-24 08:28:06.517122 osd.36 [ERR] 26.c3f soid 26:fc32a1f1:::default.142609570.87_20180206.093111%2frepositories%2fnuget-local%2fApplication%2fCompany.Application.Api%2fCompany.Application.Api.1.1.1.nupkg.artifactory-metadata%2fproperties.xml:head: failed to pick suitable auth object

...and the same errors about another object in the same PG.

The repair failed, so I checked the inconsistencies with "rados list-inconsistent-obj 26.c3f --format=json-pretty":

{
    "epoch": 178403,
    "inconsistents": [
        {
            "object": {
                "name": "default.142609570.87_20180203.020047\/repositories\/docker-local\/yyy\/company.yyy.api.assets\/1.2.4\/sha256__ce41e5246ead8bddd2a2b5bbb863db250f328be9dc5c3041481d778a32f8130d",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 217749
            },
            "errors": [],
            "union_shard_errors": [
                "data_digest_mismatch_oi"
            ],
            "selected_object_info": "26:f4ce1748:::default.168602061.10_team-xxx.xxx-jobs.H6.HADOOP.data-segmentation.application.131.xxx-jvm.cpu.load%2f2018-05-08T03%3a45%3a15+00%3a00.sha1:head(167944'217749 client.177936559.0:1884719302 dirty|data_digest|omap_digest s 40 uv 217749 dd 422f251b od ffffffff alloc_hint [0 0])",
            "shards": [
                {
                    "osd": 30,
                    "errors": [
                        "data_digest_mismatch_oi"
                    ],
                    "size": 40,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x551c282f"
                },
                {
                    "osd": 36,
                    "errors": [
                        "data_digest_mismatch_oi"
                    ],
                    "size": 40,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x551c282f"
                }
            ]
        },
        {
            "object": {
                "name": "default.142609570.87_20180206.093111\/repositories\/nuget-local\/Application\/Company.Application.Api\/Company.Application.Api.1.1.1.nupkg.artifactory-metadata\/properties.xml",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 216051
            },
            "errors": [],
            "union_shard_errors": [
                "data_digest_mismatch_oi"
            ],
            "selected_object_info": "26:e261561a:::default.168602061.10_team-xxx.xxx-jobs.H6.HADOOP.data-segmentation.application.131.xxx-jvm.cpu.load%2f2018-05-05T03%3a51%3a39+00%3a00.sha1:head(167828'216051 client.179334015.0:1847715760 dirty|data_digest|omap_digest s 40 uv 216051 dd 49a34c1f od ffffffff alloc_hint [0 0])",
            "shards": [
                {
                    "osd": 30,
                    "errors": [
                        "data_digest_mismatch_oi"
                    ],
                    "size": 40,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x540e4f8b"
                },
                {
                    "osd": 36,
                    "errors": [
                        "data_digest_mismatch_oi"
                    ],
                    "size": 40,
                    "omap_digest": "0xffffffff",
                    "data_digest": "0x540e4f8b"
                }
            ]
        }
    ]
}

After some reading, I understood that I needed the rados get/put trick to solve this problem. I couldn't do a rados get, because I was getting a "no such file" error even though the objects were listed by the "rados ls" command, so I got them directly from the OSD. After putting them back into rados (the rados commands didn't return any errors) and running a deep-scrub on the same PG, the problem still existed. The only thing that changed: when I try to get the object via rados now, I get "(5) Input/output error".

I tried to force the object size to 40 (the real size of both objects) by adding the "-o 40" option to the "rados put" command, but with no luck.

Guys, maybe you have other ideas what to try? Why doesn't overwriting the object solve this problem?

Thanks a lot!
Arvydas
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com