A quick update on my issue. I noticed that while I was trying to move the problem object between OSDs, the file attributes were lost on one of the OSDs, which I guess is why the error messages showed the missing-attribute part.
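For reference, the attribute copy described below was roughly along these lines (a sketch only: the data paths are the defaults, osd.28 stands in for whichever replica still had the attributes, and each OSD has to be stopped while ceph-objectstore-tool is run against it):

  # on a stopped OSD that still has the attributes, export object_info ('_') and snapset
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-28 --pgid 18.2 \
      '.dir.default.80018061.2' get-attr _ > /tmp/attr_oi
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-28 --pgid 18.2 \
      '.dir.default.80018061.2' get-attr snapset > /tmp/attr_snapset

  # copy the two files to the host of the OSD that lost them, then write them back
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-21 --pgid 18.2 \
      '.dir.default.80018061.2' set-attr _ /tmp/attr_oi
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-21 --pgid 18.2 \
      '.dir.default.80018061.2' set-attr snapset /tmp/attr_snapset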
I then copied the attribute metadata back onto the problematic object and restarted the OSDs in question. Following a pg repair, I got a different error:
2018-06-19 13:51:05.846033 osd.21 osd.21 192.168.168.203:6828/24339 2 : cluster [ERR] 18.2 shard 21: soid 18:45f87722:::.dir.default.80018061.2:head omap_digest 0x25e8a1da != omap_digest 0x21c7f871 from auth oi 18:45f87722:::.dir.default.80018061.2:head(106137'603495 osd.21.0:41403910 dirty|omap|data_digest|omap_digest s 0 uv 603494 dd ffffffff od 21c7f871 alloc_hint [0 0 0])
2018-06-19 13:51:05.846042 osd.21 osd.21 192.168.168.203:6828/24339 3 : cluster [ERR] 18.2 shard 28: soid 18:45f87722:::.dir.default.80018061.2:head omap_digest 0x25e8a1da != omap_digest 0x21c7f871 from auth oi 18:45f87722:::.dir.default.80018061.2:head(106137'603495 osd.21.0:41403910 dirty|omap|data_digest|omap_digest s 0 uv 603494 dd ffffffff od 21c7f871 alloc_hint [0 0 0])
2018-06-19 13:51:05.846046 osd.21 osd.21 192.168.168.203:6828/24339 4 : cluster [ERR] 18.2 soid 18:45f87722:::.dir.default.80018061.2:head: failed to pick suitable auth object
2018-06-19 13:51:05.846118 osd.21 osd.21 192.168.168.203:6828/24339 5 : cluster [ERR] repair 18.2 18:45f87722:::.dir.default.80018061.2:head no '_' attr
2018-06-19 13:51:05.846129 osd.21 osd.21 192.168.168.203:6828/24339 6 : cluster [ERR] repair 18.2 18:45f87722:::.dir.default.80018061.2:head no 'snapset' attr
2018-06-19 13:51:09.810878 osd.21 osd.21 192.168.168.203:6828/24339 7 : cluster [ERR] 18.2 repair 4 errors, 0 fixed
It mentions that there is an incorrect omap_digest. How do I go about fixing this?
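In case it helps with the diagnosis, this is the sort of comparison I can run to see where the replicas diverge (a sketch only: the pg id and object name are taken from the log above, rados talks to the live cluster, and ceph-objectstore-tool again needs the OSD stopped):

  # show which shards and digests the scrub actually flagged
  rados list-inconsistent-obj 18.2 --format=json-pretty

  # dump the omap keys of the bucket index object from each replica and compare
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-21 --pgid 18.2 \
      '.dir.default.80018061.2' list-omap > /tmp/omap.osd21
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-28 --pgid 18.2 \
      '.dir.default.80018061.2' list-omap > /tmp/omap.osd28
  # (copy the dumps to one host if the OSDs live on different machines)
  diff /tmp/omap.osd21 /tmp/omap.osd28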
Cheers
From: "andrei" <andrei@xxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Tuesday, 19 June, 2018 11:16:22
Subject: fixing unrepairable inconsistent PG
Hello everyone,

I am having trouble repairing one inconsistent and stubborn PG. I get the following error in ceph.log:

2018-06-19 11:00:00.000225 mon.arh-ibstorage1-ib mon.0 192.168.168.201:6789/0 675 : cluster [ERR] overall HEALTH_ERR noout flag(s) set; 4 scrub errors; Possible data damage: 1 pg inconsistent; application not enabled on 4 pool(s)
2018-06-19 11:09:24.586392 mon.arh-ibstorage1-ib mon.0 192.168.168.201:6789/0 841 : cluster [ERR] Health check update: Possible data damage: 1 pg inconsistent, 1 pg repair (PG_DAMAGED)
2018-06-19 11:09:27.139504 osd.21 osd.21 192.168.168.203:6828/4003 2 : cluster [ERR] 18.2 soid 18:45f87722:::.dir.default.80018061.2:head: failed to pick suitable object info
2018-06-19 11:09:27.139545 osd.21 osd.21 192.168.168.203:6828/4003 3 : cluster [ERR] repair 18.2 18:45f87722:::.dir.default.80018061.2:head no '_' attr
2018-06-19 11:09:27.139550 osd.21 osd.21 192.168.168.203:6828/4003 4 : cluster [ERR] repair 18.2 18:45f87722:::.dir.default.80018061.2:head no 'snapset' attr
2018-06-19 11:09:35.484402 osd.21 osd.21 192.168.168.203:6828/4003 5 : cluster [ERR] 18.2 repair 4 errors, 0 fixed
2018-06-19 11:09:40.601657 mon.arh-ibstorage1-ib mon.0 192.168.168.201:6789/0 844 : cluster [ERR] Health check update: Possible data damage: 1 pg inconsistent (PG_DAMAGED)

I have tried to follow a few instructions on PG repair, including removal of the 'broken' object .dir.default.80018061.2 from the primary osd followed by a pg repair. After that didn't work, I did the same for the secondary osd. Still the same issue.

Looking at the actual object on the file system, the file size is 0 for both the primary and secondary copies. The md5sum is the same too. The broken PG belongs to the radosgw bucket index pool .rgw.buckets.index.

What else can I try to get the thing fixed?

Cheers
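For completeness, the removal attempts mentioned above were along these lines (a rough sketch with default data paths; ceph-objectstore-tool was run with the OSD in question stopped, first on the primary and then on the secondary):

  systemctl stop ceph-osd@21
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-21 --pgid 18.2 \
      '.dir.default.80018061.2' remove
  systemctl start ceph-osd@21
  ceph pg repair 18.2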
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com