This PG/object is still doing something rather odd. I attempted to repair the object, and the repair supposedly ran, but now I appear to have less visibility than before.

> $ ceph health detail
> HEALTH_ERR 3 pgs inconsistent; 4 scrub errors; mds0: Many clients (20) failing to respond to cache pressure; noout,sortbitwise,require_jewel_osds flag(s) set
> pg 10.2d8 is active+clean+inconsistent, acting [18,17,22]
> pg 10.7bd is active+clean+inconsistent, acting [8,23,17]
> pg 17.ec is active+clean+inconsistent, acting [23,2,21]
> 4 scrub errors
> noout,sortbitwise,require_jewel_osds flag(s) set

osd.23 is the OSD scheduled for replacement, and it generated another read error. However, 17.ec does not show up in the output of the rados list-inconsistent-pg command:

> $ rados list-inconsistent-pg objects
> ["10.2d8","10.7bd"]

And examining 10.2d8 as before, I'm presented with this:

> $ rados list-inconsistent-obj 10.2d8 --format=json-pretty
> {
>     "epoch": 21094,
>     "inconsistents": []
> }

Even though, in the logs, the deep-scrub and the repair both show that the object was not repaired:

> $ zgrep 10.2d8 ceph-*
> ceph-osd.18.log.2.gz:2017-03-06 15:10:08.729827 7fc8dfeb8700 0 log_channel(cluster) log [INF] : 10.2d8 repair starts
> ceph-osd.18.log.2.gz:2017-03-06 15:13:49.793839 7fc8dfeb8700 -1 log_channel(cluster) log [ERR] : 10.2d8 recorded data digest 0x7fa9879c != on disk 0xa6798e03 on {object.name}:head
> ceph-osd.18.log.2.gz:2017-03-06 15:13:49.793941 7fc8dfeb8700 -1 log_channel(cluster) log [ERR] : repair 10.2d8 {object.name}:head on disk size (15913) does not match object info size (10280) adjusted for ondisk to (10280)
> ceph-osd.18.log.2.gz:2017-03-06 15:46:13.286268 7fc8dfeb8700 -1 log_channel(cluster) log [ERR] : 10.2d8 repair 2 errors, 0 fixed
> ceph-osd.18.log.4.gz:2017-03-04 18:16:23.693057 7fc8dd6b3700 0 log_channel(cluster) log [INF] : 10.2d8 deep-scrub starts
> ceph-osd.18.log.4.gz:2017-03-04 18:19:25.471322 7fc8dfeb8700 -1 log_channel(cluster) log [ERR] : 10.2d8 recorded data digest 0x7fa9879c != on disk 0xa6798e03 on {object.name}:head
> ceph-osd.18.log.4.gz:2017-03-04 18:19:25.471403 7fc8dfeb8700 -1 log_channel(cluster) log [ERR] : deep-scrub 10.2d8 {object.name}:head on disk size (15913) does not match object info size (10280) adjusted for ondisk to (10280)
> ceph-osd.18.log.4.gz:2017-03-04 18:55:39.617841 7fc8dd6b3700 -1 log_channel(cluster) log [ERR] : 10.2d8 deep-scrub 2 errors

File size and md5sum still match across all three replicas:

> $ ls -la /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> -rw-r--r-- 1 ceph ceph 15913 Mar 2 17:24 /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> -rw-r--r-- 1 ceph ceph 15913 Mar 2 17:24 /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> -rw-r--r-- 1 ceph ceph 15913 Mar 2 17:24 /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> $ md5sum /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> 55a76349b758d68945e5028784c59f24 /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> 55a76349b758d68945e5028784c59f24 /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> 55a76349b758d68945e5028784c59f24 /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}

So is the object actually inconsistent? Is rados somehow behind on something, and not showing the third inconsistent PG?
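For what it's worth, my (possibly wrong) understanding is that list-inconsistent-pg/-obj only report what the most recent scrub recorded, and that list-inconsistent-pg is queried per pool, so 17.ec, being in a different pool than "objects", would never appear in that output anyway. The next thing I plan to try is kicking off fresh deep scrubs and then re-querying, roughly as below, with {pool-of-17.ec} standing in for whatever pool 17.ec actually belongs to:

> $ ceph pg deep-scrub 10.2d8
> $ ceph pg deep-scrub 17.ec
> # after the scrubs complete (watching ceph -w or the primary OSD logs):
> $ rados list-inconsistent-obj 10.2d8 --format=json-pretty
> $ rados list-inconsistent-pg {pool-of-17.ec}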
Appreciate any help.

Reed

> On Mar 2, 2017, at 9:21 AM, Reed Dier <reed.dier at focusvq.com> wrote:
>
> Over the weekend, two inconsistent PGs popped up in my cluster. This was after having scrubs disabled for close to 6 weeks during a very long rebalance after adding 33% more OSDs, an OSD failing, increasing PGs, etc.
>
> It appears we came out the other end with 2 inconsistent PGs, and I'm trying to resolve them without much luck so far.
> Ubuntu 16.04, Jewel 10.2.5, 3x replicated pool, for reference.
>
>> $ ceph health detail
>> HEALTH_ERR 2 pgs inconsistent; 3 scrub errors; noout,sortbitwise,require_jewel_osds flag(s) set
>> pg 10.7bd is active+clean+inconsistent, acting [8,23,17]
>> pg 10.2d8 is active+clean+inconsistent, acting [18,17,22]
>> 3 scrub errors
>
>> $ rados list-inconsistent-pg objects
>> ["10.2d8","10.7bd"]
>
> Pretty straightforward: 2 PGs with inconsistent copies. Let's dig deeper.
>
>> $ rados list-inconsistent-obj 10.2d8 --format=json-pretty
>> {
>>     "epoch": 21094,
>>     "inconsistents": [
>>         {
>>             "object": {
>>                 "name": "object.name",
>>                 "nspace": "namespace.name",
>>                 "locator": "",
>>                 "snap": "head"
>>             },
>>             "errors": [],
>>             "shards": [
>>                 {
>>                     "osd": 17,
>>                     "size": 15913,
>>                     "omap_digest": "0xffffffff",
>>                     "data_digest": "0xa6798e03",
>>                     "errors": []
>>                 },
>>                 {
>>                     "osd": 18,
>>                     "size": 15913,
>>                     "omap_digest": "0xffffffff",
>>                     "data_digest": "0xa6798e03",
>>                     "errors": []
>>                 },
>>                 {
>>                     "osd": 22,
>>                     "size": 15913,
>>                     "omap_digest": "0xffffffff",
>>                     "data_digest": "0xa6798e03",
>>                     "errors": [
>>                         "data_digest_mismatch_oi"
>>                     ]
>>                 }
>>             ]
>>         }
>>     ]
>> }
>
>> $ rados list-inconsistent-obj 10.7bd --format=json-pretty
>> {
>>     "epoch": 21070,
>>     "inconsistents": [
>>         {
>>             "object": {
>>                 "name": "object2.name",
>>                 "nspace": "namespace.name",
>>                 "locator": "",
>>                 "snap": "head"
>>             },
>>             "errors": [
>>                 "read_error"
>>             ],
>>             "shards": [
>>                 {
>>                     "osd": 8,
>>                     "size": 27691,
>>                     "omap_digest": "0xffffffff",
>>                     "data_digest": "0x9ce36903",
>>                     "errors": []
>>                 },
>>                 {
>>                     "osd": 17,
>>                     "size": 27691,
>>                     "omap_digest": "0xffffffff",
>>                     "data_digest": "0x9ce36903",
>>                     "errors": []
>>                 },
>>                 {
>>                     "osd": 23,
>>                     "size": 27691,
>>                     "errors": [
>>                         "read_error"
>>                     ]
>>                 }
>>             ]
>>         }
>>     ]
>> }
>
> So we have one PG (10.7bd) with a read error on osd.23, which is a known issue and that disk is scheduled for replacement.
> We also have a data digest mismatch on PG 10.2d8 on osd.22, which I have been attempting to repair with no real tangible results.
>
>> $ ceph pg repair 10.2d8
>> instructing pg 10.2d8 on osd.18 to repair
>
> I've run the ceph pg repair command multiple times, and each time it instructs osd.18 to repair the PG.
> Am I right to assume that osd.18 is the acting primary for the copies, and that it's being told to backfill the known-good copy of the PG over the agreed-upon wrong version on osd.22?
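> To double-check which OSD is primary (and therefore, as I understand it, which copy repair treats as authoritative; I may be wrong about that), the mapping can be pulled with something like:
>
>> $ ceph pg map 10.2d8
>> $ ceph pg 10.2d8 query | grep -A 4 '"acting"'
>
> The acting set shown in ceph health detail is [18,17,22], so osd.18 being the repair target at least lines up with it being listed first.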
>
>> $ zgrep 'ERR' /var/log/ceph/*
>> /var/log/ceph/ceph-osd.18.log.7.gz:2017-02-23 20:45:21.561164 7fc8dfeb8700 -1 log_channel(cluster) log [ERR] : 10.2d8 recorded data digest 0x7fa9879c != on disk 0xa6798e03 on 10:1b42251f:{object.name}:head
>> /var/log/ceph/ceph-osd.18.log.7.gz:2017-02-23 20:45:21.561225 7fc8dfeb8700 -1 log_channel(cluster) log [ERR] : deep-scrub 10.2d8 10:1b42251f:{object.name}:head on disk size (15913) does not match object info size (10280) adjusted for ondisk to (10280)
>> /var/log/ceph/ceph-osd.18.log.7.gz:2017-02-23 21:05:59.935815 7fc8dfeb8700 -1 log_channel(cluster) log [ERR] : 10.2d8 deep-scrub 2 errors
>
>> $ ceph pg 10.2d8 query
>> {
>>     "state": "active+clean+inconsistent",
>>     "snap_trimq": "[]",
>>     "epoch": 21746,
>>     "up": [
>>         18,
>>         17,
>>         22
>>     ],
>>     "acting": [
>>         18,
>>         17,
>>         22
>>     ],
>>     "actingbackfill": [
>>         "17",
>>         "18",
>>         "22"
>>     ],
>
> However, no recovery I/O ever occurs, and the PG never goes back to active+clean. I'm not seeing anything interesting in the logs of the OSDs or the mons.
>
> I've found a few articles and mailing list entries that mention downing the OSD, flushing the journal, moving the object off the disk, starting the OSD, and running the repair command again.
>
> However, after finding the object on disk and eyeballing the size and the md5sum, all three copies appear to be identical:
>
>> $ ls -la /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> -rw-r--r-- 1 ceph ceph 15913 Jan 27 02:31 /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> -rw-r--r-- 1 ceph ceph 15913 Jan 27 02:31 /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> -rw-r--r-- 1 ceph ceph 15913 Jan 27 02:31 /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>
>> $ md5sum /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> 55a76349b758d68945e5028784c59f24 /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> 55a76349b758d68945e5028784c59f24 /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> 55a76349b758d68945e5028784c59f24 /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>
> Should I schedule another scrub? Should I do the whole down-the-OSD, flush-journal, move-object song and dance?
>
> Hoping the user list can provide some insight into the proper steps to move forward with. And I'm assuming the other inconsistent PG (10.7bd) will fix itself once the failing OSD (osd.23) is replaced.
>
> Thanks,
>
> Reed
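P.S. For completeness, the "down the OSD, flush the journal, move the object, repair" procedure from the articles mentioned in the quoted mail is, as I understand it (untested here, and osd.22 is used only because that is the shard the earlier report flagged), roughly the following:

> $ ceph osd set noout                  # already set in my case
> $ systemctl stop ceph-osd@22
> $ ceph-osd -i 22 --flush-journal
> $ mv /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name} /root/   # any safe location off the OSD
> $ systemctl start ceph-osd@22
> $ ceph pg repair 10.2d8

Given that all three copies have the same size and md5 and only the recorded object-info digest disagrees, I'm not convinced moving one copy aside would actually change anything, which is part of why I'm asking before going down that road.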