Plus do this as well:

# rados list-inconsistent-obj ${PG ID}

(A rough, untested consolidation of Brad's xattr/ceph-dencoder steps, plus a quick chunk-count sanity check, is appended at the bottom of this mail.)

On Fri, Dec 23, 2016 at 7:08 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
> Could you also try this?
>
> $ attr -l ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
>
> Take note of any of ceph._, ceph._@1, ceph._@2, etc.
>
> For me on my test cluster it looks like this.
>
> $ attr -l dev/osd1/current/0.3_head/benchmark\\udata\\urskikr.localdomain\\u16952\\uobject99__head_2969453B__0
> Attribute "cephos.spill_out" has a 2 byte value for
> dev/osd1/current/0.3_head/benchmark\udata\urskikr.localdomain\u16952\uobject99__head_2969453B__0
> Attribute "ceph._" has a 250 byte value for
> dev/osd1/current/0.3_head/benchmark\udata\urskikr.localdomain\u16952\uobject99__head_2969453B__0
> Attribute "ceph.snapset" has a 31 byte value for
> dev/osd1/current/0.3_head/benchmark\udata\urskikr.localdomain\u16952\uobject99__head_2969453B__0
> Attribute "ceph._@1" has a 53 byte value for
> dev/osd1/current/0.3_head/benchmark\udata\urskikr.localdomain\u16952\uobject99__head_2969453B__0
> Attribute "selinux" has a 37 byte value for
> dev/osd1/current/0.3_head/benchmark\udata\urskikr.localdomain\u16952\uobject99__head_2969453B__0
>
> Then dump out ceph._ to a file and append all ceph._@X attributes like so.
>
> $ attr -q -g ceph._ dev/osd1/current/0.3_head/benchmark\\udata\\urskikr.localdomain\\u16952\\uobject99__head_2969453B__0 > /tmp/attr1
> $ attr -q -g ceph._@1 dev/osd1/current/0.3_head/benchmark\\udata\\urskikr.localdomain\\u16952\\uobject99__head_2969453B__0 >> /tmp/attr1
>
> Note the ">>" on the second command to append the output, not
> overwrite. Do this for each ceph._@X attribute.
>
> Then display the file as an object_info_t structure and check the size value.
>
> $ bin/ceph-dencoder type object_info_t import /tmp/attr1 decode dump_json
> {
>     "oid": {
>         "oid": "benchmark_data_rskikr.localdomain_16952_object99",
>         "key": "",
>         "snapid": -2,
>         "hash": 694764859,
>         "max": 0,
>         "pool": 0,
>         "namespace": ""
>     },
>     "version": "9'19",
>     "prior_version": "0'0",
>     "last_reqid": "client.4110.0:100",
>     "user_version": 19,
>     "size": 4194304,
>     "mtime": "2016-12-23 19:13:57.012681",
>     "local_mtime": "2016-12-23 19:13:57.032306",
>     "lost": 0,
>     "flags": 52,
>     "snaps": [],
>     "truncate_seq": 0,
>     "truncate_size": 0,
>     "data_digest": 2293522445,
>     "omap_digest": 4294967295,
>     "expected_object_size": 4194304,
>     "expected_write_size": 4194304,
>     "alloc_hint_flags": 53,
>     "watchers": {}
> }
>
> Depending on the output, one method for fixing this may be to use a
> binary editing technique such as laid out in
> https://www.spinics.net/lists/ceph-devel/msg16519.html to set the size
> value to zero. Your target value is 1c0000.
>
> $ printf '%x\n' 1835008
> 1c0000
>
> Make sure you check it is right before injecting it back in with "attr -s".
>
> What version is this? Did you look for a similar bug on the tracker?
>
> HTH.
>
> --
> Cheers,
> Brad
>
> On Fri, Dec 23, 2016 at 4:27 PM, Shinobu Kinjo <skinjo@xxxxxxxxxx> wrote:
>> Would you be able to execute ``ceph pg ${PG ID} query`` against that
>> particular PG?
>>
>> On Wed, Dec 21, 2016 at 11:44 PM, Andras Pataki
>> <apataki@xxxxxxxxxxxxxxxxxxxx> wrote:
>>> Yes, size = 3, and I have checked that all three replicas are the same
>>> zero-length object on the disk.  I think some metadata info is mismatching
>>> what the OSD log refers to as "object info size".  But I'm not sure what to
>>> do about it.  pg repair does not fix it.
>>> In fact, the file this object corresponds to in CephFS is shorter, so this
>>> chunk shouldn't even exist, I think (details are in the original email).
>>> Although I may be understanding the situation wrong ...
>>>
>>> Andras
>>>
>>>
>>> On 12/21/2016 07:17 AM, Mehmet wrote:
>>>
>>> Hi Andras,
>>>
>>> I am not the most experienced user, but I guess you could have a look at
>>> this object on each related OSD for the PG, compare them, and delete the
>>> differing object. I assume you have size = 3.
>>>
>>> Then run pg repair again.
>>>
>>> But be careful; IIRC the replica will be recovered from the primary PG.
>>>
>>> HTH
>>>
>>> On 20 December 2016 22:39:44 CET, Andras Pataki
>>> <apataki@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> Hi cephers,
>>>>
>>>> Any ideas on how to proceed on the inconsistencies below?  At the moment
>>>> our ceph setup has 5 of these - in all cases it seems like some zero-length
>>>> objects that match across the three replicas, but do not match the object
>>>> info size.  I tried running pg repair on one of them, but it didn't repair
>>>> the problem:
>>>>
>>>> 2016-12-20 16:24:40.870307 7f3e1a4b1700  0 log_channel(cluster) log [INF] : 6.92c repair starts
>>>> 2016-12-20 16:27:06.183186 7f3e1a4b1700 -1 log_channel(cluster) log [ERR] : repair 6.92c 6:34932257:::1000187bbb5.00000009:head on disk size (0) does not match object info size (3014656) adjusted for ondisk to (3014656)
>>>> 2016-12-20 16:27:35.885496 7f3e17cac700 -1 log_channel(cluster) log [ERR] : 6.92c repair 1 errors, 0 fixed
>>>>
>>>> Any help/hints would be appreciated.
>>>>
>>>> Thanks,
>>>>
>>>> Andras
>>>>
>>>>
>>>> On 12/15/2016 10:13 AM, Andras Pataki wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> Yesterday scrubbing turned up an inconsistency in one of our placement
>>>> groups.  We are running ceph 10.2.3, using CephFS and RBD for some VM
>>>> images.
>>>>
>>>> [root@hyperv017 ~]# ceph -s
>>>>     cluster d7b33135-0940-4e48-8aa6-1d2026597c2f
>>>>      health HEALTH_ERR
>>>>             1 pgs inconsistent
>>>>             1 scrub errors
>>>>             noout flag(s) set
>>>>      monmap e15: 3 mons at {hyperv029=10.4.36.179:6789/0,hyperv030=10.4.36.180:6789/0,hyperv031=10.4.36.181:6789/0}
>>>>             election epoch 27192, quorum 0,1,2 hyperv029,hyperv030,hyperv031
>>>>       fsmap e17181: 1/1/1 up {0=hyperv029=up:active}, 2 up:standby
>>>>      osdmap e342930: 385 osds: 385 up, 385 in
>>>>             flags noout
>>>>       pgmap v37580512: 34816 pgs, 5 pools, 673 TB data, 198 Mobjects
>>>>             1583 TB used, 840 TB / 2423 TB avail
>>>>                34809 active+clean
>>>>                    4 active+clean+scrubbing+deep
>>>>                    2 active+clean+scrubbing
>>>>                    1 active+clean+inconsistent
>>>>   client io 87543 kB/s rd, 671 MB/s wr, 23 op/s rd, 2846 op/s wr
>>>>
>>>> # ceph pg dump | grep inconsistent
>>>> 6.13f1  4692  0  0  0  0  16057314767  3087  3087  active+clean+inconsistent  2016-12-14 16:49:48.391572  342929'41011  342929:43966  [158,215,364]  158  [158,215,364]  158  342928'40540  2016-12-14 16:49:48.391511  342928'40540  2016-12-14 16:49:48.391511
>>>>
>>>> I tried a couple of other deep scrubs on pg 6.13f1 but got repeated
>>>> errors.  In the OSD logs:
>>>>
>>>> 2016-12-14 16:48:07.733291 7f3b56e3a700 -1 log_channel(cluster) log [ERR] : deep-scrub 6.13f1 6:8fc91b77:::1000187bb70.00000009:head on disk size (0) does not match object info size (1835008) adjusted for ondisk to (1835008)
>>>>
>>>> I looked at the objects on the 3 OSDs on their respective hosts and they
>>>> are the same, zero-length files:
>>>>
>>>> # cd ~ceph/osd/ceph-158/current/6.13f1_head
>>>> # find . -name *1000187bb70* -ls
>>>> 669738    0 -rw-r--r--   1 ceph     ceph    0 Dec 13 17:00 ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
>>>>
>>>> # cd ~ceph/osd/ceph-215/current/6.13f1_head
>>>> # find . -name *1000187bb70* -ls
>>>> 539815647    0 -rw-r--r--   1 ceph     ceph    0 Dec 13 17:00 ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
>>>>
>>>> # cd ~ceph/osd/ceph-364/current/6.13f1_head
>>>> # find . -name *1000187bb70* -ls
>>>> 1881432215    0 -rw-r--r--   1 ceph     ceph    0 Dec 13 17:00 ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
>>>>
>>>> At the time of the write, there wasn't anything unusual going on as far as
>>>> I can tell (no hardware/network issues, all processes were up, etc).
>>>>
>>>> This pool is a CephFS data pool, and the corresponding file (inode hex
>>>> 1000187bb70, decimal 1099537300336) looks like this:
>>>>
>>>> # ls -li chr4.tags.tsv
>>>> 1099537300336 -rw-r--r-- 1 xichen xichen 14469915 Dec 13 17:01 chr4.tags.tsv
>>>>
>>>> Reading the file is also ok (no errors, right number of bytes):
>>>> # cat chr4.tags.tsv > /dev/null
>>>> # wc chr4.tags.tsv
>>>>   592251  2961255 14469915 chr4.tags.tsv
>>>>
>>>> We are using the standard 4MB block size for CephFS, and if I interpret
>>>> this right, this is the 9th chunk, so there shouldn't be any data (or even
>>>> a 9th chunk), since the file is only 14MB.  Should I run pg repair on this?
>>>> Any ideas on how this could come about?  Any other recommendations?
>>>>
>>>> Thanks,
>>>>
>>>> Andras
>>>> apataki@xxxxxxxxxxx
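
As promised above, here is a rough, untested sketch that just strings Brad's steps together for one on-disk replica: it dumps ceph._ plus any ceph._@N continuation attributes in order and decodes the result as an object_info_t so you can eyeball the "size" field. The object path and output file are placeholders taken from this thread; on a source build the dencoder may be bin/ceph-dencoder as in Brad's example. Treat it as a starting point only.

#!/bin/bash
# Sketch only, untested.  Run on the OSD host from the PG's _head directory.
# OBJ is the on-disk object file from the scrub error (placeholder path).
OBJ="./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6"
OUT=/tmp/attr1

# ceph._ first, then ceph._@1, ceph._@2, ... appended in order
attr -q -g ceph._ "$OBJ" > "$OUT"
n=1
while attr -l "$OBJ" | grep -qF "\"ceph._@$n\""; do
    attr -q -g "ceph._@$n" "$OBJ" >> "$OUT"
    n=$((n + 1))
done

# decode and check the "size" field against what scrub reports
ceph-dencoder type object_info_t import "$OUT" decode dump_json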
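
And a quick back-of-the-envelope check of Andras's point that this object sits past the end of the file, using only the numbers from his mail (14469915-byte file, default 4 MiB CephFS object size; the .00000009 suffix is a zero-based object index):

#!/bin/bash
# numbers taken from the thread, nothing cluster-specific here
FILESIZE=14469915                  # bytes, from "ls -li chr4.tags.tsv"
OBJSIZE=$((4 * 1024 * 1024))       # 4194304, the default CephFS object size

LAST=$(( (FILESIZE - 1) / OBJSIZE ))    # zero-based index of the last object
printf 'last object index: %d (suffix %08x), %d objects in total\n' \
    "$LAST" "$LAST" $((LAST + 1))
# -> last object index: 3 (suffix 00000003), 4 objects in total

So only 1000187bb70.00000000 through .00000003 should be backing that inode, which fits Andras's suspicion that the .00000009 object shouldn't be there at all. Whether the right fix is Brad's binary edit of the object-info size or something else, I'll leave to people who know this code better.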