Sadly, I never discovered anything more.

It ended up clearing up on its own, which was disconcerting, but I resigned myself to not making things worse in an attempt to make them better.

I assume someone touched the file in CephFS, which triggered a metadata update, and everyone was able to reach consensus.

Wish I had more for you.

Reed

> On Jun 3, 2019, at 7:43 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> Hi Reed and Brad,
>
> Did you ever learn more about this problem?
> We currently have a few inconsistencies arriving with the same env
> (cephfs, v13.2.5) and symptoms.
>
> PG Repair doesn't fix the inconsistency, nor does Brad's omap
> workaround earlier in the thread.
> In our case, we can fix by cp'ing the file to a new inode, deleting
> the inconsistent file, then scrubbing the PG.
>
> -- Dan
>
>
> On Fri, May 3, 2019 at 3:18 PM Reed Dier <reed.dier@xxxxxxxxxxx> wrote:
>>
>> Just to follow up for the sake of the mailing list,
>>
>> I had not had a chance to attempt your steps yet, but things appear to have worked themselves out on their own.
>>
>> Both scrub errors cleared without intervention, and I'm not sure if it is the result of that object getting touched in CephFS that triggered the update of the size info, or if something else was able to clear it.
>>
>> Didn't see anything relating to the clearing in mon, mgr, or osd logs.
>>
>> So, not entirely sure what fixed it, but it resolved on its own.
>>
>> Thanks,
>>
>> Reed
>>
>> On Apr 30, 2019, at 8:01 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
>>
>> On Wed, May 1, 2019 at 10:54 AM Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
>>
>>
>> Which size is correct?
>>
>>
>> Sorry, accidental discharge =D
>>
>> If the object info size is *incorrect* try forcing a write to the OI
>> with something like the following.
>>
>> 1. rados -p [name_of_pool_17] setomapval 10008536718.00000000 temporary-key anything
>> 2. ceph pg deep-scrub 17.2b9
>> 3. Wait for the scrub to finish
>> 4.
rados -p [name_of_pool_17] rmomapkey 10008536718.00000000 temporary-key
>>
>> If the object info size is *correct* you could try just doing a rados
>> get followed by a rados put of the object to see if the size is
>> updated correctly.
>>
>> It's more likely the object info size is wrong IMHO.
>>
>>
>> On Tue, Apr 30, 2019 at 1:06 AM Reed Dier <reed.dier@xxxxxxxxxxx> wrote:
>>
>>
>> Hi list,
>>
>> Woke up this morning to two PG's reporting scrub errors, in a way that I haven't seen before.
>>
>> $ ceph versions
>> {
>>     "mon": {
>>         "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)": 3
>>     },
>>     "mgr": {
>>         "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)": 3
>>     },
>>     "osd": {
>>         "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 156
>>     },
>>     "mds": {
>>         "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)": 2
>>     },
>>     "overall": {
>>         "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 156,
>>         "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)": 8
>>     }
>> }
>>
>>
>> OSD_SCRUB_ERRORS 8 scrub errors
>> PG_DAMAGED Possible data damage: 2 pgs inconsistent
>>     pg 17.72 is active+clean+inconsistent, acting [3,7,153]
>>     pg 17.2b9 is active+clean+inconsistent, acting [19,7,16]
>>
>>
>> Here is what $rados list-inconsistent-obj 17.2b9 --format=json-pretty yields:
>>
>> {
>>     "epoch": 134582,
>>     "inconsistents": [
>>         {
>>             "object": {
>>                 "name": "10008536718.00000000",
>>                 "nspace": "",
>>                 "locator": "",
>>                 "snap": "head",
>>                 "version": 0
>>             },
>>             "errors": [],
>>             "union_shard_errors": [
>>                 "obj_size_info_mismatch"
>>             ],
>>             "shards": [
>>                 {
>>                     "osd": 7,
>>                     "primary": false,
>>                     "errors": [
>>                         "obj_size_info_mismatch"
>>                     ],
>>                     "size": 5883,
>>                     "object_info": {
>>                         "oid": {
>>                             "oid": "10008536718.00000000",
>>                             "key": "",
>>                             "snapid": -2,
>>                             "hash": 1752643257,
>>                             "max": 0,
>>                             "pool": 17,
>>                             "namespace": ""
>>                         },
>>                         "version": "134599'448331",
>>                         "prior_version": "134599'448330",
>>                         "last_reqid": "client.1580931080.0:671854",
>>                         "user_version": 448331,
>>                         "size": 3505,
>>                         "mtime": "2019-04-28 15:32:20.003519",
>>                         "local_mtime": "2019-04-28 15:32:25.991015",
>>                         "lost": 0,
>>                         "flags": [
>>                             "dirty",
>>                             "data_digest",
>>                             "omap_digest"
>>                         ],
>>                         "truncate_seq": 899,
>>                         "truncate_size": 0,
>>                         "data_digest": "0xf99a3bd3",
>>                         "omap_digest": "0xffffffff",
>>                         "expected_object_size": 0,
>>                         "expected_write_size": 0,
>>                         "alloc_hint_flags": 0,
>>                         "manifest": {
>>                             "type": 0
>>                         },
>>                         "watchers": {}
>>                     }
>>                 },
>>                 {
>>                     "osd": 16,
>>                     "primary": false,
>>                     "errors": [
>>                         "obj_size_info_mismatch"
>>                     ],
>>                     "size": 5883,
>>                     "object_info": {
>>                         "oid": {
>>                             "oid": "10008536718.00000000",
>>                             "key": "",
>>                             "snapid": -2,
>>                             "hash": 1752643257,
>>                             "max": 0,
>>                             "pool": 17,
>>                             "namespace": ""
>>                         },
>>                         "version": "134599'448331",
>>                         "prior_version": "134599'448330",
>>                         "last_reqid": "client.1580931080.0:671854",
>>                         "user_version": 448331,
>>                         "size": 3505,
>>                         "mtime": "2019-04-28 15:32:20.003519",
>>                         "local_mtime": "2019-04-28 15:32:25.991015",
>>                         "lost": 0,
>>                         "flags": [
>>                             "dirty",
>>                             "data_digest",
>>                             "omap_digest"
>>                         ],
>>                         "truncate_seq": 899,
>>                         "truncate_size": 0,
>>                         "data_digest": "0xf99a3bd3",
>>                         "omap_digest": "0xffffffff",
>>                         "expected_object_size": 0,
>>                         "expected_write_size": 0,
>>                         "alloc_hint_flags": 0,
>>                         "manifest": {
>>                             "type": 0
>>                         },
>>                         "watchers": {}
>>                     }
>>                 },
>>                 {
>>                     "osd": 19,
>>                     "primary": true,
>>                     "errors": [
>>                         "obj_size_info_mismatch"
>>                     ],
>>                     "size": 5883,
>>                     "object_info": {
>>                         "oid": {
>>                             "oid": "10008536718.00000000",
>>                             "key": "",
>>                             "snapid": -2,
>>                             "hash": 1752643257,
>>                             "max": 0,
>>                             "pool": 17,
>>                             "namespace": ""
>>                         },
>>                         "version": "134599'448331",
>>                         "prior_version": "134599'448330",
>>                         "last_reqid": "client.1580931080.0:671854",
>>                         "user_version": 448331,
>>                         "size": 3505,
>>                         "mtime": "2019-04-28 15:32:20.003519",
>>                         "local_mtime": "2019-04-28 15:32:25.991015",
>>                         "lost": 0,
>>                         "flags": [
>>                             "dirty",
>>                             "data_digest",
>>                             "omap_digest"
>>                         ],
>>                         "truncate_seq": 899,
>>                         "truncate_size": 0,
>>                         "data_digest": "0xf99a3bd3",
>>                         "omap_digest": "0xffffffff",
>>                         "expected_object_size": 0,
>>                         "expected_write_size": 0,
>>                         "alloc_hint_flags": 0,
>>                         "manifest": {
>>                             "type": 0
>>                         },
>>                         "watchers": {}
>>                     }
>>                 }
>>             ]
>>         }
>>     ]
>> }
>>
>>
>> To snip that down to the parts that appear to matter:
>>
>> "errors": [],
>> "union_shard_errors": [
>>     "obj_size_info_mismatch"
>> ],
>> "shards": [
>>     {
>>         "errors": [
>>             "obj_size_info_mismatch"
>>         ],
>>         "size": 5883,
>>         "object_info": {
>>             "size": 3505, }
>>
>>
>> It looks like the size info does in fact mismatch (5883 != 3505).
>>
>> So I attempted a deep-scrub again, and the issue persists across both PG's.
>>
>> 2019-04-29 09:08:27.729 7fe4f5bee700 0 log_channel(cluster) log [DBG] : 17.2b9 deep-scrub starts
>> 2019-04-29 09:22:53.363 7fe4f5bee700 -1 log_channel(cluster) log [ERR] : 17.2b9 shard 19 soid 17:9d6cee16:::10008536718.00000000:head : candidate size 5883 info size 3505 mismatch
>> 2019-04-29 09:22:53.363 7fe4f5bee700 -1 log_channel(cluster) log [ERR] : 17.2b9 shard 7 soid 17:9d6cee16:::10008536718.00000000:head : candidate size 5883 info size 3505 mismatch
>> 2019-04-29 09:22:53.363 7fe4f5bee700 -1 log_channel(cluster) log [ERR] : 17.2b9 shard 16 soid 17:9d6cee16:::10008536718.00000000:head : candidate size 5883 info size 3505 mismatch
>> 2019-04-29 09:22:53.363 7fe4f5bee700 -1 log_channel(cluster) log [ERR] : 17.2b9 soid 17:9d6cee16:::10008536718.00000000:head : failed to pick suitable object info
>> 2019-04-29 09:22:53.363 7fe4f5bee700 -1 log_channel(cluster) log [ERR] : deep-scrub 17.2b9 17:9d6cee16:::10008536718.00000000:head : on disk size (5883) does not match object info size (3505) adjusted for ondisk to (3505)
>> 2019-04-29 09:27:46.840 7fe4f5bee700 -1 log_channel(cluster) log [ERR] : 17.2b9 deep-scrub 4 errors
>>
>>
>> Pool 17 is a cephfs data pool, if that makes
any difference.
>> And the two MDS's listed in versions are active:standby, not active:active.
>>
>> My question is whether I should run `ceph pg repair <pgid>` to try to fix these objects, or take another approach, as the object size mismatch appears to persist across all 3 copies of the PG(s).
>> I know that ceph pg repair can be dangerous in certain circumstances, so I want to feel confident in the operation before undertaking the repair.
>>
>> I did look at all underlying disks for these PG's for issues or errors, and none bubbled to the top, so I don't believe it to be a hardware issue in this case.
>>
>> Appreciate any help.
>>
>> Thanks,
>>
>> Reed
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>> --
>> Cheers,
>> Brad
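
For anyone landing on this thread with the same symptom, here is a minimal sketch of pulling the size comparison out of the `rados list-inconsistent-obj` JSON rather than eyeballing it. It only reads the fields shown in the output quoted above (`inconsistents`, `shards`, `size`, `object_info.size`). The helper name `size_mismatches` and the trimmed `SAMPLE` are my own for illustration, not part of any Ceph tooling, and I am assuming other releases emit the same field layout as the 13.2.x output in this thread.

```python
import json

# Trimmed copy of the list-inconsistent-obj output quoted in this thread
# (only the fields the check below reads).
SAMPLE = """
{
  "epoch": 134582,
  "inconsistents": [
    {
      "object": {"name": "10008536718.00000000", "snap": "head"},
      "union_shard_errors": ["obj_size_info_mismatch"],
      "shards": [
        {"osd": 7,  "size": 5883, "object_info": {"size": 3505}},
        {"osd": 16, "size": 5883, "object_info": {"size": 3505}},
        {"osd": 19, "size": 5883, "object_info": {"size": 3505}}
      ]
    }
  ]
}
"""


def size_mismatches(report):
    """Hypothetical helper: list shards whose on-disk size differs from the object_info size."""
    rows = []
    for item in report.get("inconsistents", []):
        name = item["object"]["name"]
        for shard in item.get("shards", []):
            info = shard.get("object_info")
            # Skip shards where object_info is missing or is not a JSON object.
            if not isinstance(info, dict):
                continue
            disk_size = shard.get("size")
            info_size = info.get("size")
            if disk_size is not None and info_size is not None and disk_size != info_size:
                rows.append((name, shard["osd"], disk_size, info_size))
    return rows


if __name__ == "__main__":
    for name, osd, disk_size, info_size in size_mismatches(json.loads(SAMPLE)):
        print(f"{name} osd.{osd}: on-disk {disk_size}, object_info {info_size}")
```

To try it on a live cluster, save the output of `rados list-inconsistent-obj <pgid> --format=json` to a file and pass the parsed JSON to the function. It only reports the mismatch; it does not say which size is correct and it does not repair anything.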