Hi,

I would say yes, but it would be nice if other people can confirm it too. Also, can you create a test cluster and do the same tasks:
* create it with Octopus
* create a snapshot
* reduce the rank to 1
* upgrade to Pacific
and then try to fix the PG, assuming that you will have the same issues in your test cluster.

Cheers,
Ansgar

Am Do., 23. Juni 2022 um 22:12 Uhr schrieb Pascal Ehlert <pascal@xxxxxxxxxxxx>:
>
> Hi,
>
> I have now tried to "ceph osd pool rmsnap $POOL beforefixes" and it says the snapshot could not be found, although I definitely ran "ceph osd pool mksnap $POOL beforefixes" about three weeks ago.
> When running "rados list-inconsistent-obj $PG" on one of the affected PGs, all of the objects returned have "snap" set to 1:
>
> root@srv01:~# for i in $(rados list-inconsistent-pg $POOL | jq -er .[]); do rados list-inconsistent-obj $i | jq -er .inconsistents[].object; done
> [..]
> {
>   "name": "200020744f4.00000000",
>   "nspace": "",
>   "locator": "",
>   "snap": 1,
>   "version": 5704208
> }
> {
>   "name": "200021aeb16.00000000",
>   "nspace": "",
>   "locator": "",
>   "snap": 1,
>   "version": 6189078
> }
> [..]
>
> Running listsnaps on any of them then looks like this:
>
> root@srv01:~# rados listsnaps 200020744f4.00000000 -p $POOL
> 200020744f4.00000000:
> cloneid  snaps  size  overlap
> 1        1      0     []
> head     -      0
>
> Is it safe to assume that these objects belong to a somewhat broken snapshot and can be removed without causing further damage?
>
> Thanks,
>
> Pascal
>
> Ansgar Jazdzewski wrote on 23.06.22 20:36:
> > Hi,
> >
> > We could identify the RBD images that were affected and did an export before, but in the case of the CephFS metadata I have no plan that will work.
> >
> > Can you try to delete the snapshot?
> > Also, if the filesystem can be shut down, try to do a backup of the metadata pool.
> >
> > Hope you will have some luck, let me know if I can help,
> > Ansgar
> >
> > Pascal Ehlert <pascal@xxxxxxxxxxxx> schrieb am Do., 23. Juni 2022, 16:45:
>>
>> Hi Ansgar,
>>
>> Thank you very much for the response.
>> Running your first command to obtain inconsistent objects, I retrieve a total of 23114, only some of which are snaps.
>>
>> Your mentioning snapshots did remind me, however, that I created a snapshot on the Ceph metadata pool via "ceph osd pool mksnap $POOL" before I reduced the number of ranks.
>> Maybe that has caused the inconsistencies and would explain why the actual file system appears unaffected?
>>
>> Is there any way to validate that theory? I am a bit hesitant to just run "rmsnap". Could that cause inconsistent data to be written back to the actual objects?
>>
>> Best regards,
>>
>> Pascal
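
One way to sanity-check that theory without deleting anything is to compare the pool's known snapshots against the snap IDs that the scrub reports. A rough sketch, not taken from the thread itself; it assumes $POOL is set to the CephFS metadata pool and that jq is installed:

    # Pool-level snapshots (snap id, name, timestamp) currently known for the pool
    rados lssnap -p $POOL

    # Count the inconsistent objects that reference snap id 1
    for pg in $(rados list-inconsistent-pg $POOL | jq -er '.[]'); do
        rados list-inconsistent-obj $pg | jq -r '.inconsistents[].object | select(.snap == 1) | .name'
    done | sort -u | wc -l

If essentially all reported objects carry a snap ID belonging to the pool snapshot (or to one that no longer shows up in lssnap) rather than to head, that supports the idea that the damage is confined to snapshot clones and not to the live metadata.
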
>> Ansgar Jazdzewski wrote on 23.06.22 16:11:
>> > Hi Pascal,
>> >
>> > We just had a similar situation on our RBD and had found some bad data in RADOS. Here is how we did it:
>> >
>> > for i in $(rados list-inconsistent-pg $POOL | jq -er .[]); do rados list-inconsistent-obj $i | jq -er .inconsistents[].object.name | awk -F'.' '{print $2}'; done
>> >
>> > We then found inconsistent snaps on the object:
>> >
>> > rados list-inconsistent-snapset $PG --format=json-pretty | jq .inconsistents[].name
>> >
>> > List the data on the OSDs (ceph pg map $PG):
>> >
>> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSD}/ --op list ${OBJ} --pgid ${PG}
>> >
>> > and finally remove the object, like:
>> >
>> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/ --op list rbd_data.762a94d768c04d.000000000036b7ac --pgid 2.704
>> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/ '["2.704",{"oid":"rbd_data.801e1d1d9c719d.0000000000044943","key":"","snapid":125458,"hash":4136961796,"max":0,"pool":2,"namespace":"","max":0}]' remove
>> >
>> > We had to do it for all OSDs, one after the other; after this a 'pg repair' worked.
>> >
>> > I hope it will help,
>> > Ansgar
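
Spelled out per OSD, that sequence looks roughly like the sketch below. This is an illustration only, not something from the thread: it assumes systemd-managed OSDs (with cephadm the unit names differ), $OSD, $PG and $OBJ are placeholders, and ceph-objectstore-tool can only be run while the OSD it operates on is stopped.

    # keep the cluster from rebalancing while an OSD is down
    ceph osd set noout

    # stop one OSD from the PG's acting set and work on its store offline
    systemctl stop ceph-osd@${OSD}

    # confirm the object (including its snapid) exists in this OSD's copy of the PG
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSD}/ --op list ${OBJ} --pgid ${PG}

    # remove it, passing the exact JSON object spec printed by the list step
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSD}/ '<json-object-spec-from-list-output>' remove

    # bring the OSD back, clear the flag, repeat for the other OSDs, then repair
    systemctl start ceph-osd@${OSD}
    ceph osd unset noout
    ceph pg repair ${PG}

Taking a copy first (for example a "rados export" of the pool, or a "ceph-objectstore-tool --op export" of the PG while the OSD is stopped) would leave a way back if a removal turns out to be wrong.
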
>> > Am Do., 23. Juni 2022 um 15:02 Uhr schrieb Dan van der Ster <dvanders@xxxxxxxxx>:
>> >> Hi Pascal,
>> >>
>> >> It's not clear to me how the upgrade procedure you described would lead to inconsistent PGs.
>> >>
>> >> Even if you didn't record every step, do you have the ceph.log, the MDS logs, perhaps some OSD logs from this time?
>> >> And which versions did you upgrade from / to?
>> >>
>> >> Cheers, Dan
>> >>
>> >> On Wed, Jun 22, 2022 at 7:41 PM Pascal Ehlert <pascal@xxxxxxxxxxxx> wrote:
>> >>> Hi all,
>> >>>
>> >>> I am currently battling inconsistent PGs after a far-reaching mistake during the upgrade from Octopus to Pacific.
>> >>> While otherwise following the guide, I restarted the Ceph MDS daemons (and this started the Pacific daemons) without previously reducing the ranks to 1 (from 2).
>> >>>
>> >>> This resulted in daemons not coming up and reporting inconsistencies. After later reducing the ranks and bringing the MDS back up (I did not record every step, as this was an emergency situation), we started seeing health errors on every scrub.
>> >>>
>> >>> Now, after three weeks, while our CephFS is still working fine and we haven't noticed any data damage, we realized that every single PG of the CephFS metadata pool is affected.
>> >>> Below you can find some information on the actual status and a detailed inspection of one of the affected PGs. I am happy to provide any other information that could be useful, of course.
>> >>>
>> >>> A repair of the affected PGs does not resolve the issue.
>> >>> Does anyone else here have an idea what we could try, apart from copying all the data to a new CephFS pool?
>> >>>
>> >>> Thank you!
>> >>>
>> >>> Pascal
>> >>>
>> >>> root@srv02:~# ceph status
>> >>>   cluster:
>> >>>     id:     f0d6d4d0-8c17-471a-9f95-ebc80f1fee78
>> >>>     health: HEALTH_ERR
>> >>>             insufficient standby MDS daemons available
>> >>>             69262 scrub errors
>> >>>             Too many repaired reads on 2 OSDs
>> >>>             Possible data damage: 64 pgs inconsistent
>> >>>
>> >>>   services:
>> >>>     mon: 3 daemons, quorum srv02,srv03,srv01 (age 3w)
>> >>>     mgr: srv03(active, since 3w), standbys: srv01, srv02
>> >>>     mds: 2/2 daemons up, 1 hot standby
>> >>>     osd: 44 osds: 44 up (since 3w), 44 in (since 10M)
>> >>>
>> >>>   data:
>> >>>     volumes: 2/2 healthy
>> >>>     pools:   13 pools, 1217 pgs
>> >>>     objects: 75.72M objects, 26 TiB
>> >>>     usage:   80 TiB used, 42 TiB / 122 TiB avail
>> >>>     pgs:     1153 active+clean
>> >>>              55   active+clean+inconsistent
>> >>>              9    active+clean+inconsistent+failed_repair
>> >>>
>> >>>   io:
>> >>>     client: 2.0 MiB/s rd, 21 MiB/s wr, 240 op/s rd, 1.75k op/s wr
>> >>>
>> >>> {
>> >>>   "epoch": 4962617,
>> >>>   "inconsistents": [
>> >>>     {
>> >>>       "object": {
>> >>>         "name": "1000000cc8e.00000000",
>> >>>         "nspace": "",
>> >>>         "locator": "",
>> >>>         "snap": 1,
>> >>>         "version": 4253817
>> >>>       },
>> >>>       "errors": [],
>> >>>       "union_shard_errors": [
>> >>>         "omap_digest_mismatch_info"
>> >>>       ],
>> >>>       "selected_object_info": {
>> >>>         "oid": {
>> >>>           "oid": "1000000cc8e.00000000",
>> >>>           "key": "",
>> >>>           "snapid": 1,
>> >>>           "hash": 1369745244,
>> >>>           "max": 0,
>> >>>           "pool": 7,
>> >>>           "namespace": ""
>> >>>         },
>> >>>         "version": "4962847'6209730",
>> >>>         "prior_version": "3916665'4306116",
>> >>>         "last_reqid": "osd.27.0:757107407",
>> >>>         "user_version": 4253817,
>> >>>         "size": 0,
>> >>>         "mtime": "2022-02-26T12:56:55.612420+0100",
>> >>>         "local_mtime": "2022-02-26T12:56:55.614429+0100",
>> >>>         "lost": 0,
>> >>>         "flags": [
>> >>>           "dirty",
>> >>>           "omap",
>> >>>           "data_digest",
>> >>>           "omap_digest"
>> >>>         ],
>> >>>         "truncate_seq": 0,
>> >>>         "truncate_size": 0,
>> >>>         "data_digest": "0xffffffff",
>> >>>         "omap_digest": "0xe5211a9e",
>> >>>         "expected_object_size": 0,
>> >>>         "expected_write_size": 0,
>> >>>         "alloc_hint_flags": 0,
>> >>>         "manifest": {
>> >>>           "type": 0
>> >>>         },
>> >>>         "watchers": {}
>> >>>       },
>> >>>       "shards": [
>> >>>         {
>> >>>           "osd": 20,
>> >>>           "primary": false,
>> >>>           "errors": [
>> >>>             "omap_digest_mismatch_info"
>> >>>           ],
>> >>>           "size": 0,
>> >>>           "omap_digest": "0xffffffff",
>> >>>           "data_digest": "0xffffffff"
>> >>>         },
>> >>>         {
>> >>>           "osd": 27,
>> >>>           "primary": true,
>> >>>           "errors": [
>> >>>             "omap_digest_mismatch_info"
>> >>>           ],
>> >>>           "size": 0,
>> >>>           "omap_digest": "0xffffffff",
>> >>>           "data_digest": "0xffffffff"
>> >>>         },
>> >>>         {
>> >>>           "osd": 43,
>> >>>           "primary": false,
>> >>>           "errors": [
>> >>>             "omap_digest_mismatch_info"
>> >>>           ],
>> >>>           "size": 0,
>> >>>           "omap_digest": "0xffffffff",
>> >>>           "data_digest": "0xffffffff"
>> >>>         }
>> >>>       ]
>> >>>     },
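
Since every shard in the report above only shows omap_digest_mismatch_info, and always on a snap-1 clone, a quick way to confirm that no other error class is hiding in the remaining inconsistent PGs is a loop like the following. A rough sketch, not from the thread, assuming jq is installed and $POOL is set to the metadata pool:

    for pg in $(rados list-inconsistent-pg $POOL | jq -er '.[]'); do
        rados list-inconsistent-obj $pg | jq -r '.inconsistents[] | (.errors + .union_shard_errors)[]'
    done | sort | uniq -c

If the only line that comes back is omap_digest_mismatch_info, that would be consistent with the theory discussed earlier in the thread that the inconsistencies are limited to clones left behind by the pool snapshot rather than affecting the live metadata objects.
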