Re: Inconsistent PGs after upgrade to Pacific

Hi,

We could identify the RBD images that were affected and did an export
beforehand, but in the case of the CephFS metadata pool I have no plan
that will work.

Can you try to delete the snapshot?
Also, if the filesystem can be shut down, try to take a backup of the
metadata pool.
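
A rough sketch of both steps (the pool and snapshot names here are
placeholders, check 'ceph osd pool ls' for the real ones):

# with the filesystem down, serialize the metadata pool to a local file
rados -p cephfs_metadata export /root/cephfs_metadata.export

# remove the pool snapshot, only once you are sure it is safe to do so
ceph osd pool rmsnap cephfs_metadata my-snap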

Hope you will have some luck; let me know if I can help,
Ansgar

Pascal Ehlert <pascal@xxxxxxxxxxxx> wrote on Thu, 23 June 2022, 16:45:

> Hi Ansgar,
>
> Thank you very much for the response.
> Running your first command to obtain the inconsistent objects, I retrieve a
> total of 23114 objects, only some of which are snaps.
>
> Your mention of snapshots did remind me, however, that I created a
> snapshot on the CephFS metadata pool via "ceph osd pool mksnap $POOL
> $SNAP" before I reduced the number of ranks.
> Maybe that has caused the inconsistencies and would explain why the
> actual file system appears unaffected?
>
> Is there any way to validate that theory? I am a bit hesitant to just
> run "rmsnap". Could that cause inconsistent data to be written back to
> the actual objects?
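>
> I suppose I could at least list the pool-level snapshots and compare the
> snap IDs with those in the scrub reports (a sketch; the pool name is a
> placeholder):
>
> # list pool snapshots with their snap IDs
> rados -p cephfs_metadata lssnap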
>
>
> Best regards,
>
> Pascal
>
>
>
> Ansgar Jazdzewski wrote on 23.06.22 16:11:
> > Hi Pascal,
> >
> > We just had a similar situation on our RBD pool and found some bad data
> > in RADOS. Here is how we did it:
> >
> > for i in $(rados list-inconsistent-pg $POOL | jq -er '.[]'); do
> >   rados list-inconsistent-obj $i | jq -er '.inconsistents[].object.name' \
> >     | awk -F'.' '{print $2}'
> > done
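> >
> > (To map a printed block-name prefix back to an RBD image -- a sketch,
> > assuming the images live in $POOL and ${PREFIX} is the printed prefix:)
> >
> > for img in $(rbd ls $POOL); do
> >   rbd info $POOL/$img | grep -q "block_name_prefix: rbd_data.${PREFIX}" \
> >     && echo $img
> > done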
> >
> > We then found inconsistent snaps on the object:
> >
> > rados list-inconsistent-snapset $PG --format=json-pretty \
> >   | jq '.inconsistents[].name'
> >
> > List the data on the OSDs (use 'ceph pg map $PG' to find which OSDs hold
> > the PG; note that ceph-objectstore-tool must run on the OSD host while
> > the OSD is stopped):
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSD}/ --op
> > list ${OBJ} --pgid ${PG}
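> >
> > (A sketch of the full sequence on one OSD host; the systemd unit name is
> > an assumption and may differ in your deployment:)
> >
> > systemctl stop ceph-osd@${OSD}     # tool needs exclusive access
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSD}/ \
> >   --op list ${OBJ} --pgid ${PG}
> > systemctl start ceph-osd@${OSD}    # bring the OSD back afterwards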
> >
> > and finally remove the object, like:
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/ --op list \
> >   rbd_data.762a94d768c04d.000000000036b7ac --pgid 2.704
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/ \
> >   '["2.704",{"oid":"rbd_data.801e1d1d9c719d.0000000000044943","key":"","snapid":125458,"hash":4136961796,"max":0,"pool":2,"namespace":"","max":0}]' \
> >   remove
> >
> > We had to do it on every OSD, one after the other. After this, a
> > 'pg repair' worked.
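> >
> > (For reference, the repair step itself -- assuming $PG is the affected
> > placement group:)
> >
> > ceph pg repair $PG
> > ceph -w     # watch the cluster log for the repair result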
> >
> > I hope it will help,
> > Ansgar
> >
> > On Thu, 23 June 2022 at 15:02, Dan van der Ster
> > <dvanders@xxxxxxxxx> wrote:
> >> Hi Pascal,
> >>
> >> It's not clear to me how the upgrade procedure you described would
> >> lead to inconsistent PGs.
> >>
> >> Even if you didn't record every step, do you have the ceph.log, the
> >> mds logs, perhaps some osd logs from this time?
> >> And which versions did you upgrade from and to?
> >>
> >> Cheers, Dan
> >>
> >> On Wed, Jun 22, 2022 at 7:41 PM Pascal Ehlert <pascal@xxxxxxxxxxxx> wrote:
> >>> Hi all,
> >>>
> >>> I am currently battling inconsistent PGs after a far-reaching mistake
> >>> during the upgrade from Octopus to Pacific.
> >>> While otherwise following the guide, I restarted the Ceph MDS daemons
> >>> (which started the Pacific daemons) without first reducing the number
> >>> of ranks to 1 (from 2).
> >>>
> >>> This resulted in daemons not coming up and reporting inconsistencies.
> >>> After later reducing the ranks and bringing the MDS back up (I did not
> >>> record every step as this was an emergency situation), we started seeing
> >>> health errors on every scrub.
> >>>
> >>> Now after three weeks, while our CephFS is still working fine and we
> >>> haven't noticed any data damage, we realized that every single PG of the
> >>> cephfs metadata pool is affected.
> >>> Below you can find some information on the current status and a detailed
> >>> inspection of one of the affected PGs. I am happy to provide any other
> >>> information that could be useful, of course.
> >>>
> >>> A repair of the affected PGs does not resolve the issue.
> >>> Does anyone else here have an idea what we could try apart from copying
> >>> all the data to a new CephFS pool?
> >>>
> >>>
> >>>
> >>> Thank you!
> >>>
> >>> Pascal
> >>>
> >>>
> >>>
> >>>
> >>> root@srv02:~# ceph status
> >>>     cluster:
> >>>       id:     f0d6d4d0-8c17-471a-9f95-ebc80f1fee78
> >>>       health: HEALTH_ERR
> >>>               insufficient standby MDS daemons available
> >>>               69262 scrub errors
> >>>               Too many repaired reads on 2 OSDs
> >>>               Possible data damage: 64 pgs inconsistent
> >>>
> >>>     services:
> >>>       mon: 3 daemons, quorum srv02,srv03,srv01 (age 3w)
> >>>       mgr: srv03(active, since 3w), standbys: srv01, srv02
> >>>       mds: 2/2 daemons up, 1 hot standby
> >>>       osd: 44 osds: 44 up (since 3w), 44 in (since 10M)
> >>>
> >>>     data:
> >>>       volumes: 2/2 healthy
> >>>       pools:   13 pools, 1217 pgs
> >>>       objects: 75.72M objects, 26 TiB
> >>>       usage:   80 TiB used, 42 TiB / 122 TiB avail
> >>>       pgs:     1153 active+clean
> >>>                55   active+clean+inconsistent
> >>>                9    active+clean+inconsistent+failed_repair
> >>>
> >>>     io:
> >>>       client:   2.0 MiB/s rd, 21 MiB/s wr, 240 op/s rd, 1.75k op/s wr
> >>>
> >>>
> >>> {
> >>>     "epoch": 4962617,
> >>>     "inconsistents": [
> >>>       {
> >>>         "object": {
> >>>           "name": "1000000cc8e.00000000",
> >>>           "nspace": "",
> >>>           "locator": "",
> >>>           "snap": 1,
> >>>           "version": 4253817
> >>>         },
> >>>         "errors": [],
> >>>         "union_shard_errors": [
> >>>           "omap_digest_mismatch_info"
> >>>         ],
> >>>         "selected_object_info": {
> >>>           "oid": {
> >>>             "oid": "1000000cc8e.00000000",
> >>>             "key": "",
> >>>             "snapid": 1,
> >>>             "hash": 1369745244,
> >>>             "max": 0,
> >>>             "pool": 7,
> >>>             "namespace": ""
> >>>           },
> >>>           "version": "4962847'6209730",
> >>>           "prior_version": "3916665'4306116",
> >>>           "last_reqid": "osd.27.0:757107407",
> >>>           "user_version": 4253817,
> >>>           "size": 0,
> >>>           "mtime": "2022-02-26T12:56:55.612420+0100",
> >>>           "local_mtime": "2022-02-26T12:56:55.614429+0100",
> >>>           "lost": 0,
> >>>           "flags": [
> >>>             "dirty",
> >>>             "omap",
> >>>             "data_digest",
> >>>             "omap_digest"
> >>>           ],
> >>>           "truncate_seq": 0,
> >>>           "truncate_size": 0,
> >>>           "data_digest": "0xffffffff",
> >>>           "omap_digest": "0xe5211a9e",
> >>>           "expected_object_size": 0,
> >>>           "expected_write_size": 0,
> >>>           "alloc_hint_flags": 0,
> >>>           "manifest": {
> >>>             "type": 0
> >>>           },
> >>>           "watchers": {}
> >>>         },
> >>>         "shards": [
> >>>           {
> >>>             "osd": 20,
> >>>             "primary": false,
> >>>             "errors": [
> >>>               "omap_digest_mismatch_info"
> >>>             ],
> >>>             "size": 0,
> >>>             "omap_digest": "0xffffffff",
> >>>             "data_digest": "0xffffffff"
> >>>           },
> >>>           {
> >>>             "osd": 27,
> >>>             "primary": true,
> >>>             "errors": [
> >>>               "omap_digest_mismatch_info"
> >>>             ],
> >>>             "size": 0,
> >>>             "omap_digest": "0xffffffff",
> >>>             "data_digest": "0xffffffff"
> >>>           },
> >>>           {
> >>>             "osd": 43,
> >>>             "primary": false,
> >>>             "errors": [
> >>>               "omap_digest_mismatch_info"
> >>>             ],
> >>>             "size": 0,
> >>>             "omap_digest": "0xffffffff",
> >>>             "data_digest": "0xffffffff"
> >>>           }
> >>>         ]
> >>>       },
> >>>
> >>>
> >>>
> >>>
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


