Re: Scrub stuck and 'pg has invalid (post-split) stat'


 



Not really, unfortunately: the cache eviction fails for some rbd
objects that still hold a "lock". Right now we need to understand
why the eviction fails on these objects and find a solution to get
the cache eviction fully working. I will provide more information
later on.
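
For reference, here is roughly how we are inspecting the objects that
refuse to leave the cache tier (the "vms_cache" pool name comes from
this thread; the image/object names are placeholders, and "vms" as the
backing pool name is an assumption about our setup):

    # try to flush/evict everything that can be evicted from the cache pool
    rados -p vms_cache cache-flush-evict-all

    # for an object that stays behind, check for watchers: a running VM
    # typically keeps a watch on its rbd_header object, which can block eviction
    rados -p vms_cache listwatchers rbd_header.<image-id>

    # and check for advisory locks held on the object
    rados -p vms_cache lock list rbd_header.<image-id>

    # the same locks can also be listed (and removed) at the rbd level
    rbd lock list vms/<image-name>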

If you have any pointers, they would be greatly appreciated.
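
To Eugen's question below about dropping the cache tier entirely: once
the cache pool can actually be emptied (cache-mode is already proxy on
our side), the detach itself should just be the standard steps, roughly
("vms" as the backing pool name is again an assumption):

    # with the cache pool empty, remove the overlay and detach the tier
    ceph osd tier remove-overlay vms
    ceph osd tier remove vms vms_cache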

Cheers

On Wed, Feb 28, 2024 at 9:50 PM Eugen Block <eblock@xxxxxx> wrote:
>
> Hi,
>
> great that you found a solution. Maybe that also helps to get rid of
> the cache-tier entirely?
>
> Zitat von Cedric <yipikai7@xxxxxxxxx>:
>
> > Hello,
> >
> > Sorry for the late reply. Yes, we finally found a solution, which
> > was to split the cache_pool off onto dedicated OSDs. That cleared
> > the slow ops and allowed the cluster to serve clients again after
> > 5 days of lockdown. Fortunately the majority of the VMs resumed
> > fine, thanks to the virtio driver, which does not seem to have any
> > timeout.
> >
> > It seems that at least one of the main culprits was storing both
> > the cold and the hot data pools on the same OSDs (which in the end
> > makes total sense). Maybe some of the other actions we took also
> > had an effect; we are still trying to troubleshoot the root cause
> > of the slow ops. Oddly, this was the 5th cluster we upgraded and
> > they all have almost the same configuration, but this one handles
> > about 5x more workload.
> >
> > Hope this helps.
> >
> > Cédric
> >
> >> On 26 Feb 2024, at 10:57, Eugen Block <eblock@xxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> thanks for the context. Was there any progress over the weekend?
> >> The hanging commands seem to be MGR related, and there's only one
> >> in your cluster according to your output. Can you deploy a second
> >> one manually, then adopt it with cephadm? Can you add 'ceph
> >> versions' as well?
> >>
> >>
> >> Zitat von florian.leduc@xxxxxxxxxx:
> >>
> >>> Hi,
> >>> A bit of history might help to understand why we have the cache tier.
> >>>
> >>> We have been running OpenStack on top of Ceph for many years now
> >>> (we started with Mimic, then upgraded to Nautilus two years ago,
> >>> and today upgraded to Pacific). At the beginning of the setup we
> >>> had a mix of hdd+ssd devices in HCI mode for OpenStack Nova.
> >>> After the upgrade to Nautilus we did a hardware refresh with
> >>> brand new NVMe devices and transitioned from the mixed devices to
> >>> NVMe. But we were never able to evict all the data from the
> >>> vms_cache pools (even with aggressive eviction; the last resort
> >>> would have been to stop all the virtual instances, which was not
> >>> an option for our customers), so we decided to move on, set
> >>> cache-mode proxy, and serve data from NVMe only since then. It
> >>> has been like this for a year and a half.
> >>>
> >>> But today, after the upgrade, the situation is that we cannot
> >>> query any stats ("ceph pg x.x query" hangs), rados queries hang,
> >>> and scrubs hang even though all PGs are "active+clean". The
> >>> cluster reports no client activity, and no recovery or rebalance
> >>> activity either. Some other commands hang as well, e.g. "ceph
> >>> balancer status".
> >>>
> >>> --------------
> >>> bash-4.2$ ceph -s
> >>>  cluster:
> >>>    id:     <fsid>
> >>>    health: HEALTH_WARN
> >>>            mon is allowing insecure global_id reclaim
> >>>            noscrub,nodeep-scrub,nosnaptrim flag(s) set
> >>>            18432 slow ops, oldest one blocked for 7626 sec,
> >>> daemons
> >>> [osd.0,osd.1,osd.10,osd.11,osd.112,osd.113,osd.118,osd.119,osd.120,osd.122]... have slow
> >>> ops.
> >>>
> >>>  services:
> >>>    mon: 3 daemons, quorum mon1,mon2,mon3(age 36m)
> >>>    mgr: bm9612541(active, since 39m)
> >>>    osd: 72 osds: 72 up (since 97m), 72 in (since 9h)
> >>>         flags noscrub,nodeep-scrub,nosnaptrim
> >>>
> >>>  data:
> >>>    pools:   8 pools, 2409 pgs
> >>>    objects: 14.64M objects, 92 TiB
> >>>    usage:   276 TiB used, 143 TiB / 419 TiB avail
> >>>    pgs:     2409 active+clean
> >>
> >>
>
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



