Hi Richard,

See here for an example of what the OSD logs show in case of this
"PG overdose protection": https://tracker.ceph.com/issues/65749

Cheers, dan

--
Dan van der Ster
CTO
Clyso GmbH
p: +49 89 215252722 | a: Vancouver, Canada
w: https://clyso.com | e: dan.vanderster@xxxxxxxxx

On Wed, Jun 26, 2024 at 5:03 PM Richard Bade <hitrich@xxxxxxxxx> wrote:
>
> Hi Everyone,
> I had an issue last night while bringing back online some OSDs that I
> was rebuilding. When the OSDs were created and came online, 15 PGs got
> stuck in activating. The first OSD (osd.112) seemed to come online OK,
> but the second one (osd.113) triggered the issue. All the PGs stuck in
> activating included osd.112 in the PG map, and I resolved it by using
> pg-upmap-items to map each PG back from osd.112 to where it currently
> was, but it was painful having 10 minutes of stuck I/O on an RBD pool
> with VMs running.
>
> Some details about the cluster:
> Pacific 16.2.15, upgraded from Nautilus fairly recently and from
> Luminous back in the past. All OSDs were rebuilt on BlueStore under
> Nautilus, as were the mons.
> The disks in question are Intel DC P4510 8TB NVMe. I'm rebuilding them
> because I previously had 4x 2TB OSDs per disk and now want to
> consolidate down to one OSD per disk.
> There are around 300 OSDs in the pool, which has 16384 PGs, so the 2TB
> OSDs had 157 PGs each. However, that means the 8TB OSDs have 615 PGs
> each, and I'm wondering if that is the cause of the problem.
>
> There are no warnings about too many PGs per OSD in the logs or in
> ceph status. I have the default value of 250 for mon_max_pg_per_osd
> and the default value of 3.0 for osd_max_pg_per_osd_hard_ratio.
>
> My plan is to reduce the number of PGs in the pool, but I want to
> understand and prove what happened here.
> Is it likely I've hit PG overdose protection? If so, how would I tell,
> as I can't see anything in the cluster logs?
>
> Thanks,
> Rich
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
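
For reference, a minimal sketch of the arithmetic behind the question
above, assuming a single replicated pool of size 3 and PGs spread in
proportion to CRUSH weight (neither is confirmed in the thread); the
config option names and defaults are the ones Richard quotes, and the
per-OSD PG counts are the values he observed:

    # Back-of-the-envelope check of per-OSD PG counts vs. the limits.
    # Assumptions (not confirmed in the thread): replicated pool,
    # PGs distributed roughly in proportion to CRUSH weight, so an
    # 8TB OSD carries ~4x the PGs of a 2TB OSD.
    mon_max_pg_per_osd = 250                  # default quoted in the thread
    osd_max_pg_per_osd_hard_ratio = 3.0       # default quoted in the thread
    hard_limit = mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio  # 750.0

    pgs_per_2tb_osd = 157                     # observed value from the thread
    pgs_per_8tb_osd = 4 * pgs_per_2tb_osd     # ~628, close to the observed 615

    for label, pgs in (("2TB OSD", pgs_per_2tb_osd),
                       ("8TB OSD", pgs_per_8tb_osd)):
        status = "over" if pgs > hard_limit else "under"
        print(f"{label}: {pgs} PGs -> {status} the hard limit of "
              f"{hard_limit:.0f} (soft limit {mon_max_pg_per_osd})")

On these assumptions the steady-state 8TB count (~615) sits between the
soft limit (250) and the hard limit (750), which is why the question of
whether overdose protection actually kicked in comes down to what the
OSD logs show, as in the tracker ticket linked above.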