Re: Ceph PGs stuck inactive after rebuild node

"Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx> · Tue, 3 May 2022 02:26:22 +0000

Just curious, how many PGs per OSD are you using? 250?

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

On 2022. May 2., at 18:11, Eugen Block <eblock@xxxxxx> wrote:


Hi,

Do you have 2 or 4 OSDs per disk?

No, nothing like that; only one OSD per HDD (with DB on SSD).

Quoting "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:

Do you have 2 or 4 OSDs per disk?


On 2022. May 2., at 15:59, Eugen Block <eblock@xxxxxx> wrote:


Just to update this thread: apparently you were right. We did hit the
limit of mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio (250 * 3 =
750), as this entry in the logs shows:

2022-04-06 14:24:55.256 7f8bb5a0e700  1 osd.8 43377 maybe_wait_for_max_pg withhold creation of pg 75.56s16: 750 >= 750
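
As a rough illustration of the arithmetic above, the following Python
sketch computes the hard limit from the two options and pulls the
counters out of the log message. The function names and the regular
expression are illustrative assumptions, not part of Ceph; the pattern
was written only against the single message quoted here and may not
match other releases.

```python
import re


def hard_pg_limit(mon_max_pg_per_osd, hard_ratio):
    """Per-OSD hard limit as described in this thread: the product of the two options."""
    return mon_max_pg_per_osd * hard_ratio


# Hypothetical pattern, derived only from the one log message quoted in this thread.
WITHHOLD_RE = re.compile(
    r"osd\.(?P<osd>\d+)\s+\d+\s+maybe_wait_for_max_pg\s+"
    r"withhold creation of pg\s+(?P<pg>[^\s:]+):\s+"
    r"(?P<current>\d+)\s+>=\s+(?P<limit>\d+)"
)


def parse_withhold(line):
    """Return OSD id, PG id, current PG count and limit, or None if the line does not match."""
    m = WITHHOLD_RE.search(line)
    if m is None:
        return None
    return {
        "osd": int(m.group("osd")),
        "pg": m.group("pg"),
        "current": int(m.group("current")),
        "limit": int(m.group("limit")),
    }


LOG_LINE = (
    "2022-04-06 14:24:55.256 7f8bb5a0e700  1 osd.8 43377 "
    "maybe_wait_for_max_pg withhold creation of pg 75.56s16: 750 >= 750"
)

if __name__ == "__main__":
    print(hard_pg_limit(250, 3))
    print(parse_withhold(LOG_LINE))
```

The limit reported in the message (750) matches the product of the two
values mentioned above (250 and 3).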

This message first appeared for the last OSD still up on that host
after all the other OSDs had been purged, and then again for the first
OSD to come up after the rebuild. I'm currently experimenting with
osdmaptool; I have a feeling that this could also be an issue in newer
releases, but that is just speculation at the moment.
As a workaround we'll increase osd_max_pg_per_osd_hard_ratio to 5 and
see how the next attempt goes.

Thanks,
Eugen

Quoting Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx>:

On Wed, Apr 6, 2022 at 11:20 AM Eugen Block <eblock@xxxxxx> wrote:
I'm pretty sure that their cluster isn't anywhere near the limit for
mon_max_pg_per_osd. They currently have around 100 PGs per OSD and the
configs have not been touched; it's a pretty basic setup.

How is the host being "rebuilt"? Depending on the CRUSH rule, if the
host's OSDs are all marked destroyed and then re-created one at a time
with normal weight, CRUSH may decide to put a large number of PGs on
the first OSD that is created, and so on, until the rest of the host's
OSDs are available to take those PGs.
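
To make the effect concrete, here is a deliberately simplified Python
model. It assumes the host as a whole must hold a fixed number of PGs
(for example because the failure domain is the host) and that those
PGs are spread evenly over whichever of the host's OSDs are currently
up. Real CRUSH placement depends on weights and the rule, so the
numbers are only illustrative; the host size of 10 OSDs is an
assumption, not a figure from this thread.

```python
import math


def pgs_per_up_osd(host_pg_total, up_osds):
    """Even split of the host's PGs over its up OSDs, rounded up (simplifying assumption)."""
    if up_osds < 1:
        raise ValueError("need at least one up OSD")
    return math.ceil(host_pg_total / up_osds)


def rebuild_profile(osds_in_host, pgs_per_osd, limit):
    """For 1..N up OSDs, return (up OSDs, PGs wanted per OSD, PGs over the limit)."""
    host_total = osds_in_host * pgs_per_osd
    profile = []
    for up in range(1, osds_in_host + 1):
        wanted = pgs_per_up_osd(host_total, up)
        over_limit = max(0, wanted - limit)  # PGs that would be withheld in this model
        profile.append((up, wanted, over_limit))
    return profile


if __name__ == "__main__":
    # Assumed host: 10 OSDs at about 100 PGs each; limit 250 * 3 = 750.
    for up, wanted, over_limit in rebuild_profile(10, 100, 750):
        print(f"{up:2d} OSDs up: {wanted:4d} PGs wanted per OSD, {over_limit:3d} over the limit")
```

In this model the first re-created OSD is asked to take all 1000 PGs
of the host, which exceeds a limit of 750 even though the steady-state
load is only about 100 PGs per OSD. With a limit of 250 * 5 = 1250 the
same host would stay below the limit.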

Josh

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

________________________________
This message is confidential and is for the sole use of the intended
recipient(s). It may also be privileged or otherwise protected by
copyright or other legal rules. If you have received it by mistake
please let us know by reply email and delete it from your system. It
is prohibited to copy this message or disclose its content to
anyone. Any confidentiality or privilege is not waived or lost by
any mistaken delivery or unauthorized disclosure of the message. All
messages sent to and from Agoda may be monitored to ensure
compliance with company policies, to protect the company's interests
and to remove potential malware. Electronic messages may be
intercepted, amended, lost or deleted, or contain viruses.
