Re: How to force PG merging in one step?

Hi Frank,

Is this not checked per OSD? This would be really bad, because if it just uses the average (currently 143.3) this warning will never be triggered in critical situations.

I believe you're right; I can only remember warnings about the average PG count per OSD, not the count on individual OSDs. I'm not aware of this being worked on; maybe a tracker issue would be helpful here.

Regards,
Eugen

Zitat von Frank Schilder <frans@xxxxxx>:

Hi Eugen,

the PG merge finished and I still observe that no PG warning shows up. We have

mgr advanced mon_max_pg_per_osd 300

and I have an OSD with 306 PGs. Still, no warning:

# ceph health detail
HEALTH_OK

Is this not checked per OSD? This would be really bad, because if it just uses the average (currently 143.3) this warning will never be triggered in critical situations. Our average PG count is dominated by a huge HDD pool with ca. 120 PGs/OSD. We have a number of smaller SSD pools where we go closer to the limit. The critical pool has 24 OSDs, and I would have to create thousands of PGs on these OSDs before the average crosses the threshold. In other words, if it's the average PG count that is used, this warning is almost always too late, because it's the small pools where too many PGs end up on OSDs, but these don't influence the average much.
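For reference, the per-OSD counts can be pulled out of "ceph osd df" with something like the sketch below (it assumes the PGS column is the second-to-last field and that OSD rows are the only lines starting with a numeric id; the column layout may differ between releases):

# ceph osd df | awk '$1 ~ /^[0-9]+$/ {print $1, $(NF-1)}' | sort -k2 -nr | head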

I was always wondering how users ended up with more than 1000 PGs per OSD by accident during recovery. It now makes more sense. If there is no per-OSD warning, this can easily happen.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <frans@xxxxxx>
Sent: 12 October 2022 17:11:02
To: Eugen Block
Cc: ceph-users@xxxxxxx
Subject:  Re: How to force PG merging in one step?

Hi Eugen.

During recovery there's another factor involved
(osd_max_pg_per_osd_hard_ratio), the default is 3. I had to deal with
that a few months back when I got inactive PGs due to many chunks and
"only" a factor of 3. In that specific cluster I increased it to 5 and
didn't encounter inactive PGs anymore.

Yes, I looked at this as well and I remember cases where people got stuck with temporary PG numbers being too high. This is precisely why I wanted to see this warning. If it's off during recovery, the only way to notice that something is going wrong is when you hit the hard limit. But then it's too late.

I actually wanted to see this during recovery to have an early warning sign. I purposefully did not increase pg_num_max to 500 to make sure that the warning shows up. I personally consider it really bad behaviour if recovery/rebalancing disables this warning. Recovery is the operation where exceeding a PG limit without knowing will hurt most.
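For the record, my understanding is that the hard limit during recovery works out to mon_max_pg_per_osd multiplied by osd_max_pg_per_osd_hard_ratio, so with mon_max_pg_per_osd at 300 and the default ratio of 3 that would be around 300 * 3 = 900 PGs per OSD before PGs start going inactive. The values actually in effect can be checked with something like the following (we have mon_max_pg_per_osd set on the mgr, the section may differ on other clusters):

# ceph config get mgr mon_max_pg_per_osd
# ceph config get osd osd_max_pg_per_osd_hard_ratio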

Thanks for the heads up. Probably need to watch my * a bit more with certain things.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


