Re: Identify laggy PGs

> 
> I always thought that too many PGs have an impact on disk IO. I guess this is wrong?

Mostly when they’re spinners.  Especially back in the Filestore days with a colocated journal.  Don’t get me started on that.   

Too many PGs can exhaust RAM if you're tight on memory - or if you're still using Filestore.

For a SATA SSD I'd set pg_num so that you average 200-300 PGs per drive.  Your size mix complicates things, though, because the larger OSDs will get many more PGs than the smaller ones.  Be sure to set mon_max_pg_per_osd to something like 1000.
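Something along these lines if you decide to bump the count - the pool name and target count are placeholders, and since Nautilus pgp_num follows pg_num automatically:

  ceph osd pool set <pool> pg_num 2048
  ceph config set global mon_max_pg_per_osd 1000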

You might experiment with primary affinity, so that the smaller OSDs are more likely to be primaries and thus take more of the read load.  I've seen a first-order approximation here increase read throughput by 20%.
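A first cut might look like this - the OSD IDs and weights are made up; affinity ranges from 0 to 1, and lower values make an OSD less likely to be chosen as primary:

  ceph osd primary-affinity osd.12 1.0    # small, fast OSD
  ceph osd primary-affinity osd.47 0.25   # large OSD: deprioritize as primary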



> So I could double the PGs in the pool and see if things become better.
> 
> And yes, removing that single OSD from the cluster stopped the flapping of "monitor marked osd.N down".
> 
>> Am 15.08.2024 um 10:14 schrieb Frank Schilder <frans@xxxxxx>:
>> 
>> The current Ceph recommendation is to use between 100 and 200 PGs/OSD. Therefore, a large PG is one that holds more data than 0.5-1% of the disk capacity; if you have such PGs, you should split PGs for the relevant pool.
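>> For example, on a 10 TB OSD that threshold works out to roughly 50-100 GB per PG.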
>> 
>> A huge PG is a PG for which deep-scrub takes much longer than 20 min on HDD and 4-5 min on SSD.
>> 
>> Average deep-scrub times are actually a very good way of judging whether PGs are too large: they roughly correlate with the time it takes to copy a PG.
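>> A rough way to eyeball those times (a sketch, untested; it assumes scrub events land in the cluster log, which they do with default settings):
>> 
>>   grep deep-scrub /var/log/ceph/ceph.log
>> 
>> Pair each PG's "deep-scrub starts" line with its "deep-scrub ok" line to get a duration.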
>> 
>> On SSDs we aim for 200+ PGs/OSD and on HDDs for 150 PGs/OSD. For very large HDDs (>=16 TB) we consider raising this to 300 PGs/OSD due to excessively long deep-scrub times per PG.
>> 
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>> 
>> ________________________________________
>> From: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
>> Sent: Wednesday, August 14, 2024 12:00 PM
>> To: Eugen Block; ceph-users@xxxxxxx
>> Subject:  Re: Identify laggy PGs
>> 
>> Just out of curiosity, I checked my PG size, which is around 150 GB. When are we talking about big PGs?
>> ________________________________
>> From: Eugen Block <eblock@xxxxxx>
>> Sent: Wednesday, August 14, 2024 2:23 PM
>> To: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
>> Subject:  Re: Identify laggy PGs
>> 
>> 
>> Hi,
>> 
>> How big are those PGs? If they're huge and are deep-scrubbed, for
>> example, that can cause significant delays. I usually look at 'ceph pg
>> ls-by-pool {pool}' and the "BYTES" column.
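>> For example, to print the ten biggest PGs in a pool (a sketch; it assumes
>> a release whose JSON output carries a pg_stats array):
>> 
>>   ceph pg ls-by-pool {pool} -f json \
>>     | jq -r '.pg_stats[] | [.pgid, .stat_sum.num_bytes] | @tsv' \
>>     | sort -nk2 | tail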
>> 
>> Quoting Boris <bb@xxxxxxxxx>:
>> 
>>> Hi,
>>> 
>>> currently we encounter laggy PGs and I would like to find out what is
>>> causing it.
>>> I suspect it might be one or more failing OSDs. We had flapping OSDs and I
>>> synced one out, which helped with the flapping, but it doesn't help with
>>> the laggy ones.
>>> 
>>> Is there any tooling to identify or measure per-PG performance and map that to OSDs?
>>> 
>>> 
>>> --
>>> The "UTF-8 problems" self-help group will meet, as an exception, in the
>>> big hall this time.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



