Re: PG Balancer Upmap mode not working

Hi Anthony!

Mon, 9 Dec 2019 17:11:12 -0800
Anthony D'Atri <aad@xxxxxxxxxxxxxx> ==> ceph-users <ceph-users@xxxxxxxxxxxxxx> :
> > How is that possible? I don't know how much more proof I need to present that there's a bug.
> 
> FWIW, your pastes are hard to read with all the ? in them.  Pasting non-7-bit-ASCII?

I don't see many "?" characters in his posts. Maybe a display issue on your end?

> > |I increased PGs and see no difference.  
> 
> From what pgp_num to what new value?  Numbers that are not a power of 2 can contribute to the sort of problem you describe.  Do you have host CRUSH fault domain?
> 

Does the fault domain play a role in this situation? I can't see why it would; it would only matter if the OSDs weren't evenly distributed across the hosts.
Philippe, can you post your 'ceph osd tree'?
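On Anthony's power-of-two point: a quick way to sanity-check the values you've set, since Ceph can only split PGs evenly when pg_num/pgp_num is a power of two. The sample values here are made up for illustration:

```python
# Non-power-of-two pg_num values lead to unevenly sized PGs, which the
# balancer can't fully compensate for, so it's worth checking.
def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0

for pgp_num in (512, 1000, 1024):  # hypothetical values, not from a real cluster
    print(pgp_num, is_power_of_two(pgp_num))
```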

> > Raising PGs to 100 is an old statement anyway, anything 60+ should be fine.   
> 
> Fine in what regard?  To be sure, Wido’s advice means a *ratio* of at least 100.  ratio = (pgp_num * replication) / #osds
> 
> The target used to be 200, a commit around 12.2.1 retconned that to 100.  Best I can tell the rationale is memory usage at the expense of performance.
> 
> Is your original excerpt complete? I.e., do you only have 24 OSDs?  Across how many nodes?
> 
> The old guidance for tiny clusters:
> 
> • Less than 5 OSDs set pg_num to 128
> 
> • Between 5 and 10 OSDs set pg_num to 512
> 
> • Between 10 and 50 OSDs set pg_num to 1024

This is what I thought too. But in this post
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/TR6CJQKSMOHNGOMQO4JBDMGEL2RMWE36/
[Why are the ceph.io and ceph.com mailing lists not merged? It makes finding links to messages hard.]
Konstantin suggested reducing to pg_num=512. The cluster had 35 OSDs.
It is still merging the PGs very slowly.

In the meantime I have added 5 more OSDs and am thinking about raising pg_num back to 1024.
I wonder how fewer PGs can balance better than 512.
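For reference, Anthony's ratio formula applied to my cluster's numbers (40 OSDs after the expansion; replication factor of 3 is an assumption, substitute your pool's actual size):

```python
# PGs per OSD, per the formula quoted above: (pgp_num * replication) / #osds
def pg_ratio(pgp_num: int, replication: int, num_osds: int) -> float:
    return pgp_num * replication / num_osds

print(pg_ratio(512, 3, 40))   # 38.4 -- below the "60+ should be fine" mark
print(pg_ratio(1024, 3, 40))  # 76.8 -- comfortably above it
```

Which is why 1024 looks more reasonable to me than 512 for this cluster.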

I'm in a similar situation to Philippe with my cluster.
ceph osd df class hdd:
[…]
MIN/MAX VAR: 0.73/1.21  STDDEV: 6.27
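For anyone reading along, this is how I understand the VAR and STDDEV figures in 'ceph osd df' are derived, sketched with made-up per-OSD utilization percentages (not my real numbers):

```python
# VAR is each OSD's utilization relative to the mean; STDDEV is the
# standard deviation of utilization. Utilization values are hypothetical.
from statistics import mean, pstdev

utilization = [45.0, 52.0, 60.0, 66.0, 73.0]  # pretend %USE per OSD
avg = mean(utilization)
print("MIN/MAX VAR: %.2f/%.2f" % (min(utilization) / avg, max(utilization) / avg))
print("STDDEV: %.2f" % pstdev(utilization))
```

A well-balanced cluster should show VAR values close to 1.00 on every OSD; mine clearly doesn't.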

Attached is a picture of the dashboard with the tiny bars of the data distribution. The nearly empty OSDs are SSDs used for a separate pool.
I think there might be a bug in the balancing algorithm.

Thanks,
Lars

Attachment: ceph odd balancing of data.jpg
Description: JPEG image

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
