Sorry for replying only now; I did not get to try it earlier…
On Thu, 19 Sep 2013 08:43:11 -0500, Mark Nelson wrote:
> On 09/19/2013 08:36 AM, Niklas Goerke wrote:
>> […]
>> My Setup:
>> * Two hosts with 45 disks each --> 90 OSDs
>> * Only one newly created pool with 4500 PGs and a replica size of 2
>>   --> should be about 100 PGs per OSD
>> What I found was that one OSD had only 72 PGs, while another had 123
>> PGs [1]. That means that - if I did the math correctly - I can only
>> fill the cluster to about 81%, because that's when the first OSD is
>> completely full [2].
>
> Does distribution improve if you make a pool with significantly more
> PGs?
Yes it does. I tried 45000 PGs and got a range from a minimum of 922 to a
maximum of 1066 PGs per OSD (the average is 1000). This is better - I can
now fill my cluster up to 93.8% (theoretically) - but I still don't see why
I should have to limit myself to that. Also, 1000 PGs is way too many for
one OSD (I think about 100 is the suggested value). What should I do about
this?
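For anyone checking my numbers, this is the back-of-the-envelope calculation
I mean - a quick sketch (the helper name is just for illustration), assuming
all PGs hold roughly the same amount of data:

    # The cluster is effectively full once the fullest OSD is full, so the
    # usable fraction is the average PGs per OSD divided by the maximum
    # number of PGs on any single OSD.
    def usable_fraction(avg_pgs_per_osd, max_pgs_per_osd):
        return float(avg_pgs_per_osd) / max_pgs_per_osd

    # 1 pool, 4500 PGs, replica size 2, 90 OSDs -> 4500 * 2 / 90 = 100 PGs/OSD average
    print(usable_fraction(100, 123))    # ~0.81  -> the ~81% above

    # 1 pool, 45000 PGs -> 1000 PGs/OSD average
    print(usable_fraction(1000, 1066))  # ~0.938 -> the ~93.8% above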
>> I did some experimenting and found that if I add another pool with 4500
>> PGs, each OSD ends up with exactly double the number of PGs it had with
>> one pool, so this is not an accident (I tried it multiple times). On
>> another test cluster with 4 hosts and 15 disks each, the distribution
>> was similarly poor.
>
> This is a bug that causes each pool to more or less be distributed
> the same way on the same hosts. We have a fix, but it impacts
> backwards compatibility so it's off by default. If you set:
>
>     osd pool default flag hashpspool = true
>
> Theoretically that will cause different pools to be distributed more
> randomly.
I did not try this, because in my production scenario we will probably
only have one or two very large pools, so it does not matter all that
much to me.
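For anyone who does want to try it: as far as I understand, the flag Mark
quoted is a pool-creation default, so it would go into ceph.conf before the
pools are created, for example in the [global] section. Untested on my side,
so take this only as a sketch:

    [global]
        # off by default because of the backwards compatibility impact
        # mentioned above; only affects pools created after it is set
        osd pool default flag hashpspool = true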
[…]
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com