On 01/05/2015 01:39 PM, ivan babrou wrote:
> On 5 January 2015 at 14:20, Christian Balzer <chibi@xxxxxxx> wrote:
>
>> On Mon, 5 Jan 2015 14:04:28 +0400 ivan babrou wrote:
>>
>>> Hi!
>>>
>>> I have a cluster with 106 osds and disk usage is varying from 166gb to
>>> 316gb. Disk usage is highly correlated to number of pg per osd (no
>>> surprise here). Is there a reason for ceph to allocate more pg on some
>>> nodes?
>>>
>> In essence what Wido said, you're a bit low on PGs.
>>
>> Also given your current utilization, pool 14 is totally oversized with 1024
>> PGs. You might want to re-create it with a smaller size and double pool 0
>> to 512 PGs and 10 to 4096.
>> I assume you did raise the PGPs as well when changing the PGs, right?
>>
>
> Yep, pg = pgp for all pools. Pool 14 is just for testing purposes, it might
> get large eventually.
>
> I followed your advice in doubling pools 0 and 10. It is rebalancing at 30%
> degraded now, but so far big osds become bigger and small become smaller:
> http://i.imgur.com/hJcX9Us.png. I hope that trend will change before
> rebalancing is complete.
>
>
>> And yeah, CEPH isn't particularly good at balancing stuff by itself, but
>> with sufficient PGs you ought to get the variance below/around 30%.
>>
>
> Is this going to change in the future releases?
>

Some things might change, but keep in mind that balancing happens based
on object names and not sizes. Sizes would be impossible since those are
dynamic.

Wido

>
>> Christian
>>
>>> The biggest osds are 30, 42 and 69 (300gb+ each) and the smallest are 87,
>>> 33 and 55 (170gb each). The biggest pool has 2048 pgs, pools with very
>>> little data have only 8 pgs. PG size in biggest pool is ~6gb (5.1..6.3
>>> actually).
>>>
>>> Lack of balanced disk usage prevents me from using all the disk space.
>>> When the biggest osd is full, cluster does not accept writes anymore.
>>>
>>> Here's gist with info about my cluster:
>>> https://gist.github.com/bobrik/fb8ad1d7c38de0ff35ae
>>>
>>
>>
>> --
>> Christian Balzer        Network/Systems Engineer
>> chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
>> http://www.gol.com/
>>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
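
The two points made in this thread (placement follows object names, not
sizes, and more PGs should narrow the spread between OSDs) can be
illustrated with a small simulation. This is only a rough sketch: the
SHA-256 hash, the plain modulo mapping, the rank-by-hash OSD selection and
the replica count of 3 are all assumptions standing in for Ceph's real
hashing and CRUSH placement, and the function names are hypothetical. Only
the figure of 106 OSDs comes from the original message.

```python
import hashlib
from collections import Counter


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash of a string (a stand-in, not Ceph's hash)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def pg_for_object(name: str, num_pgs: int) -> int:
    """Pick a PG from the object name only; the object's size plays no role."""
    return stable_hash(name) % num_pgs


def pgs_per_osd(num_pgs: int, num_osds: int, replicas: int = 3) -> Counter:
    """Place each PG on `replicas` distinct pseudo-random OSDs and count them.

    Assumption: OSDs are ranked by a per-(pg, osd) hash and the first
    `replicas` are chosen. Real CRUSH also honours weights and failure domains.
    """
    counts = Counter({osd: 0 for osd in range(num_osds)})
    for pg in range(num_pgs):
        ranked = sorted(range(num_osds),
                        key=lambda osd: stable_hash(f"{pg}:{osd}"))
        for osd in ranked[:replicas]:
            counts[osd] += 1
    return counts


def spread(counts: Counter) -> float:
    """Relative spread of PGs per OSD: (max - min) / mean."""
    values = list(counts.values())
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean


if __name__ == "__main__":
    # 106 OSDs as in the original post; the PG counts are illustrative.
    for num_pgs in (256, 2048):
        c = pgs_per_osd(num_pgs, 106)
        print(f"{num_pgs} PGs: min={min(c.values())} "
              f"max={max(c.values())} spread={spread(c):.2f}")
```

Since every PG holds roughly the same amount of data in a large pool, the
number of PGs on an OSD is a reasonable proxy for its disk usage, which is
why the relative spread printed above is the number to watch. With few PGs
per OSD the random placement is lumpy; with more PGs it evens out, but it
never becomes perfectly flat.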