Ah, the CRUSH tunables basically don't impact placement at all unless
CRUSH fails to complete a placement for some reason. What you're seeing
here is the result of a pseudo-random imbalance. Increasing the pg_num
and pgp_num counts on your data pool should resolve it (though at the
cost of some data movement, which you'll need to be prepared for).
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Wed, Nov 13, 2013 at 1:00 PM, Oliver Schulz <oschulz@xxxxxxxxxx> wrote:
> Dear Greg,
>
>> I believe 3.8 is after CRUSH_TUNABLES v1 was implemented in the
>> kernel, so it shouldn't hurt you to turn them on if you need them.
>> (And the crush tool is just out of date; we should update that text!)
>> However, if you aren't having distribution issues on your cluster I
>> wouldn't bother [...]
>
> That's just the thing:
>
> Our cluster is now about 75% full, and "ceph status" shows
> "HEALTH_WARN 1 near full osd(s)".
>
> The used space on the (identical) OSD partitions varies between the
> extremes of 64% and 86% - I would have expected CRUSH to produce a
> more balanced data placement. Is this to be expected?
>
> Our cluster structure: 6 nodes (in 3 crushmap "racks" with 2 nodes
> each) and 6x 3 TB disks per node, with one OSD per disk - so 36 OSDs
> and 108 TB in total. Nodes and drives are all identical and were taken
> into operation at the same time; the cluster hasn't changed since
> installation. The disks have one big OSD data partition only; system
> and OSD journals are on separate SSDs. Each OSD has a weight of 3.0
> in the crushmap.
>
> We have the three standard pools (data, metadata, rbd), set to 3x
> replication, plus a (so far unused) pool "cheapdata" with 2x
> replication. Each pool has 2368 PGs. Almost all of the data is in the
> data pool, some is in rbd, and a little is in metadata (cheapdata
> being empty for now).
>
> When I look at the used space on the /var/lib/ceph/osd/ceph-XX
> partitions on the nodes, I get the following:
>
> Node 1: 75%, 76%, 80%, 86%, 67%, 73%
> Node 2: 71%, 75%, 76%, 82%, 74%, 76%
> Node 3: 71%, 76%, 75%, 70%, 75%, 70%
> Node 4: 76%, 83%, 66%, 68%, 72%, 78%
> Node 5: 80%, 70%, 78%, 71%, 72%, 77%
> Node 6: 81%, 74%, 69%, 67%, 78%, 64%
>
> Is this normal, or might there be an issue with our configuration
> (nothing special in it, though)? Might the tuning options help?
>
> I'd be very grateful for any advice! :-)
>
>
> Cheers,
>
> Oliver
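
For concreteness, Greg's suggestion of raising pg_num and pgp_num on the
data pool maps to roughly the commands below. This is only a sketch: the
pool name "data" comes from the thread, but the target value of 4096 is
an assumed example (the next power of two above the current 2368), not a
figure given by either poster, and depending on the Ceph release large
increases may need to be applied in smaller steps.

    # Split the data pool's placement groups (assumed target: 4096).
    ceph osd pool set data pg_num 4096
    # Raise pgp_num to match; this is what lets CRUSH actually place the
    # new PGs and so triggers the rebalancing (and data movement).
    ceph osd pool set data pgp_num 4096
    # Watch recovery/backfill progress and cluster health while it moves.
    ceph -s

Raising pg_num alone only splits existing PGs in place; the new PGs stay
on their parents' OSDs until pgp_num is increased as well, so both values
need to go up before the 64-86% spread Oliver reports can even out.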