Hi all,
Was wondering if someone could enlighten me...
I've recently been upgrading a small test cluster's tunables from bobtail to
firefly, prior to doing the same on an old production cluster.
OS is RHEL 7.4. In test the kernel is 3.10.0-693.el7.x86_64 everywhere; in prod
the admin box is 3.10.0-693.el7.x86_64 and all mons and OSDs are
4.4.76-1.el7.elrepo.x86_64.
Ceph version is 0.94.10-0.el7 on both; both were installed with ceph-deploy 5.37-0.
The production system was originally Red Hat Ceph but was later converted to
the community edition (all before my time); it has 189 OSDs
on 21 hosts with 5 mons.
In test I changed chooseleaf_vary_r incrementally, starting from 0 and stepping
through 5, 4, 3, 2 and finally 1; each change triggered a larger rebalance than the last:
|-------------------+----------+-----------|
| chooseleaf_vary_r | degraded | misplaced |
|-------------------+----------+-----------|
|                 5 |    0%    |   0.187%  |
|                 4 |  1.913%  |   2.918%  |
|                 3 |  6.965%  |  18.904%  |
|                 2 | 14.303%  |  32.380%  |
|                 1 | 20.657%  |  48.310%  |
|-------------------+----------+-----------|
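For what it's worth, the movement each step would cause can also be estimated
offline, before applying anything, by comparing crushtool's test mappings for
the two maps. The rule number, replica count and input range below are just
example values, not taken from my clusters; adjust them to match your pools:

```shell
# Estimate data movement between two CRUSH maps without touching the cluster.
# --rule/--num-rep/--min-x/--max-x are illustrative; match them to your pool's
# crush rule, replica size, and a reasonably large sample of inputs.
crushtool -i crushmap-bobtail --test --show-mappings --rule 0 --num-rep 3 \
    --min-x 0 --max-x 10000 > mappings-before.txt
crushtool -i crushmap-firefly --test --show-mappings --rule 0 --num-rep 3 \
    --min-x 0 --max-x 10000 > mappings-after.txt
# Count how many sampled inputs map to a different OSD set after the change.
diff mappings-before.txt mappings-after.txt | grep -c '^>'
```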
As the change to 5 was so minimal, we decided to jump straight from 0 to 4 in prod.
I performed the exact same steps on the production cluster and set
chooseleaf_vary_r to 4; however, nothing happened, no rebalancing at all.
The update was done with:
ceph osd getcrushmap -o crushmap-bobtail
crushtool -i crushmap-bobtail --set-chooseleaf-vary-r 4 -o crushmap-firefly
ceph osd setcrushmap -i crushmap-firefly
I also decompiled and diff'ed the maps along the way to confirm the changes; I'm
relatively new to Ceph, so better safe than sorry :-)
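(The decompile/diff itself was nothing fancy, along these lines; the only
difference between the two text maps should be the tunable line that changed:)

```shell
# Decompile both binary CRUSH maps to text and compare them;
# the only expected difference is the chooseleaf_vary_r tunable line.
crushtool -d crushmap-bobtail -o crushmap-bobtail.txt
crushtool -d crushmap-firefly -o crushmap-firefly.txt
diff -u crushmap-bobtail.txt crushmap-firefly.txt
```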
The tunables in prod prior to any change were:
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 0,
"straw_calc_version": 0,
"allowed_bucket_algs": 22,
"profile": "bobtail",
"optimal_tunables": 0,
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 0,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}
The tunables in prod now show:
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 4,
"straw_calc_version": 0,
"allowed_bucket_algs": 22,
"profile": "unknown",
"optimal_tunables": 0,
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 1,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}
For reference, in test they are now:
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 1,
"straw_calc_version": 0,
"allowed_bucket_algs": 22,
"profile": "firefly",
"optimal_tunables": 1,
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 1,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}
I'm worried that no rebalancing occurred; does anyone have any idea why?
The goal here is to get ready to upgrade to jewel; does anyone see any issues
with the above?
Thanks in advance,
Adrian.
--
Adrian : aussieade@xxxxxxxxx
If violence doesn't solve your problem, you're not using enough of it.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com