Hi all,
Was wondering if someone could enlighten me...
I've recently been upgrading a small test cluster's tunables from bobtail to
firefly, prior to doing the same on an old production cluster.
OS is RHEL 7.4. In test the kernel is 3.10.0-693.el7.x86_64 everywhere; in prod
the admin box is 3.10.0-693.el7.x86_64 and all mons and OSDs are
4.4.76-1.el7.elrepo.x86_64.
Ceph version is 0.94.10-0.el7 on both; both were installed with ceph-deploy 5.37-0.
The production system was originally Red Hat Ceph but was later converted to
the community edition (all before my time); it has 189 OSDs
on 21 hosts with 5 mons.
In test I changed chooseleaf_vary_r incrementally, starting from 0 and stepping
through 5, 4, 3, 2 and finally 1; each change triggered a larger rebalance than the last:
|-------------------+----------+-----------|
| chooseleaf_vary_r | degraded | misplaced |
|-------------------+----------+-----------|
|                 5 |    0%    |   0.187%  |
|                 4 |  1.913%  |   2.918%  |
|                 3 |  6.965%  |  18.904%  |
|                 2 | 14.303%  |  32.380%  |
|                 1 | 20.657%  |  48.310%  |
|-------------------+----------+-----------|
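For what it's worth, the movement each step would cause can also be estimated
offline, before applying anything, by comparing crushtool's test mappings for
the two maps. The rule number, replica count and input range below are just
example values, not taken from my clusters; adjust them to match your pools:

```shell
# Estimate data movement between two CRUSH maps without touching the cluster.
# --rule/--num-rep/--min-x/--max-x are illustrative; match them to your pool's
# crush rule, replica size, and a reasonably large sample of inputs.
crushtool -i crushmap-bobtail --test --show-mappings --rule 0 --num-rep 3 \
    --min-x 0 --max-x 10000 > mappings-before.txt
crushtool -i crushmap-firefly --test --show-mappings --rule 0 --num-rep 3 \
    --min-x 0 --max-x 10000 > mappings-after.txt
# Count how many sampled inputs map to a different OSD set after the change.
diff mappings-before.txt mappings-after.txt | grep -c '^>'
```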
As the change to 5 was so minimal, we decided to jump straight from 0 to 4 in prod.
I performed the exact same steps on the production cluster and set
chooseleaf_vary_r to 4; however, nothing happened, no rebalancing at all.
The update was done with:
ceph osd getcrushmap -o crushmap-bobtail
crushtool -i crushmap-bobtail --set-chooseleaf-vary-r 4 -o crushmap-firefly
ceph osd setcrushmap -i crushmap-firefly
I also decompiled and diff'ed the maps along the way to confirm the changes; I'm
relatively new to Ceph, so better safe than sorry :-)
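(The decompile/diff itself was nothing fancy, along these lines; the only
difference between the two text maps should be the tunable line that changed:)

```shell
# Decompile both binary CRUSH maps to text and compare them;
# the only expected difference is the chooseleaf_vary_r tunable line.
crushtool -d crushmap-bobtail -o crushmap-bobtail.txt
crushtool -d crushmap-firefly -o crushmap-firefly.txt
diff -u crushmap-bobtail.txt crushmap-firefly.txt
```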
The tunables in prod prior to any change were:
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 0,
"straw_calc_version": 0,
"allowed_bucket_algs": 22,
"profile": "bobtail",
"optimal_tunables": 0,
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 0,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}
The tunables in prod now show:
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 4,
"straw_calc_version": 0,
"allowed_bucket_algs": 22,
"profile": "unknown",
"optimal_tunables": 0,
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 1,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}
For reference, in test they are now:
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 1,
"straw_calc_version": 0,
"allowed_bucket_algs": 22,
"profile": "firefly",
"optimal_tunables": 1,
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 1,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}
I'm worried that no rebalancing occurred; does anyone have any idea why?
The goal here is to get ready to upgrade to jewel; does anyone see any issues
with the above?
Thanks in advance,
Adrian.
--
Adrian : aussieade@xxxxxxxxx
If violence doesn't solve your problem, you're not using enough of it.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com