Re: no rebalance when changing chooseleaf_vary_r tunable

Hi Gregory,

We were planning to go to chooseleaf_vary_r=4 so we could upgrade to jewel now and schedule the change to 1 for a more suitable time, since we were expecting a large rebalance of objects (I should have mentioned that).

Good to know there's a valid reason we didn't see any rebalance, though - it had me worried, so thanks for the info.

Regards,
Adrian.

On Thu, Apr 5, 2018 at 9:16 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
http://docs.ceph.com/docs/master/rados/operations/crush-map/#firefly-crush-tunables3

"The optimal value (in terms of computational cost and correctness) is 1."

I think you're just finding that the production cluster, with a much
larger number of buckets, never ran into the situation
chooseleaf_vary_r is meant to resolve, so turning it on didn't change
any mappings.
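
If you want to double-check, one way (a sketch - the rule number,
replica count, and filenames here are assumptions, so adjust them to
match your pools) is to have crushtool compute sample mappings from
both maps and diff the results; identical output means no PGs would
move:

    # --min-x/--max-x choose the sample inputs; --rule/--num-rep should
    # match your pool's CRUSH rule and replication size
    crushtool -i crushmap-bobtail --test --show-mappings \
        --rule 0 --num-rep 3 --min-x 0 --max-x 1023 > before.txt
    crushtool -i crushmap-firefly --test --show-mappings \
        --rule 0 --num-rep 3 --min-x 0 --max-x 1023 > after.txt
    diff before.txt after.txt   # no output = no mappings changed
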
-Greg

On Wed, Apr 4, 2018 at 3:49 PM, Adrian <aussieade@xxxxxxxxx> wrote:
> Hi all,
>
> Was wondering if someone could enlighten me...
>
> I've recently been upgrading a small test cluster's tunables from bobtail to
> firefly, prior to doing the same with an old production cluster.
>
> OS is RHEL 7.4. In test the kernel is 3.10.0-693.el7.x86_64 everywhere;
> in prod the admin box is 3.10.0-693.el7.x86_64 and all mons and osds are
> 4.4.76-1.el7.elrepo.x86_64.
>
> Ceph version is 0.94.10-0.el7; both clusters were installed with ceph-deploy 5.37-0.
>
> The production system was originally Red Hat Ceph but was later changed to
> the community edition (all prior to my time here); it has 189 osds
> on 21 hosts with 5 mons.
>
> In test I changed chooseleaf_vary_r incrementally from 0: first to 5, then
> step by step down to 1. Each change saw a larger rebalance than the last:
>
>    |-------------------+----------+-----------|
>    | chooseleaf_vary_r | degraded | misplaced |
>    |-------------------+----------+-----------|
>    |                 5 |       0% |    0.187% |
>    |                 4 |   1.913% |    2.918% |
>    |                 3 |   6.965% |   18.904% |
>    |                 2 |  14.303% |   32.380% |
>    |                 1 |  20.657% |   48.310% |
>    |-------------------+----------+-----------|
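>
> (These figures can be watched during each change via the recovery
> summary - a sketch, assuming that's how they were collected:
>
>    ceph -s | grep -E 'degraded|misplaced'
> )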
>
> As the change to 5 was so minimal, we decided to jump straight from 0 to 4 in prod.
>
> I performed the exact same steps on the production cluster and changed
> chooseleaf_vary_r to 4; however, nothing happened - no rebalancing at all.
>
> The update was done with:
>
>    ceph osd getcrushmap -o crushmap-bobtail
>    crushtool -i crushmap-bobtail --set-chooseleaf-vary-r 4 -o crushmap-firefly
>    ceph osd setcrushmap -i crushmap-firefly
>
> I also decompiled and diff'ed the maps on occasion to confirm the changes;
> I'm relatively new to ceph, so better safe than sorry :-)
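>
> That step was along these lines (a sketch - the .txt filenames are just
> my choice):
>
>    crushtool -d crushmap-bobtail -o crushmap-bobtail.txt
>    crushtool -d crushmap-firefly -o crushmap-firefly.txt
>    # only the "tunable chooseleaf_vary_r" line should differ
>    diff crushmap-bobtail.txt crushmap-firefly.txt
>    # and the live tunables can be confirmed with (output as below):
>    ceph osd crush show-tunables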
>
>
> Tunables in prod prior to any change were:
> {
>     "choose_local_tries": 0,
>     "choose_local_fallback_tries": 0,
>     "choose_total_tries": 50,
>     "chooseleaf_descend_once": 1,
>     "chooseleaf_vary_r": 0,
>     "straw_calc_version": 0,
>     "allowed_bucket_algs": 22,
>     "profile": "bobtail",
>     "optimal_tunables": 0,
>     "legacy_tunables": 0,
>     "require_feature_tunables": 1,
>     "require_feature_tunables2": 1,
>     "require_feature_tunables3": 0,
>     "has_v2_rules": 0,
>     "has_v3_rules": 0,
>     "has_v4_buckets": 0
> }
>
> Tunables in prod now show:
> {
>     "choose_local_tries": 0,
>     "choose_local_fallback_tries": 0,
>     "choose_total_tries": 50,
>     "chooseleaf_descend_once": 1,
>     "chooseleaf_vary_r": 4,
>     "straw_calc_version": 0,
>     "allowed_bucket_algs": 22,
>     "profile": "unknown",
>     "optimal_tunables": 0,
>     "legacy_tunables": 0,
>     "require_feature_tunables": 1,
>     "require_feature_tunables2": 1,
>     "require_feature_tunables3": 1,
>     "has_v2_rules": 0,
>     "has_v3_rules": 0,
>     "has_v4_buckets": 0
> }
>
> For reference, in test they are now:
> {
>     "choose_local_tries": 0,
>     "choose_local_fallback_tries": 0,
>     "choose_total_tries": 50,
>     "chooseleaf_descend_once": 1,
>     "chooseleaf_vary_r": 1,
>     "straw_calc_version": 0,
>     "allowed_bucket_algs": 22,
>     "profile": "firefly",
>     "optimal_tunables": 1,
>     "legacy_tunables": 0,
>     "require_feature_tunables": 1,
>     "require_feature_tunables2": 1,
>     "require_feature_tunables3": 1,
>     "has_v2_rules": 0,
>     "has_v3_rules": 0,
>     "has_v4_buckets": 0
> }
>
> I'm worried that no rebalancing occurred - does anyone have any idea why?
>
> The goal here is to get ready to upgrade to jewel - does anyone see any
> issues with the above info?
>
> Thanks in advance,
> Adrian.
>
> --
> ---
> Adrian : aussieade@xxxxxxxxx
> If violence doesn't solve your problem, you're not using enough of it.
>



--
---
Adrian : aussieade@xxxxxxxxx
If violence doesn't solve your problem, you're not using enough of it.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
