compat weight reset

Hi all,

I am trying to find a simple way to better distribute my data as I wrap up my Nautilus upgrades.

I'm currently rebuilding some OSDs with a bigger block.db to prevent BlueFS spillover (where it isn't difficult to do so), and I'm once again struggling with unbalanced distribution despite having used the upmap balancer.

I recently discovered that running the balancer module in crush-compat mode before switching to upmap mode left behind some lingering compat weight sets, which I believe may account for my less-than-stellar distribution: I now have two or three weightings fighting each other (the upmap balancer, the compat weight set, and reweight). Below is a snippet showing the compat weights differing from the CRUSH weights, followed by how I'm checking the other two layers.

$ ceph osd crush tree
ID  CLASS WEIGHT    (compat)  TYPE NAME
-55        43.70700  42.70894         chassis node2425
 -2        21.85399  20.90097             host node24
  0   hdd   7.28499   7.75699                 osd.0
  8   hdd   7.28499   6.85500                 osd.8
 16   hdd   7.28499   6.28899                 osd.16
 -3        21.85399  21.80797             host node25
  1   hdd   7.28499   7.32899                 osd.1
  9   hdd   7.28499   7.24399                 osd.9
 17   hdd   7.28499   7.23499                 osd.17
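
For completeness, the other two layers should be visible as pg_upmap_items entries in the osdmap and in the REWEIGHT column of ceph osd df (I'm guessing at the exact grep here):

$ ceph osd dump | grep pg_upmap
$ ceph osd df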

So my main question is: how do I [re]set the compat values to match the CRUSH weights, so that the upmap balancer can balance the data more precisely?

It looks like I may have two options:
ceph osd crush weight-set reweight-compat {name} {weight}
or
ceph osd crush weight-set rm-compat

I assume the first is for managing a single device/host/chassis/etc., and the latter would nuke all compat values across the board?
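
For example, if I understand the first form correctly, syncing osd.0 back to its CRUSH weight from the tree above would look something like this (I'm guessing at the exact name argument):

$ ceph osd crush weight-set reweight-compat osd.0 7.28499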

And while looking at this, I started poking at my tunables, and I have no clue how to interpret the values, nor what I believe they should be.

$ ceph osd crush show-tunables
{
    "choose_local_tries": 0,
    "choose_local_fallback_tries": 0,
    "choose_total_tries": 50,
    "chooseleaf_descend_once": 1,
    "chooseleaf_vary_r": 1,
    "chooseleaf_stable": 0,
    "straw_calc_version": 1,
    "allowed_bucket_algs": 22,
    "profile": "firefly",
    "optimal_tunables": 0,
    "legacy_tunables": 0,
    "minimum_required_version": "hammer",
    "require_feature_tunables": 1,
    "require_feature_tunables2": 1,
    "has_v2_rules": 0,
    "require_feature_tunables3": 1,
    "has_v3_rules": 0,
    "has_v4_buckets": 1,
    "require_feature_tunables5": 0,
    "has_v5_rules": 0
}

This is a Jewel -> Luminous -> Mimic -> Nautilus cluster, and pretty much all the clients support Jewel/Luminous+ feature sets (the jewel clients are kernel CephFS clients, even though they are on recent (4.15-4.18) kernels).
$ ceph features | grep release
            "release": "luminous",
            "release": "luminous",
            "release": "luminous",
            "release": "jewel",
            "release": "jewel",
            "release": "luminous",
            "release": "luminous",
            "release": "luminous",
            "release": "luminous",

I feel like I should be running optimal tunables, but from the output above it looks like I'm on an older profile (it reports firefly)?
I'm not sure how much of a difference that makes in practice, or whether changing it would trigger a bunch of data movement.
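
For reference, I assume the switch, if it's the right move, would simply be:

$ ceph osd crush tunables optimal

but I'd rather understand the impact (and the expected data movement) before running anything.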

Hopefully someone can steer me in the right direction here, so that, ideally, I can trigger a single large data movement and return to a happy, balanced cluster once again.

Thanks,

Reed
