Re: Ceph balancer "Error EAGAIN: compat weight-set not available"

ceph balancer status
ceph config-key dump | grep balancer
ceph osd dump | grep min_compat_client
ceph osd crush dump | grep straw
ceph osd crush dump | grep profile
ceph features

You didn't mention it, but based on your error and my experience over the last week getting the balancer working, you're trying to use crush-compat mode.  Running all of the commands above should give you the information you need to get the balancer working.  With the first two, make sure your mode is set properly and double-check any other balancer settings you are using.  Everything else stems from the requirement that your buckets be straw2 rather than straw for the balancer to work.  I expect you'll find that your cluster's compatibility requirements and crush tunables profile are older than hammer and that your buckets use the straw algorithm instead of straw2.
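As a rough way to check the straw vs. straw2 point, the sketch below counts buckets still on legacy straw. The function name `count_straw_buckets` is hypothetical (not a ceph command), and it assumes the pretty-printed JSON of `ceph osd crush dump` puts each bucket's `"alg"` field on its own line.

```shell
# Hypothetical helper (not part of ceph): reads `ceph osd crush dump`
# output on stdin and prints how many buckets still use legacy straw.
count_straw_buckets() {
    # The closing quote in the pattern keeps "straw2" from matching.
    # grep -c exits non-zero on zero matches, so mask that with `|| true`.
    grep -c '"alg": *"straw"' || true
}
```

Typical use would be `ceph osd crush dump | count_straw_buckets`; anything other than 0 means there are still straw buckets to convert.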

Running these commands [1] will convert your buckets to straw2 and set your minimum required client and tunables profile to hammer, the ceph release that introduced straw2.  Before running them, make sure that the output of `ceph features` does not show any firefly clients connected to your cluster.  If it does, the likely cause is outdated kernels, or clients installed from your distribution's standard repos rather than the upstream ceph repo.  If you do have any firefly or older clients connected to your cluster, update those clients before running the commands.
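The client check can be scripted roughly as follows. This is only a sketch: `has_pre_hammer_clients` is a hypothetical name, the list of pre-hammer release names is my assumption about what `ceph features` may report, and the pattern matches a `"release"` field anywhere in the output, not just in the client section.

```shell
# Hypothetical helper (not part of ceph): reads `ceph features` output
# on stdin and succeeds if any entry reports a release older than hammer.
has_pre_hammer_clients() {
    # Assumed list of pre-hammer release names; adjust as needed.
    grep -Eiq '"release": *"(argonaut|bobtail|cuttlefish|dumpling|emperor|firefly|giant)"'
}
```

For example, `ceph features | has_pre_hammer_clients && echo "update old clients first"` prints the warning only when such a release shows up.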

There will be some data movement, but I didn't see more than ~5% on any of the 8 clusters I ran them on.  Expect more if your OSD drives are not a uniform size: a mix of, say, 2TB and 8TB disks across your cluster will probably cause more data movement than I saw, but it should still be within reason.  This data movement happens because straw2 handles that situation better than straw did, and it will let your cluster balance itself better even without the balancer module.

If you don't have any hammer clients either, then go ahead and set both the min-compat-client and the crush tunables to jewel.  Setting them to jewel will cause a bit more data movement, but again for good reasons.
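To review the exact commands for either release before touching the cluster, a dry-run sketch like this one just prints them. `print_compat_commands` is a hypothetical helper; it mirrors the hammer commands in [1] with the release name swapped in, and it keeps the straw2 conversion on the assumption that it is still needed for the jewel case.

```shell
# Hypothetical dry-run helper (not part of ceph): prints the commands
# for the given release name (e.g. hammer or jewel) without running them.
print_compat_commands() {
    release="$1"
    echo "ceph osd set-require-min-compat-client ${release}"
    echo "ceph osd crush set-all-straw-buckets-to-straw2"
    echo "ceph osd crush tunables ${release}"
}
```

Running `print_compat_commands jewel` lists the three commands so you can check them against `ceph features` before executing anything.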

The tl;dr of your error is that your cluster has been running since at least hammer, which started with older default settings than the balancer module requires.  As you upgraded your cluster, you left your crush tunables alone through every new version, so it was never allowed to use the newer backend features.  To learn more about the changes to the crush tunables, see the ceph documentation [2].

[1]
ceph osd set-require-min-compat-client hammer
ceph osd crush set-all-straw-buckets-to-straw2
ceph osd crush tunables hammer

[2] http://docs.ceph.com/docs/master/rados/operations/crush-map/

On Tue, Sep 11, 2018 at 6:24 AM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:

I am new to using the balancer. I think this should generate a plan,
no? I do not get what this error is about.


[@c01 ~]# ceph balancer optimize balancer-test.plan
Error EAGAIN: compat weight-set not available
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com