Thanks David for this detailed explanation, it is spot on! I missed your email but got it from here:
https://www.spinics.net/lists/ceph-users/msg47780.html

    ceph balancer status
    ceph config-key dump | grep balancer
    ceph osd dump | grep min_compat_client
    ceph osd crush dump | grep straw
    ceph osd crush dump | grep profile
    ceph features

You didn't mention it, but based on your error and my experience over the last week getting the balancer working, you're trying to use crush-compat. Running all of those commands should give you the information you need to fix everything up for the balancer to work. With the first two, make sure that your mode is set properly and double-check any other settings you're going for with the balancer. Everything else stems from the requirement that your buckets be straw2 instead of straw for the balancer to work.

I'm sure you'll notice that your cluster has older compatibility requirements and an older crush profile than hammer, and that your buckets are using the straw algorithm instead of straw2. Running the commands in [1] will fix up your cluster so that it uses straw2 and has its minimum required client and crush profile set to hammer, which is the ceph release that introduced straw2.

Before running these commands, make sure that the output of `ceph features` does not show any firefly clients connected to your cluster. If you do have any, it is likely due to outdated kernels, or to clients installed without the upstream ceph repo that just use the version of ceph in the canonical repos (or similar) for your distribution. If you do happen to have any firefly, or older, clients connected to your cluster, you need to update those clients before running the commands.

There will be some data movement, but I didn't see more than ~5% data movement on any of the 8 clusters I ran them on.

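The pre-checks described above (no firefly clients, and no buckets still on straw) could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and it assumes the two commands print the release name and the bucket algorithm as plain text that grep can match, in the same way the `grep straw` command above does. It only looks for the name "firefly", so clients older than firefly would still need to be checked by eye.

```shell
#!/bin/sh
# Hypothetical helpers; neither is part of ceph itself.

# Reads the output of `ceph osd crush dump` on stdin and prints how many
# buckets still use the legacy "straw" algorithm. The closing quote in the
# pattern keeps "straw2" from being counted.
count_legacy_straw() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c '"alg": *"straw"' || true
}

# Reads the output of `ceph features` on stdin and succeeds (exit 0) if the
# word "firefly" appears anywhere in it.
has_firefly_clients() {
    grep -qi 'firefly'
}
```

Typical use would be `ceph osd crush dump | count_legacy_straw` and `ceph features | has_firefly_clients && echo "update those clients first"`. A non-zero straw count together with no firefly match is the situation in which the commands in [1] apply.
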
The data movement will be higher if you do not have a standard size of OSD drive in your clusters. Having, say, some 2TB disks and some 8TB disks across your cluster will probably cause more data movement than I saw, but it should still be within reason. This data movement happens because straw2 can handle that situation better than straw did, and it will allow your cluster to better balance itself even without the balancer module.

If you don't even have any hammer clients, then go ahead and set the min-compat-client to jewel as well as the crush tunables to jewel. Setting them to jewel will cause a bit more data movement, but again for good reasons.

The tl;dr of your error is that your cluster has been running since at least hammer, which started with older default settings than are required by the balancer module. By leaving your crush tunables alone during all of the upgrades to new versions, you didn't allow the cluster to utilize new features in the backend. To learn more about the changes to the crush tunables, you can check out the ceph wiki [2].

[1]
    ceph osd set-require-min-compat-client hammer
    ceph osd crush set-all-straw-buckets-to-straw2
    ceph osd crush tunables hammer

[2] http://docs.ceph.com/docs/master/rados/operations/crush-map/

-----Original Message-----
From: Marc Roos
Sent: Tuesday, 11 September 2018 12:24
To: ceph-users
Subject: Ceph balancer "Error EAGAIN: compat weight-set not available"

I am new to using the balancer. I think this should generate a plan, shouldn't it? I do not get what this error is about.

[@c01 ~]# ceph balancer optimize balancer-test.plan
Error EAGAIN: compat weight-set not available

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com