Hi Spandan,

On Thu, 3 Aug 2017, Spandan Kumar Sahu wrote:
> Sage
>
> I think it would be a good idea to include a command in the balancer
> module itself, that would optimize the crushmap using the
> python-crush, and set the optimized crushmap.
>
> As far as I believe, uneven distributions can be majorly attributed to
> the factors:
> * using an unoptimized crushmap
> * unevenness that occurs due to the (pseudo) random nature of CRUSH
> * objects having different sizes.
>
> If we set an optimized crushmap, at the very initial stages, we have
> to move very less data in the due course, in order to maintain a
> proper distribution. Hence the necessity of including it in the
> balancer module. Please give a look at the PR[1], I sent in this
> regard, and let me know if I am moving in the right direction.

There are a few problems with using python-crush, the main one being
that the dependencies are problematic: it's built from a forked repo and
is not packaged properly (has to be installed with pip). It also may
not match the CRUSH version being used by the cluster.

The larger issue though is that it doesn't address all of the other
problems I highlighted in my earlier email. The main thing it *does* do
properly is that it does the optimization based on a model; this was the
main problem with the old reweight-by-utilization. The new framework in
balancer.py has all the pieces now to let you do that.

I think the main value in the python-crush optimize code is that it
demonstrably works, which means we know that the cost/score function
being used and the descent method work together. I think the best path
forward is to look at the core of what those two pieces are doing and
port it into the balancer environment. Most recently I've been working
on the 'eval' method that will generate a score for a given
distribution, but I'm working from first principles (just calculating
the layout, its deviation from the target, the standard deviation, etc.)
but I'm not sure what Loic's optimizer was doing. Also, my first attempt
at a descent function to correct weights was pretty broken, and I know a
lot of experimentation went into Loic's method.

Do you see any problems with that approach, or things that the balancer
framework does not cover?

Thanks!
sage
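
For reference, here is a minimal sketch of the kind of first-principles
score mentioned above: compute each OSD's target share from its weight,
take the deviation of the actual layout from that target, and summarize
with a standard deviation. This is only an illustration under stated
assumptions, not the balancer module's actual 'eval' code and not what
python-crush does; the function name score_distribution and its
arguments are hypothetical.

```python
import math


def score_distribution(weights, counts):
    """Score an actual layout against weight-proportional targets.

    weights: dict of OSD id -> weight (hypothetical input; the target
             share of an OSD is assumed to be weight / total weight)
    counts:  dict of OSD id -> amount actually placed there (PGs,
             objects, or bytes; missing OSDs count as 0)

    Returns (deviations, stddev), where deviations maps each OSD id to
    actual minus target, and stddev is the population standard
    deviation of those deviations around zero. Lower is better.
    """
    total_weight = sum(weights.values())
    total_count = sum(counts.get(osd, 0) for osd in weights)
    if total_weight <= 0 or total_count <= 0:
        raise ValueError("need positive total weight and total count")

    deviations = {}
    for osd, weight in weights.items():
        # Target is this OSD's proportional share of everything placed.
        target = total_count * weight / total_weight
        deviations[osd] = counts.get(osd, 0) - target

    stddev = math.sqrt(
        sum(d * d for d in deviations.values()) / len(deviations)
    )
    return deviations, stddev


if __name__ == "__main__":
    # Three OSDs; osd 2 has twice the weight, so its target is half.
    devs, sd = score_distribution(
        {0: 1.0, 1: 1.0, 2: 2.0},
        {0: 30, 1: 20, 2: 50},
    )
    print(devs, sd)
```

A descent step could then compare this score before and after a
candidate weight change and keep the change only if the score drops;
how Loic's optimizer actually does that is not covered here.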