Hi Pedro,

I'm going to experiment with what you did at

https://github.com/plafl/notebooks/blob/master/replication.ipynb

and the latest python-crush, published today. A comparison function was
added that will help measure the data movement. I'm hoping we can release
an offline tool based on your solution.

Please let me know if I should wait before diving into this, in case you
have unpublished drafts or new ideas.

Cheers

On 03/09/2017 09:47 AM, Pedro López-Adeva wrote:
> Great, thanks for the clarifications.
> I also think that the most natural way is to keep just a set of
> weights in the CRUSH map and update them inside the algorithm.
>
> I'll keep working on it.
>
>
> 2017-03-08 0:06 GMT+01:00 Sage Weil <sage@xxxxxxxxxxxx>:
>> Hi Pedro,
>>
>> Thanks for taking a look at this! It's a frustrating problem and we
>> haven't made much headway.
>>
>> On Thu, 2 Mar 2017, Pedro López-Adeva wrote:
>>> Hi,
>>>
>>> I will have a look. BTW, I have not progressed that much but I have
>>> been thinking about it. In order to adapt the previous algorithm in
>>> the python notebook I need to replace the iteration over all
>>> possible device permutations with an iteration over all the possible
>>> selections that crush would make. That is the main thing I need to
>>> work on.
>>>
>>> The other thing is of course that weights change for each replica.
>>> That is, they cannot really be fixed in the crush map. So the
>>> algorithm inside libcrush, not only the weights in the map, needs to
>>> be changed. The weights in the crush map should then reflect, maybe,
>>> the desired usage frequencies. Or maybe each replica should have its
>>> own crush map, but then the information about the previous selection
>>> should be passed to the next replica placement run so it avoids
>>> selecting the same one again.
>>
>> My suspicion is that the best solution here (whatever that means!)
>> leaves the CRUSH weights intact with the desired distribution, and
>> then generates a set of derivative weights--probably one set for each
>> round/replica/rank.
>>
>> One nice property of this is that once the support is added to encode
>> multiple sets of weights, the algorithm used to generate them is free to
>> change and evolve independently. (In most cases any change in
>> CRUSH's mapping behavior is difficult to roll out because all
>> parties participating in the cluster have to support any new behavior
>> before it is enabled or used.)
>>
>>> I have a question also. Is there any significant difference between
>>> the device selection algorithm description in the paper and its final
>>> implementation?
>>
>> The main difference is that the "retry_bucket" behavior was found to be
>> a bad idea; any collision or failed()/overload() case triggers the
>> retry_descent.
>>
>> There are other changes, of course, but I don't think they'll impact any
>> solution we come up with here (or at least any solution can be suitably
>> adapted)!
>>
>> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

--
Loïc Dachary, Artisan Logiciel Libre