On Wed, 12 Apr 2017, Loic Dachary wrote:
> Hi Sage,
>
> Assuming the weight set is implemented in crush[1], it could be used
> with a weight_set matrix for each pool, encoded/decoded with the
> crushmap[2]. If the weight_set for a given pool does not exist the
> legacy behavior is preserved. Whenever a pool size, pgp_num changes, the
> weight_set should also be updated simultaneously.
>
> Currently CrushWrapper re-creates the workspace for each call to
> crush_do_rule which is expensive. It could instead be cached. The
> workspace is created with the replication count as an argument and could
> then be populated with the weight_set for the corresponding pool which
> is assumed to be in sync with the pool state (size, pgp_num etc).

Hmm. The current workspace is meant to be a private temporary space that
is not shared across threads (this allows the crush mapping functions to
be const and called on a const OSDMap/CrushWrapper without any locks).
We should keep its setup as fast/cheap as possible...

I think the weight set is also effectively const, which suggests that it
probably shouldn't be the same structure after all...

sage

>
> This would be a very loose implementation of the weight_set relying on
> external tools to be in sync. The next step would be to add commands on
> top of that to make it less error prone.
>
> Do you think we should come up with a robust use case / cli / syntax
> before implementing this first step ? Or does it look like a sound first
> step ?
>
> Cheers
>
> [1] http://libcrush.org/main/libcrush/merge_requests/30/diffs
> [2] https://github.com/ceph/ceph/pull/14486
> [3] https://github.com/ceph/ceph/pull/14486/files#diff-215e8db28af558a5e445f6267530c786R1179
>
> --
> Loïc Dachary, Artisan Logiciel Libre
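
To make the split discussed above concrete, here is a rough sketch of one way it might look: the per-pool weight set lives in the const map, while the workspace stays a private scratch object created per call. Everything in it is hypothetical and for illustration only (ToyCrushMap, WeightSet, Workspace, item_weight and toy_do_rule are not the libcrush or CrushWrapper API), and the selection loop is a stand-in that picks the heaviest unused item, not the real CRUSH algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical per-pool weight set: one row of item weights per replica
// position. It is never modified during a mapping, so it can be const.
struct WeightSet {
  std::vector<std::vector<uint32_t>> weights;  // [position][item]
};

// Hypothetical stand-in for the crush map plus optional per-pool weight sets.
struct ToyCrushMap {
  std::vector<uint32_t> legacy_weights;           // [item]
  std::map<int64_t, WeightSet> pool_weight_sets;  // keyed by pool id

  // Returns the pool's weight set, or nullptr to request legacy behavior.
  const WeightSet* find_weight_set(int64_t pool) const {
    auto it = pool_weight_sets.find(pool);
    return it == pool_weight_sets.end() ? nullptr : &it->second;
  }
};

// Private scratch space: created per call, never shared between threads.
struct Workspace {
  std::vector<bool> used;
  explicit Workspace(std::size_t num_items) : used(num_items, false) {}
};

// Use the weight set row for this position if one exists, else legacy weights.
uint32_t item_weight(const ToyCrushMap& m, const WeightSet* ws,
                     std::size_t position, std::size_t item) {
  assert(item < m.legacy_weights.size());
  if (ws == nullptr) return m.legacy_weights[item];
  assert(position < ws->weights.size());
  assert(item < ws->weights[position].size());
  return ws->weights[position][item];
}

// Toy placement (NOT CRUSH): for each position, take the heaviest unused item,
// lowest index winning ties. The map is only read, so the function can take
// it by const reference and needs no locks.
std::vector<int> toy_do_rule(const ToyCrushMap& m, int64_t pool,
                             std::size_t result_max) {
  const WeightSet* ws = m.find_weight_set(pool);
  Workspace work(m.legacy_weights.size());
  std::vector<int> out;
  for (std::size_t pos = 0; pos < result_max; ++pos) {
    int best = -1;
    uint32_t best_w = 0;
    for (std::size_t item = 0; item < m.legacy_weights.size(); ++item) {
      if (work.used[item]) continue;
      uint32_t w = item_weight(m, ws, pos, item);
      if (best < 0 || w > best_w) {
        best = static_cast<int>(item);
        best_w = w;
      }
    }
    if (best < 0) break;  // ran out of items
    work.used[static_cast<std::size_t>(best)] = true;
    out.push_back(best);
  }
  return out;
}
```

In this arrangement a pool without an entry in pool_weight_sets falls back to the legacy weights, and keeping the weight set in step with the pool's size and pgp_num remains the caller's responsibility, as in the proposal.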