On 05/12/2012 05:51 PM, Sage Weil wrote:
Hey Jim, These both look like reasonable changes. And it's great to see they fix behavior for you. I'm not going to merge them yet, though. We're just kicking off a CRUSH refresh project next week that will include some testing framework to more thoroughly validate the quality of the output, and also take a more holistic look at everything the algorithm is doing and see what we can improve.
I really didn't expect you'd merge them right away -- I knew you had this CRUSH effort coming, so my goal was to get these to you so you could evaluate them as part of that push.
Most likely these changes will be included, but revving the mapping algorithm is going to be tricky for forward/backward compatibility, and we'd like to get it all in at once. (And/or come up with a better way to deal with mismatched versions...)
Yep, I completely ignored the version compatibility issue - I didn't have any clever ideas on how to handle it. Also, FWIW I'm running with the patch below on top of the previous two - I think it helps avoid giving up too early in clusters where many OSDs have gone down/out, but I haven't done enough testing on it yet to quantify.
Thanks!
Hey, thanks for taking a look!
sage
-- Jim

---
ceph: retry CRUSH map descents from root a little longer before falling back to exhaustive search

The exhaustive search isn't as exhaustive as we'd like if the CRUSH map
is several levels deep, so try a few more times to find an "in" device
during "spread re-replication around" mode.  This makes it less likely
we'll give up when the storage cluster has many failed devices.

Signed-off-by: Jim Schutt <jaschut@xxxxxxxxxx>
---
 src/crush/mapper.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/crush/mapper.c b/src/crush/mapper.c
index 698da55..39e9c5d 100644
--- a/src/crush/mapper.c
+++ b/src/crush/mapper.c
@@ -306,7 +306,7 @@ static int crush_choose(const struct crush_map *map,
 	int item = 0;
 	int itemtype;
 	int collide, reject;
-	const unsigned int orig_tries = 5; /* attempts before we fall back to search */
+	const unsigned int orig_tries = 10; /* attempts before we fall back to search */
 
 	dprintk("CHOOSE%s bucket %d x %d outpos %d numrep %d\n", recurse_to_leaf ? "_LEAF" : "",
 		bucket->id, x, outpos, numrep);
@@ -440,7 +440,7 @@ reject:
 					else if (flocal <= in->size + orig_tries)
 						/* exhaustive bucket search */
 						retry_bucket = 1;
-					else if (ftotal < 20)
+					else if (ftotal <= orig_tries + 15)
 						/* then retry descent */
 						retry_descent = 1;
 					else
-- 
1.7.8.2
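
For anyone reading the second hunk in isolation, here is a rough standalone sketch of the two branches it shows, with the patched value orig_tries = 10. The enum, the function names and the parameter bucket_size (standing in for in->size) are made up for illustration and are not identifiers from mapper.c. Reading flocal as failures within the current bucket and ftotal as failures across all descents is an assumption based on the variable names, and so is treating the trailing "else" as giving up. Branches that come before the ones visible in the hunk are left out.

```c
#include <assert.h>

/* Hypothetical names for illustration; not taken from mapper.c. */
enum retry_action { RETRY_BUCKET, RETRY_DESCENT, GIVE_UP };

/*
 * Mirrors only the two branches visible in the hunk at line 440.
 * flocal: assumed to count failures within the current bucket
 * ftotal: assumed to count failures across all descents
 * bucket_size: stands in for in->size
 */
static enum retry_action
next_action(unsigned int flocal, unsigned int ftotal,
	    unsigned int bucket_size, unsigned int orig_tries)
{
	if (flocal <= bucket_size + orig_tries)
		return RETRY_BUCKET;	/* exhaustive bucket search */
	if (ftotal <= orig_tries + 15)
		return RETRY_DESCENT;	/* then retry descent */
	return GIVE_UP;			/* assumed meaning of the final else */
}

/* Pre-patch descent condition, kept for comparison: ftotal < 20. */
static int old_retry_descent(unsigned int ftotal)
{
	return ftotal < 20;
}

/* Boundary check: the patched limit is 10 + 15 = 25 failures. */
static void check_descent_limit(void)
{
	assert(next_action(100, 25, 4, 10) == RETRY_DESCENT);
	assert(next_action(100, 26, 4, 10) == GIVE_UP);
}
```

With orig_tries = 10 the descent is retried while ftotal <= 25, where the old condition stopped once ftotal reached 20, and a bucket of size 4 is searched locally while flocal <= 14 instead of flocal <= 9.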