On Fri, 19 Sep 2014, Dan van der Ster wrote:
> On Fri, Sep 19, 2014 at 10:41 AM, Dan Van Der Ster
> <daniel.vanderster@xxxxxxx> wrote:
> >> On 19 Sep 2014, at 08:12, Florian Haas <florian@xxxxxxxxxxx> wrote:
> >>
> >> On Fri, Sep 19, 2014 at 12:27 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> >>> On Fri, 19 Sep 2014, Florian Haas wrote:
> >>>> Hi Sage,
> >>>>
> >>>> was the off-list reply intentional?
> >>>
> >>> Whoops! Nope :)
> >>>
> >>>> On Thu, Sep 18, 2014 at 11:47 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> >>>>>> So, disaster is a pretty good description. Would anyone from the core
> >>>>>> team like to suggest another course of action or workaround, or are
> >>>>>> Dan and I generally on the right track to make the best out of a
> >>>>>> pretty bad situation?
> >>>>>
> >>>>> The short term fix would probably be to just prevent backfill for the time
> >>>>> being until the bug is fixed.
> >>>>
> >>>> As in, osd max backfills = 0?
> >>>
> >>> Yeah :)
> >>>
> >>> Just managed to reproduce the problem...
> >>>
> >>> sage
> >>
> >> Saw the wip branch. Color me freakishly impressed on the turnaround. :) Thanks!
> >
> > Indeed :) Thanks Sage!
> > wip-9487-dumpling fixes the problem on my test cluster. Trying in prod now?
>
> Final update, after 4 hours in prod and after draining 8 OSDs -- zero
> slow requests :)

That's great news!

But, please be careful. This code hasn't been reviewed yet or been
through any testing! I would hold off on further backfills until it's
merged.

Thanks!
sage