On Fri, Sep 19, 2014 at 10:41 AM, Dan Van Der Ster <daniel.vanderster@xxxxxxx> wrote:
>> On 19 Sep 2014, at 08:12, Florian Haas <florian@xxxxxxxxxxx> wrote:
>>
>> On Fri, Sep 19, 2014 at 12:27 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>>> On Fri, 19 Sep 2014, Florian Haas wrote:
>>>> Hi Sage,
>>>>
>>>> was the off-list reply intentional?
>>>
>>> Whoops! Nope :)
>>>
>>>> On Thu, Sep 18, 2014 at 11:47 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>>>>>> So, disaster is a pretty good description. Would anyone from the core
>>>>>> team like to suggest another course of action or workaround, or are
>>>>>> Dan and I generally on the right track to make the best out of a
>>>>>> pretty bad situation?
>>>>>
>>>>> The short term fix would probably be to just prevent backfill for the time
>>>>> being until the bug is fixed.
>>>>
>>>> As in, osd max backfills = 0?
>>>
>>> Yeah :)
>>>
>>> Just managed to reproduce the problem...
>>>
>>> sage
>>
>> Saw the wip branch. Color me freakishly impressed on the turnaround. :) Thanks!
>
> Indeed :) Thanks Sage!
> wip-9487-dumpling fixes the problem on my test cluster. Trying in prod now…

Final update, after 4 hours in prod and after draining 8 OSDs -- zero slow requests :)

Thanks again!
Dan