On Sun, Sep 21, 2014 at 9:52 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> On Sun, 21 Sep 2014, Florian Haas wrote:
>> So yes, I think your patch absolutely still has merit, as would any
>> means of reducing the number of snapshots an OSD will trim in one go.
>> As it is, the situation looks really really bad, specifically
>> considering that RBD and RADOS are meant to be super rock solid, as
>> opposed to say CephFS which is in an experimental state. And contrary
>> to CephFS snapshots, I can't recall any documentation saying that RBD
>> snapshots will break your system.
>
> Yeah, it sounds like a separate issue, and no, the limit is not
> documented because it's definitely not the intended behavior. :)
>
> ...and I see you already have a log attached to #9503. Will take a look.

I've already updated that issue in Redmine, but for the list archives
I should also add this here: Dan's patch for #9503, together with
Sage's for #9487, makes the problem go away in an instant. I've
already pointed out that I owe Dan dinner, and Sage, well I already
owe Sage pretty much lifelong full board. :)

Everyone with a ton of snapshots in their clusters (not sure where the
threshold is, but it gets nasty somewhere between 1,000 and 10,000 I
imagine) should probably update to 0.67.11 and 0.80.6 as soon as they
come out, otherwise Terrible Things Will Happen™ if you're ever forced
to delete a large number of snaps at once.

Thanks again to Dan and Sage,
Florian