On Mon, Aug 6, 2012 at 9:39 AM, Vladimir Bashkirtsev <vladimir@xxxxxxxxxxxxxxx> wrote:
> On 07/08/12 01:55, Gregory Farnum wrote:
>>
>> There is not yet any such feature, no -- dealing with full systems is
>> notoriously hard and we haven't come up with a great solution yet. One
>> thing you can do is experiment with the "mon_osd_min_in_ratio"
>> parameter, which prevents the monitors from marking out more than a
>> certain percentage of the OSD cluster (and without something being
>> marked out, no data will be moved around). If you don't want the
>> cluster to automatically mark any OSDs out, you can also set
>> "mon_osd_down_out_interval" to zero.
>> -Greg
>
> But it would be a good idea to have such a feature as a fail-safe. The
> settings you mention may help a bit when the cluster is almost full and
> there is a good number of OSDs, but a hard refusal by Ceph to run
> recovery if ANY live OSD is over a certain limit would be quite
> unambiguous. If recovery fails because one OSD is at capacity, the
> decision should be handed over to the admin: rebalance CRUSH, add a new
> OSD, or remove some objects. Ceph certainly should not be able to fill
> up an OSD with activity which is not required (but desired) by the end
> clients.

Oh, I see what you're saying. Given how distributed Ceph is, this is
actually harder than it sounds -- we could get closer by refusing to
mark OSDs out whenever the full list is non-empty, but we could not,
for instance, do partial recovery and then stop once an OSD gets full.
In any case, I've opened a bug (http://tracker.newdream.net/issues/2911)
since this isn't something I can hack together right now. :)
-Greg
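
For anyone wanting to try the settings mentioned above, a rough sketch
of how they might look in ceph.conf follows. The values are placeholders
rather than recommendations, and the usual space-separated ceph.conf
spelling of the two options is assumed (they can usually also be changed
on a running cluster with injectargs):

    [mon]
        # Refuse to automatically mark OSDs out once doing so would
        # leave fewer than this fraction of OSDs "in" (example value).
        mon osd min in ratio = 0.3

        # Zero means the monitors never automatically mark a down OSD
        # out, so no rebalancing/recovery starts without the admin.
        mon osd down out interval = 0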