On Mon, Aug 6, 2012 at 9:39 AM, Vladimir Bashkirtsev <vladimir@xxxxxxxxxxxxxxx> wrote:
> On 07/08/12 01:55, Gregory Farnum wrote:
>>
>> There is not yet any such feature, no -- dealing with full systems is
>> notoriously hard and we haven't come up with a great solution yet. One
>> thing you can do is experiment with the "mon_osd_min_in_ratio"
>> parameter, which prevents the monitors from marking out more than a
>> certain percentage of the OSD cluster (and without something being
>> marked out, no data will be moved around). If you don't want the
>> cluster to automatically mark any OSDs out, you can also set
>> "mon_osd_down_out_interval" to zero.
>> -Greg
>
> But it would be a good idea to have such a feature as a fail-safe. The
> settings you mention may help a bit when the cluster is almost full and
> there is a good number of OSDs, but a hard refusal by Ceph to run
> recovery if ANY live OSD is over a certain limit would be quite
> unambiguous. If recovery fails because one OSD is at capacity, the
> decision should be handed over to the admin: rebalance CRUSH, add a new
> OSD, or remove some objects. Ceph certainly should not be able to fill
> up an OSD with activity which is not required (but desired) by the end
> clients.

Oh, I see what you're saying. Given how distributed Ceph is, this is
actually harder than it sounds -- we could get closer by refusing to
mark OSDs out whenever the full list is non-empty, but we could not,
for instance, do partial recovery and then stop once an OSD gets full.
In any case, I've opened a bug (http://tracker.newdream.net/issues/2911)
since this isn't something I can hack together right now. :)
-Greg
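
For anyone wanting to try the settings mentioned above, a rough sketch
of how they might look in ceph.conf follows. The values are placeholders
rather than recommendations, and the usual space-separated ceph.conf
spelling of the two options is assumed (they can usually also be changed
on a running cluster with injectargs):

    [mon]
        # Refuse to automatically mark OSDs out once doing so would
        # leave fewer than this fraction of OSDs "in" (example value).
        mon osd min in ratio = 0.3

        # Zero means the monitors never automatically mark a down OSD
        # out, so no rebalancing/recovery starts without the admin.
        mon osd down out interval = 0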