Re: Crash of almost full ceph


 



On Mon, Aug 6, 2012 at 11:23 PM, Vladimir Bashkirtsev
<vladimir@xxxxxxxxxxxxxxx> wrote:
>> Oh, I see what you're saying. Given how distributed Ceph is this is
>> actually harder than it sounds — we could get closer by refusing to
>> mark OSDs out whenever the full list is non-empty, but we could not
>> for instance do partial recovery and then stop once an OSD gets full.
>> In any case, I've made a bug (http://tracker.newdream.net/issues/2911)
>> since this isn't something I can hack together right now. :)
>> -Greg
>
> Refusing to mark OSDs out when the full list is non-empty would definitely be
> a big step in the right direction. It would prevent the cascading failure I
> described originally. But on the other hand, I think the way around the
> complication of the distributed nature is to have the OSD itself refuse
> backfill once it has reached the full state, regardless of where the backfill
> is coming from. In this case backfill would stall while client activity would
> still be handled as normal. Or does the OSD not distinguish backfill requests
> from other OSDs from client requests? The question, of course, is what the
> other OSD should do when its peer refuses to accept backfill while it is
> still marked as in. Some stand-off period is required on the OSD which sends
> the backfill. So it looks like the OSD should do two things:
> 1. When receiving backfill, check whether it is already full and, if so, drop
> the request without acking back.
> 2. When sending backfill, if the ack does not arrive within a reasonable
> time, retry after some delay (something tells me such functionality is
> already in place).
>
> I should admit I have not read the Ceph code, but with all the experience I
> have gained with Ceph, it seems this should be fairly easy to implement.

That's one possibility, but it has a lot of side effects which could be
troubling — for instance, it means that pg_temp entries go from being around
until backfill completes to being around until the OSD is no longer full *and*
backfill completes. The chances of map growth etc. are high and worrying.

Do you have any thoughts, Sam?
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

