Full OSD halting a cluster - isn't this violating the "no single point of failure" promise?

Hi,

(Just in case: this isn’t intended as a rant, and I hope it doesn’t get read as one. I’m trying to understand what the perspectives on potential future improvements are, and I think it would be valuable to have this discoverable in the archives.)

We’ve had a “good" time recently balancing our growing cluster and did a lot of reweighting after a full OSD actually did bite us once. 

Apart from paying our dues (tight monitoring, reweighting and generally hedging the cluster), I was wondering whether this behaviour is a violation of the “no single point of failure” promise: no matter how big your setup grows, a single full OSD can halt practically everything. Even just stopping that one OSD would unblock the cluster and let it keep going (assuming CRUSH made a particularly pathological choice and that this OSD ended up extremely off the curve compared to the others).
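(For reference, the kind of watchdog I have in mind for the “paying our dues” part is roughly the sketch below. It assumes `ceph osd df --format json` and its per-OSD “utilization” field, and the thresholds are just the stock nearfull/full ratios, nothing tuned — a sketch, not something battle-tested.)

#!/usr/bin/env python3
# Rough sketch: flag OSDs creeping towards the full ratio before a single
# one of them can block writes for the whole cluster.
# Assumes `ceph osd df --format json` is available and that its per-OSD
# "utilization" field is a percentage (0-100). Thresholds are the stock
# defaults (nearfull 85%, full 95%), not tuned values.
import json
import subprocess

NEARFULL = 85.0
FULL = 95.0

def osd_utilizations():
    out = subprocess.check_output(["ceph", "osd", "df", "--format", "json"])
    report = json.loads(out)
    return {node["name"]: node["utilization"] for node in report["nodes"]}

def main():
    for name, util in sorted(osd_utilizations().items(), key=lambda kv: -kv[1]):
        if util >= FULL:
            print(f"{name}: {util:.1f}% -- FULL, cluster writes will block")
        elif util >= NEARFULL:
            print(f"{name}: {util:.1f}% -- nearfull, reweight before it bites")

if __name__ == "__main__":
    main()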

I haven’t found much on whether this is a case of “it’s the way it is and we don’t see a way forward”, or whether this behaviour is considered something that could be improved in the future, and whether there are already strategies around it?

From my perspective this is directly related to how well CRUSH weighting works with respect to placing data evenly. (I would have expected that in certain situations, like a single RBD cluster where all objects are identically sized, this is something CRUSH should perform well in, but the last few weeks tell me that isn’t the case. :) )
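(To make the statistical side of that concrete: even if PG replicas were placed perfectly uniformly at random, with only on the order of a hundred PGs per OSD you’d already expect a noticeable spread. The toy balls-into-bins simulation below is not CRUSH, and the pool geometry in it is made up purely for illustration, but it shows roughly the kind of imbalance that shows up before any weighting quirks even enter the picture.)

# Toy illustration, not CRUSH: the coarse step is mapping PGs onto OSDs,
# and with only ~100 PGs per OSD even a perfectly uniform pseudo-random
# placement leaves a noticeable spread. Numbers (100 OSDs, 4096 PGs,
# 3 replicas) are made up for illustration only; real CRUSH adds
# failure-domain constraints on top of this.
import random
from statistics import mean, pstdev

random.seed(1)
NUM_OSDS = 100
PG_NUM = 4096
REPLICAS = 3

pgs_per_osd = [0] * NUM_OSDS
for _ in range(PG_NUM):
    # pick 3 distinct OSDs per PG, uniformly at random
    for osd in random.sample(range(NUM_OSDS), REPLICAS):
        pgs_per_osd[osd] += 1

avg = mean(pgs_per_osd)
print(f"mean PGs/OSD : {avg:.1f}")
print(f"std dev      : {pstdev(pgs_per_osd):.1f}")
print(f"fullest OSD  : {max(pgs_per_osd)} PGs ({max(pgs_per_osd) / avg:.0%} of mean)")
print(f"emptiest OSD : {min(pgs_per_osd)} PGs ({min(pgs_per_osd) / avg:.0%} of mean)")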

An especially interesting edge case is if your cluster consists of two pools which run on completely disjoint sets of OSDs: I guess it’s accidental (not intentional) behaviour that one pool filling up would affect the other, right?

Thoughts?

Hugs,
Christian

-- 
Christian Theune · ct@xxxxxxxxxxxxxxx · +49 345 219401 0
Flying Circus Internet Operations GmbH · http://flyingcircus.io
Forsterstraße 29 · 06112 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
