
Hello,

On Wed, 14 Dec 2016 00:06:14 +0100 Kevin Olbrich wrote:

> Ok, thanks for your explanation!
> I read those warnings about size 2 + min_size 1 (we are using ZFS as RAID6,
> called raidz2) as OSDs.
>
This is similar to my RAID6 or RAID10 backed OSDs with regard to having
very resilient, extremely unlikely to fail OSDs.
As such, a Ceph replication of 2 with min_size 1 is a calculated risk,
acceptable for me and others in certain use cases.
This is also with very few (2-3) journals per SSD.

If:

1. Your journal SSDs are well trusted and monitored (Intel DC S36xx, 37xx)
2. Your failure domain represented by a journal SSD is small enough
   (meaning that replicating the lost OSDs can be done quickly)

it may be an acceptable risk for you as well.

> Time to raise replication!
>
If you can afford that (money, space, latency), definitely go for it.

Christian

> Kevin
>
> 2016-12-13 0:00 GMT+01:00 Christian Balzer <chibi@xxxxxxx>:
>
> > On Mon, 12 Dec 2016 22:41:41 +0100 Kevin Olbrich wrote:
> >
> > > Hi,
> > >
> > > just in case: What happens when all replica journal SSDs are broken at
> > > once?
> > >
> > That would be bad, as in BAD.
> >
> > In theory you just "lost" all the associated OSDs and their data.
> >
> > In practice everything but the in-flight data at the time is still on
> > the actual OSDs (HDDs), but it's inconsistent and inaccessible as far as
> > Ceph is concerned.
> >
> > So with some trickery and an experienced data-recovery Ceph consultant you
> > _may_ get things running with limited data loss/corruption, but that's
> > speculation and may be wishful thinking on my part.
> >
> > Another data point to deploy only well known/monitored/trusted SSDs and
> > have a 3x replication.
> >
> > > The PGs most likely will be stuck inactive but as I read, the journals
> > > just need to be replaced
> > > (http://ceph.com/planet/ceph-recover-osds-after-ssd-journal-failure/).
> > >
> > > Does this also work in this case?
> > >
> > Not really, no.
> >
> > The above works by still having a valid state and operational OSDs from
> > which the "broken" one can recover.
> >
> > Christian
>

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/