On Thu, 8 Jan 2015 11:41:37 -0700 Robert LeBlanc wrote:

> On Wed, Jan 7, 2015 at 10:55 PM, Christian Balzer <chibi@xxxxxxx> wrote:
> > Which of course begs the question of why not have min_size at 1
> > permanently, so that in the (hopefully rare) case of losing 2 OSDs at
> > the same time your cluster still keeps working (as it should with a
> > size of 3).
>
> The idea is that when a write happens, at least min_size copies have to
> be committed on disk before the write is acknowledged back to the
> client, just in case something happens to the disk before it can be
> replicated. Anything less also goes against the strongly consistent
> model of Ceph.
>
Which of course currently means a strongly consistent lockup in these
scenarios. ^o^

Slightly off-topic and snarky: that strong consistency is of course of
limited use when, in the case of a corrupted PG, Ceph basically asks you
to toss a coin. As in: minor corruption, impossible for a mere human to
tell which replica is the good one, because one OSD is down and the two
remaining ones differ by one bit or so.

> I believe there is work to resolve the issue when the number of
> replicas drops below min_size. Ceph should automatically start
> backfilling to get back to at least min_size so that I/O can continue.
> I believe this work is also tied to prioritizing backfill, so that
> things like this are backfilled first, and then backfill continues from
> min_size back up to size.
>
Yeah, I suppose that is what Greg referred to. Hopefully soon, and
backported if possible.

> I am interested in a not-so-strict eventual consistency option in Ceph,
> so that under normal circumstances, instead of needing [size] writes to
> OSDs to complete, only [min_size] is needed and the primary OSD then
> ensures that the laggy OSD(s) eventually get the write committed.
>
This is exactly where I was coming from/getting at. And it is basically
what artificially setting min_size to 1 in a replica 3 cluster should
get you, unless I'm missing something.

Christian

--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
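P.S. For anyone who wants to try the trade-off discussed above, here is a
minimal sketch of what "min_size 1 on a replica 3 pool" looks like in
practice. The pool name "rbd" and the little Python wrapper are mine and
purely illustrative, not something from this thread; it just shells out
to the stock `ceph` CLI and needs admin credentials on the node it runs
on.

#!/usr/bin/env python
# Illustrative sketch: set a replica-3 pool to min_size 1 via the ceph CLI.
# The pool name "rbd" is an assumption; substitute your own pool.
import subprocess

POOL = "rbd"  # hypothetical pool name

def pool_set(pool, key, value):
    """Run `ceph osd pool set <pool> <key> <value>` and return its output."""
    return subprocess.check_output(
        ["ceph", "osd", "pool", "set", pool, key, str(value)])

def pool_get(pool, key):
    """Run `ceph osd pool get <pool> <key>` and return its output."""
    return subprocess.check_output(["ceph", "osd", "pool", "get", pool, key])

if __name__ == "__main__":
    pool_set(POOL, "size", 3)      # keep three replicas
    pool_set(POOL, "min_size", 1)  # ack writes once a single copy is on disk
    print(pool_get(POOL, "size"))
    print(pool_get(POOL, "min_size"))

The obvious caveat being the one Robert describes: a write acknowledged
with only one copy on disk is lost if that OSD dies before the other
replicas catch up.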