Re: Designing a cluster guide

Gregory Farnum <greg@xxxxxxxxxxx> · Fri, 29 Jun 2012 14:30:09 -0700

On Fri, Jun 29, 2012 at 2:18 PM, Brian Edmonds <mornir@xxxxxxxxx> wrote:
> On Fri, Jun 29, 2012 at 2:11 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> Well, actually this depends on the filesystem you're using. With
>> btrfs, the OSD will roll back to a consistent state, but you don't
>> know how out-of-date that state is.
>
> Ok, so assuming btrfs, then a single machine failure with a ramdisk
> journal should not result in any data loss, assuming replication is
> working?  The cluster would then be at risk of data loss primarily
> from a full power outage.  (In practice I'd expect either one machine
> to die, or a power loss to take out all of them, and smaller but
> non-unitary losses would be uncommon.)

That's correct. And replication will be working — it's all
synchronous, so if the replication isn't working, you won't be able to
write. :) There are some edge cases here — if an OSD is "down" but not
"out" then you might not have the same number of data copies as
normal, but that's all configurable.
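
As a rough illustration of the point above, here is a toy model of a synchronous replicated write. It is not Ceph code: the `Replica` class, the `replicated_write` function and the `min_copies` parameter are all hypothetical names made up for this sketch. It only shows the two behaviours described: a write is acknowledged only once every reachable replica has stored it, and a replica that is "down" but not yet "out" leaves you with fewer copies than normal unless you configure the write to block.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one OSD holding a copy of the data."""
    name: str
    up: bool = True           # daemon is reachable ("up" vs "down")
    in_cluster: bool = True   # still mapped to hold data ("in" vs "out")
    store: dict = field(default_factory=dict)


def replicated_write(replicas, key, value, min_copies=1):
    """Toy synchronous write.

    Stores the value on every replica that is both up and in, and only
    then returns (the "ack"). If fewer than min_copies replicas are
    available, the write is refused instead of being acknowledged.
    Returns the number of copies written.
    """
    targets = [r for r in replicas if r.up and r.in_cluster]
    if len(targets) < min_copies:
        raise RuntimeError("too few live replicas; write is blocked")
    for r in targets:
        r.store[key] = value
    return len(targets)


if __name__ == "__main__":
    osds = [Replica("osd.0"), Replica("osd.1")]

    # Healthy case: both replicas hold the object before the ack.
    print(replicated_write(osds, "obj-a", b"payload"))

    # osd.1 goes "down" but is not yet marked "out": the write still
    # succeeds, but with fewer copies than normal.
    osds[1].up = False
    print(replicated_write(osds, "obj-b", b"payload"))

    # With a stricter (hypothetical) minimum, the same write blocks.
    try:
        replicated_write(osds, "obj-c", b"payload", min_copies=2)
    except RuntimeError as err:
        print(err)
```

In this sketch the trade-off is a single knob: a low `min_copies` keeps the cluster writable while degraded, and a higher one refuses writes rather than accept a reduced copy count.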

>
> Something to play with, perhaps.
>
> Brian.
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html