On Fri, Jun 29, 2012 at 2:18 PM, Brian Edmonds <mornir@xxxxxxxxx> wrote:
> On Fri, Jun 29, 2012 at 2:11 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> Well, actually this depends on the filesystem you're using. With
>> btrfs, the OSD will roll back to a consistent state, but you don't
>> know how out-of-date that state is.
>
> Ok, so assuming btrfs, then a single machine failure with a ramdisk
> journal should not result in any data loss, assuming replication is
> working? The cluster would then be at risk of data loss primarily
> from a full power outage. (In practice I'd expect either one machine
> to die, or a power loss to take out all of them, and smaller but
> non-unitary losses would be uncommon.)

That's correct. And replication will be working: it's all
synchronous, so if the replication isn't working, you won't be able to
write. :) There are some edge cases here. If an OSD is "down" but not
"out" then you might not have the same number of data copies as
normal, but that's all configurable.

>
> Something to play with, perhaps.
>
> Brian.
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
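
As a rough illustration of the reply above (synchronous replication, and the "down" but not "out" edge case), here is a toy Python model. It is only a sketch and not Ceph code: the `Osd` class, `acting_set`, `write`, and the `min_copies` floor are hypothetical names invented for this example, and the real placement and recovery logic is far more involved. The assumption is that a write is acknowledged only once every OSD that is both "in" and "up" has stored it.

```python
from dataclasses import dataclass, field


@dataclass
class Osd:
    """Hypothetical stand-in for an OSD; not a real Ceph class."""
    name: str
    up: bool = True     # daemon is running and reachable
    in_: bool = True    # still assigned data by the cluster map
    store: dict = field(default_factory=dict)


def acting_set(osds):
    """Return the OSDs that can take a copy right now: 'in' and 'up'."""
    return [o for o in osds if o.in_ and o.up]


def write(osds, key, value, min_copies=1):
    """Synchronous write: return only after every acting replica has the data.

    min_copies is an assumed, configurable floor; with fewer acting
    replicas than that, the write is refused instead of acknowledged.
    Returns the number of copies written.
    """
    targets = acting_set(osds)
    if len(targets) < min_copies:
        raise RuntimeError("not enough replicas available; write blocked")
    for osd in targets:
        osd.store[key] = value
    return len(targets)


if __name__ == "__main__":
    cluster = [Osd("osd.0"), Osd("osd.1"), Osd("osd.2")]
    print(write(cluster, "obj1", b"data"))   # all three OSDs take a copy

    cluster[1].up = False                    # "down" but still "in"
    print(write(cluster, "obj2", b"data"))   # fewer copies than normal
```

In this model a single "down" OSD leaves the write acknowledged with one copy fewer than usual, while raising `min_copies` to the full replica count would block the write instead. That is the trade-off the "all configurable" remark refers to.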