On 11/13/2012 06:15 AM, Gandalf Corvotempesta wrote:
> Hi, what happens in case of an OSD network failure? Is Ceph smart enough to isolate OSDs that are not synced? Should I use LACP on the OSD network, or should a single 10GbE link per server be OK? LACP would need stackable switches and much more hardware investment.

OSDs send heartbeats to each other. If an OSD does not receive a heartbeat from a peer within a certain interval, it reports the failure to the monitor cluster. When the monitor cluster receives enough of these reports, it marks the OSD 'down' in the OSD map and, after a grace period to allow for flapping or daemon restarts, marks the OSD 'out' as well. This makes the cluster rebalance any data that was on the failed OSD and place no new data there. A lot of this is configurable, but that's the basic model.

In this model, a network failure is equivalent to extreme slowness or a crashed OSD - everything eventually results in an updated map of the cluster, and the OSDs maintain strong consistency of the data through the peering and recovery processes.

So basically you'd only need a single NIC per storage node. Multiple NICs can be useful to separate frontend and backend traffic, but Ceph is designed to maintain strong consistency when failures occur.

Josh
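
P.S. In case it helps to see the down/out model spelled out, here is a toy Python sketch of the bookkeeping described above. It is not Ceph code: the class ToyMonitor, its method names, and the two thresholds (min_reporters, down_out_interval) are made up for illustration, and the real monitor logic has more to it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMonitor:
    """Toy model of the report -> 'down' -> 'out' flow. Not Ceph's implementation."""
    min_reporters: int = 2            # hypothetical: distinct peers needed to mark 'down'
    down_out_interval: float = 300.0  # hypothetical: seconds 'down' before marking 'out'
    reports: dict = field(default_factory=dict)     # target OSD -> set of reporting peers
    down_since: dict = field(default_factory=dict)  # target OSD -> time it was marked down
    out: set = field(default_factory=set)           # OSDs marked out

    def report_failure(self, reporter, target, now):
        """A peer tells the monitor it missed heartbeats from target."""
        self.reports.setdefault(target, set()).add(reporter)
        enough = len(self.reports[target]) >= self.min_reporters
        if enough and target not in self.down_since:
            self.down_since[target] = now  # mark 'down' in the map

    def report_alive(self, target):
        """The OSD came back (flap or restart): forget the failure."""
        self.reports.pop(target, None)
        self.down_since.pop(target, None)
        self.out.discard(target)

    def tick(self, now):
        """Mark OSDs 'out' once they have been 'down' for the whole grace period."""
        for osd, since in self.down_since.items():
            if now - since >= self.down_out_interval:
                self.out.add(osd)

    def state(self, osd):
        if osd in self.out:
            return "down+out"
        if osd in self.down_since:
            return "down"
        return "up"


if __name__ == "__main__":
    mon = ToyMonitor()
    mon.report_failure("osd.1", "osd.3", now=0)
    print(mon.state("osd.3"))  # one report is not enough yet
    mon.report_failure("osd.2", "osd.3", now=1)
    print(mon.state("osd.3"))  # second distinct reporter marks it down
    mon.tick(now=400)
    print(mon.state("osd.3"))  # grace period expired, marked out
```

Note that the monitor never needs to know why the heartbeats stopped. A pulled cable, a very slow disk, and a crashed daemon all look the same from its side, which is the point made above about a network failure.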