Re: OSD network failure

Josh Durgin <josh.durgin@xxxxxxxxxxx> · Thu, 15 Nov 2012 00:40:25 -0800

On 11/13/2012 06:15 AM, Gandalf Corvotempesta wrote:
> Hi,
> what happens in case of an OSD network failure? Is Ceph smart enough to
> isolate OSDs that are not in sync?
> Should I use LACP on the OSD network, or should a single 10GbE link per
> server be OK?
>
> LACP will need stackable switches and much more hardware investment.

OSDs send heartbeats to each other, and when an OSD fails to receive a
peer's heartbeat within a certain interval, it reports that peer to the
monitor cluster. When the monitor cluster receives enough of these
reports, it marks the OSD 'down' in the OSD map and, after a grace
period to allow for flapping or daemon restarts, marks the OSD 'out'
as well. This makes the cluster rebalance any data that was on the
failed OSD, and no new data is placed there.
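
To make the down/out sequence concrete, here is a toy Python sketch of
the model described above. It is not Ceph code: the Monitor class, the
constants, and their values are all hypothetical and chosen only for
illustration, and the real monitors track considerably more state.

```python
from dataclasses import dataclass, field

# Hypothetical tunables for this sketch; the names and values are
# illustrative only, not Ceph's option names or defaults.
HEARTBEAT_GRACE = 20.0     # seconds without a heartbeat before a peer reports failure
MIN_REPORTERS = 2          # distinct reporters needed before marking an OSD down
DOWN_OUT_INTERVAL = 300.0  # seconds an OSD may stay down before it is marked out


def missed_heartbeat(last_seen, now):
    """A peer treats an OSD as failed if no heartbeat arrived within the grace."""
    return now - last_seen > HEARTBEAT_GRACE


@dataclass
class Monitor:
    """Toy stand-in for the monitor cluster's view of OSD state."""
    reporters: dict = field(default_factory=dict)   # osd id -> set of reporting peers
    down_since: dict = field(default_factory=dict)  # osd id -> time it was marked down
    out: set = field(default_factory=set)           # osd ids marked out

    def report_failure(self, reporter, target, now):
        """Record that `reporter` missed heartbeats from `target`."""
        self.reporters.setdefault(target, set()).add(reporter)
        enough = len(self.reporters[target]) >= MIN_REPORTERS
        if enough and target not in self.down_since:
            self.down_since[target] = now

    def report_alive(self, target):
        """The OSD came back (flap or restart): clear its failure state.

        Simplification: this sketch also clears 'out' unconditionally.
        """
        self.reporters.pop(target, None)
        self.down_since.pop(target, None)
        self.out.discard(target)

    def tick(self, now):
        """Mark OSDs out once they have been down for the whole grace period."""
        for osd, since in self.down_since.items():
            if now - since >= DOWN_OUT_INTERVAL:
                self.out.add(osd)

    def state(self, osd):
        if osd in self.out:
            return "down+out"
        if osd in self.down_since:
            return "down"
        return "up"


if __name__ == "__main__":
    mon = Monitor()
    # osd.3 loses its network at t=0; two peers notice at t=25.
    for peer in ("osd.1", "osd.2"):
        if missed_heartbeat(last_seen=0.0, now=25.0):
            mon.report_failure(peer, "osd.3", now=25.0)
    print(mon.state("osd.3"))  # down
    mon.tick(now=400.0)
    print(mon.state("osd.3"))  # down+out
```

Note that the sketch never asks why the heartbeats stopped. A dead NIC,
a crashed daemon, and a badly overloaded node all look the same to the
peers that report the failure.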

A lot of this is configurable, but that's the basic model.
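
As a rough illustration, the ceph.conf options involved look something
like the fragment below. The values are placeholders rather than
recommendations or defaults, and option names and defaults can differ
between releases, so check the documentation for the version you run.

```ini
[osd]
    ; how often an OSD pings its peers, in seconds (illustrative value)
    osd heartbeat interval = 6
    ; how long without a reply before a peer is reported (illustrative value)
    osd heartbeat grace = 20

[mon]
    ; how many distinct OSDs must report a peer before it is marked down
    mon osd min down reporters = 2
    ; seconds a down OSD waits before being marked out (illustrative value)
    mon osd down out interval = 300
```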

In this model, a network failure is equivalent to extreme slowness or a
crashed OSD: each of them eventually results in an updated map of the
cluster, and the OSDs maintain strong consistency of the data
through the peering and recovery processes.

So basically you'd only need a single NIC per storage node. Multiple
NICs can be useful to separate frontend and backend traffic, but they
are not required for safety, since Ceph is designed to maintain strong
consistency when failures occur.
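
If you do want to split the traffic later, a minimal sketch of the
relevant ceph.conf settings is below. The subnets are made-up examples;
substitute your own.

```ini
[global]
    ; frontend: client and monitor traffic (hypothetical subnet)
    public network = 192.168.0.0/24
    ; backend: OSD replication and recovery traffic (hypothetical subnet)
    cluster network = 10.0.0.0/24
```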

Josh