On Fri, Nov 16, 2012 at 5:56 PM, Josh Durgin <josh.durgin@xxxxxxxxxxx> wrote:
> On 11/15/2012 01:51 AM, Gandalf Corvotempesta wrote:
>>
>> 2012/11/15 Josh Durgin <josh.durgin@xxxxxxxxxxx>:
>>>
>>> So basically you'd only need a single nic per storage node. Multiple
>>> can be useful to separate frontend and backend traffic, but ceph
>>> is designed to maintain strong consistency when failures occur.
>>
>>
>> Probably I've not explained it well.
>> I'll have multiple NICs: one for the frontend, one for the backend
>> used as the OSD sync network.
>> What happens in case of a backend network failure? The frontend
>> network is still ok and the OSD is still reachable, but it is not
>> able to sync data.
>
>
> Ah, ok. By default, the OSDs use the backend network for heartbeats,
> so if it fails, they will notice and report the peers they can't
> reach as failed to the monitors, and the normal failure handling
> takes care of things.
>
> If you're worried about consistency, remember that a write won't
> complete until it's on disk on all replicas. If you're interested
> in the gory details of maintaining consistency, check out the peering
> process [1].
>
> Josh
>
> [1] http://ceph.com/docs/master/dev/peering/

Actually, right now a failed cluster network with an up public network
is something the OSDs do not handle well: they will mark each other
down on the monitor, then tell the monitor "hey, I'm not dead!", and
start flapping pretty horrendously. We first ran across this a couple
of weeks ago and have started to think about it, but I'm not sure a
fix for it is going to make it into the initial Bobtail release. :(
-Greg
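
For anyone setting up the frontend/backend split discussed above, the
two networks are declared in ceph.conf roughly like this (a minimal
sketch; the subnets here are placeholders, so substitute your own):

    [global]
        # Client-facing (frontend) traffic: clients, monitors, etc.
        # 10.0.0.0/24 is a placeholder subnet, not a recommendation.
        public network = 10.0.0.0/24

        # Replication and heartbeat (backend) traffic between OSDs.
        # Also a placeholder subnet.
        cluster network = 10.0.1.0/24

With both options set, OSDs bind their replication traffic to the
cluster network while clients keep talking to them over the public
one, which is exactly the split that makes the failure mode Greg
describes possible in the first place.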
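
And if you want to experiment with how aggressively OSDs report each
other down in that scenario, the heartbeat and down-reporting behavior
is tunable. A sketch of the relevant knobs follows; the names and
default values are as I recall them for this era, so double-check them
against the docs for your version before relying on this:

    [osd]
        # Seconds between heartbeat pings to peer OSDs (default 6).
        osd heartbeat interval = 6

        # Seconds without a reply before a peer is reported as
        # failed to the monitors (default 20).
        osd heartbeat grace = 20

    [mon]
        # Require down reports from several distinct OSDs before the
        # monitor marks one down; raising this above the default can
        # damp spurious flapping, at the cost of slower detection.
        mon osd min down reporters = 3

None of this fixes the cluster-network-down/public-network-up case
Greg mentions; it only changes how quickly and how easily the marking
happens.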