The current network design in Ceph (http://ceph.com/docs/master/rados/configuration/network-config-ref) uses non-redundant networks for both cluster and public communication. Ideally, in a high-load environment these will be 10 or 40+ GbE networks. For cost reasons, most such installations will use the same switch hardware and separate Ceph traffic using VLANs.

Networking is complex, and situations arise in which switches and routers drop traffic. We ran into one of those at one of our sites: connections to hosts stay up (so bonding NICs does not help), yet OSD communication gets disrupted, client IO hangs, and failures cascade to client applications. My understanding is that if OSDs cannot reach each other over the cluster network for some time, IO will hang and time out.

The document states: "If you specify more than one IP address and subnet mask for either the public or the cluster network, the subnets within the network must be capable of routing to each other." In the real world this means a complicated Layer 3 routing setup, which is not practical in many configurations.

What if there were an option for "cluster 2" and "public 2" networks, which OSDs and MONs would use in either active/backup or active/active mode (cluster 1 and cluster 2 would exist separately and not route to each other)? The difference between this setup and bonding is that the decision to fail over and try the other network is made at the OSD/MON level, and it brings resilience against faults within the switch core, which are really only detectable at the application layer.

Am I missing an already existing feature? Please advise.

Best regards,

Alex Gorbachev
Intelligent Systems Services Inc.
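P.S. To make the idea concrete, here is a rough ceph.conf sketch. The "public network" and "cluster network" options are the existing, documented ones; every option name marked hypothetical below is something I invented for illustration and does not exist in Ceph today:

    [global]
        # Existing, documented options (see the network config reference above)
        public network  = 10.0.1.0/24
        cluster network = 10.0.2.0/24

        # Hypothetical options for this proposal -- NOT real Ceph settings.
        # Each daemon would bind on both networks and decide at the OSD/MON
        # level when to fail over (or balance) between them; network 1 and
        # network 2 would be separate fabrics with no routing between them.
        public network 2  = 10.1.1.0/24
        cluster network 2 = 10.1.2.0/24
        network failover mode = active-backup   # or active-active

In the active/backup case a daemon would only try the "2" networks after heartbeats or connections on the primary networks fail, which is exactly the application-layer detection that bonding cannot provide.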