Redundant networks in Ceph

The current network design in Ceph
(http://ceph.com/docs/master/rados/configuration/network-config-ref)
uses non-redundant networks for both cluster and public communication.
Ideally, in a high-load environment these will be 10 or 40+ GbE
networks.  For cost reasons, most such installations will use the same
switch hardware and separate Ceph traffic using VLANs.

Networking is complex, and situations arise in which switches and
routers drop traffic.  We ran into one of those at one of our sites:
connections to hosts stay up (so bonding NICs does not help), yet OSD
communication gets disrupted, client IO hangs, and failures cascade to
client applications.

My understanding is that if OSDs cannot connect for some time over the
cluster network, IO will hang and time out.  The document states:

"If you specify more than one IP address and subnet mask for either the
public or the cluster network, the subnets within the network must be
capable of routing to each other."

In the real world this means a complicated Layer 3 routing setup, which
is not practical in many configurations.

What if there were an option for "cluster 2" and "public 2" networks,
which OSDs and MONs would use in either active/backup or
active/active mode (cluster 1 and cluster 2 exist separately and do not
route to each other)?

The difference between this setup and bonding is that here the decision
to fail over and try the other network is made at the OSD/MON level, and
it brings resilience to faults within the switch core, which are really
only detectable at the application layer.
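
To make the active/backup idea concrete, here is a minimal Python
sketch of application-level path selection.  It is not how Ceph's
messenger works and not an existing Ceph feature; the names pick_path
and tcp_probe and the addresses in the usage note are hypothetical.
It assumes each peer is reachable at one address per network and that
the two networks do not route to each other.

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

Address = Tuple[str, int]


def tcp_probe(addr: Address, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to addr completes within timeout.

    Assumption: a completed connect is a good enough liveness signal.
    A real daemon would more likely use a request/response heartbeat,
    since the fault described above leaves links up while traffic drops.
    """
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def pick_path(
    paths: Sequence[Address],
    probe: Callable[[Address], bool] = tcp_probe,
) -> Optional[Address]:
    """Active/backup selection: first path whose probe succeeds, else None.

    paths is ordered by preference, e.g. the peer's address on
    "cluster 1" followed by its address on "cluster 2".
    """
    for addr in paths:
        if probe(addr):
            return addr
    return None
```

A caller would pass the same peer's address on each network, for
example pick_path([("10.0.1.5", 6800), ("10.0.2.5", 6800)]), and
re-run the selection whenever heartbeats on the current path stop
arriving.  The point is that the failover decision is driven by what
the application observes, not by link state.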

Am I missing an already existing feature?  Please advise.

Best regards,
Alex Gorbachev
Intelligent Systems Services Inc.