Re: Redundant networks in Ceph

Hi Alex,

I think the answer is that you do one of two things. You either design your
network so that it is fault tolerant in every way, so that a network
interruption is not possible, or you go with non-redundant networking but
design your CRUSH map around the failure domains of the network.
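
For example, if each rack (or each top-of-rack switch) is its own network
failure domain, you can reflect that in the CRUSH hierarchy so that no two
replicas depend on the same switch. A rough sketch with the stock CRUSH CLI
(the bucket, host and pool names are just placeholders):

    # Model each rack/switch as a CRUSH bucket and move the hosts under it
    ceph osd crush add-bucket rack1 rack
    ceph osd crush add-bucket rack2 rack
    ceph osd crush add-bucket rack3 rack
    ceph osd crush move rack1 root=default
    ceph osd crush move rack2 root=default
    ceph osd crush move rack3 root=default
    ceph osd crush move node1 rack=rack1
    ceph osd crush move node2 rack=rack2
    ceph osd crush move node3 rack=rack3

    # Rule that puts each replica in a different rack, then apply it
    ceph osd crush rule create-simple replicate_by_rack default rack
    ceph osd pool set rbd crush_ruleset 1  # id from 'ceph osd crush rule dump'

That way a single switch failure takes out at most one copy of the data.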

I'm interested in your example of where OSDs were unable to communicate.
What happened? Would it be possible to redesign the network to stop this
from happening?

Nick

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Alex Gorbachev
> Sent: 27 June 2015 19:02
> To: ceph-users@xxxxxxxxxxxxxx
> Subject:  Redundant networks in Ceph
> 
> The current network design in Ceph
> (http://ceph.com/docs/master/rados/configuration/network-config-ref)
> uses non-redundant networks for both cluster and public communication.
> Ideally, in a high-load environment these will be 10 or 40+ GbE networks.
> For cost reasons, most such installations will use the same switch
> hardware and separate Ceph traffic using VLANs.
> 
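For reference, that split is just the two network options in ceph.conf,
along these lines (the subnets are made up for illustration):

    [global]
        # client/monitor traffic on one VLAN, replication and
        # heartbeats on another
        public network  = 10.10.1.0/24
        cluster network = 10.10.2.0/24
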
> Networking is complex, and situations are possible where switches and
> routers drop traffic.  We ran into one of those at one of our sites:
> connections to hosts stay up (so bonding NICs does not help), yet OSD
> communication gets disrupted, client IO hangs, and failures cascade to
> client applications.
> 
> My understanding is that if OSDs cannot connect for some time over the
> cluster network, IO will hang and time out.  The document states:
> 
> "If you specify more than one IP address and subnet mask for either the
> public or the cluster network, the subnets within the network must be
> capable of routing to each other."
> 
> In the real world this means a complicated Layer 3 routing setup, which
> is not practical in many configurations.
> 
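For what it's worth, that clause only comes into play once you actually
list more than one subnet for a network, e.g. something like this (subnets
made up for illustration):

    [global]
        # the two public subnets must be able to route to each other
        public network = 10.10.1.0/24, 10.20.1.0/24
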
> What if there was an option for "cluster 2" and "public 2" networks, to
> which OSDs and MONs would connect in either active/backup or
> active/active mode (cluster 1 and cluster 2 exist separately and do not
> route to each other)?
> 
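To make the idea concrete, such a setup might be expressed roughly like
this (the "2" options are purely hypothetical and do not exist in Ceph
today):

    [global]
        public network    = 10.10.1.0/24   # existing option
        public network 2  = 10.20.1.0/24   # hypothetical secondary
        cluster network   = 10.10.2.0/24   # existing option
        cluster network 2 = 10.20.2.0/24   # hypothetical secondary
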
> The difference between this setup and bonding is that here the decision
> to fail over and try the other network is made at the OSD/MON level,
> which brings resilience to faults within the switch core that are really
> only detectable at the application layer.
> 
> Am I missing an already existing feature?  Please advise.
> 
> Best regards,
> Alex Gorbachev
> Intelligent Systems Services Inc.
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


