Re: Best Network Switches for Redundancy

Hello,

On Wed, 1 Jun 2016 11:03:16 +0200 David Riedl wrote:

> 
> > So 3 servers are the entirety of your Ceph storage nodes, right?
> Exactly. + 3 Openstack Compute Nodes
> 
> 
> > Have you been able to determine what causes the drops?
> > My first guess would be that this bonding is simply not compatible with
> > what the switches can do/expect.
> >   
> Yeah, something like that. load balancing round robin kinda works, but 
> it's a 'server side' bonding protocol. The switches don't know anything 
> about that particular configuration.
> > LACP isn't round-robin, but it does distribute things in a fashion, and
> > given the fact that it actually works you should try it.
> >
> > To be more specific, LACP distribution is based on "sessions", so if
> > you have enough variety in there you will get something that's good
> > enough. A single session however will not be faster than an individual
> > link, IIRC.
> >
> What do you mean by 'variety'? Do you mean I/O?
> 
Variety as in what those sessions (hashes) are based on, usually IP
addresses.
So if you were to send data over just one specific TCP session from
10.0.0.1 to 10.0.0.2, it would go over only one of your interfaces, not
both.

So on some of my servers with LACP I see noticeable differences in
interface usage, especially when traffic is usually sent to just one
other node.
On others, with a sufficiently large number of connections to various
hosts, it approaches uniform utilization.
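
To make the hashing behaviour above concrete, here is a minimal Python
sketch. It is a toy model, not what any particular switch or bonding
driver actually computes: the XOR-of-IP-addresses hash and the names
pick_link and link_usage are assumptions made up for illustration.

```python
import ipaddress
from collections import Counter

def pick_link(src_ip: str, dst_ip: str, n_links: int = 2) -> int:
    """Toy layer-3 hash (an assumption, not a real driver's algorithm):
    XOR the two IPv4 addresses and take the result modulo the link count."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % n_links

def link_usage(src_ip, dst_ips, n_links=2):
    """Count how many packets/flows land on each link of the bond."""
    return Counter(pick_link(src_ip, dst, n_links) for dst in dst_ips)

# One peer: every packet hashes to the same link, the other link stays idle.
single = link_usage("10.0.0.1", ["10.0.0.2"] * 1000)

# Many peers: the flows spread out over both links.
peers = [f"10.0.0.{i}" for i in range(2, 102)]
many = link_usage("10.0.0.1", peers)

print("single peer:", dict(single))
print("many peers: ", dict(many))
```

With only one peer, all traffic ends up on a single link; with 100 peers
the toy hash splits the flows evenly. Real hash policies (and real
traffic) will be less tidy, but the principle is the same.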

> >
> >
> > Why a single switch and thus a SPoF?
> > Or are you planning to get 2 switches and plan for more clients and
> > Ceph nodes down the road?
> Sorry I wasn't more clear. Yes, 2 48 port switches. And yes, I am 
> planning to add more Ceph nodes. The backend network also runs on only 
> one failover Gigabit interface right now and I'm planning to utilize the 
> 2 remaining interfaces as well.
>
Then MLAG (also called MC-LAG, vLAG or CLAG, depending on the vendor) is
for you.

Also consider a flat network consisting of 4 MC-LAGed interfaces instead of
separate private cluster and client networks.
At 4Gb/s total, your local storage is most likely still going to be faster
than your network bandwidth.
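
As a rough back-of-the-envelope check of the 4Gb/s figure, assuming 4
Gigabit links per node and ignoring protocol overhead (the variable
names are just for illustration):

```python
LINKS = 4            # bonded Gigabit interfaces per node
GBIT_PER_LINK = 1    # raw line rate of each link, in Gb/s

# Aggregate raw bandwidth across all links of the bond.
total_gbit = LINKS * GBIT_PER_LINK
total_mbyte = total_gbit * 1000 / 8          # Gb/s -> MB/s

# A single session still rides on one link only.
per_flow_mbyte = GBIT_PER_LINK * 1000 / 8

print(f"aggregate: {total_gbit} Gb/s = {total_mbyte:.0f} MB/s")
print(f"per flow:  {per_flow_mbyte:.0f} MB/s at best")
```

So the bond tops out at roughly 500 MB/s in aggregate, while any single
session is still limited to about 125 MB/s.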

> >
> > If I were in your shoes, I'd look at 2 switches running MC-LAG (in any
> > of the happy variations there are)
> > https://en.wikipedia.org/wiki/MC-LAG
> >
> > And since you're on a budget, something like the Cumulus based
> > offerings (Penguin computing, etc).
> Thanks, I'll look into it. Never heard of that protocol before.
> 
It's just LACP over multiple switches, giving you full redundancy AND
bandwidth.

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


