> So 3 servers are the entirety of your Ceph storage nodes, right?
Exactly. Plus 3 OpenStack compute nodes.
Have you been able to determine what causes the drops?
My first guess would be that this bonding is simply not compatible with
what the switches can do/expect.
Yeah, something like that. Round-robin load balancing kind of works,
but it's a 'server-side' bonding mode: the switches don't know
anything about that particular configuration.
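For reference, a minimal sketch of that server-side-only setup, assuming
a Debian-style host with ifupdown (the interface names and address below
are made up, adjust to your NICs):

    # /etc/network/interfaces -- hypothetical round-robin bond
    # balance-rr needs nothing configured on the switch, but packets of a
    # single flow are sprayed across both links, so reordering and the
    # same MAC showing up on two switch ports are plausible causes of drops.
    auto bond0
    iface bond0 inet static
        address 192.168.10.11
        netmask 255.255.255.0
        bond-slaves enp3s0 enp4s0
        bond-mode balance-rr
        bond-miimon 100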
LACP isn't round-robin, but it does distribute traffic after a fashion, and
given the fact that it actually works you should try it.
To be more specific, LACP distribution is based on "sessions" (flows), so if
you have enough variety in there you will get something that's good enough.
A single session, however, will not be faster than an individual link, IIRC.
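A minimal sketch of what such an LACP bond looks like on the host side,
again assuming a Debian-style setup with ifupdown (interface names and
address are made up); the switch ports have to be configured as a matching
LACP port-channel/LAG:

    # /etc/network/interfaces -- hypothetical LACP (802.3ad) bond
    auto bond0
    iface bond0 inet static
        address 192.168.10.11
        netmask 255.255.255.0
        bond-slaves enp3s0 enp4s0
        bond-mode 802.3ad
        bond-miimon 100
        bond-lacp-rate fast
        # layer3+4 hashes on IPs and TCP/UDP ports, so different Ceph
        # connections land on different links; the default layer2 policy
        # hashes only MACs and gives far less variety on a flat subnet.
        bond-xmit-hash-policy layer3+4

Since Ceph opens many OSD and client connections in parallel, per-flow
hashing usually balances reasonably well in practice.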
What do you mean by 'variety'? Do you mean I/O?
Why a single switch and thus a SPoF?
Or are you planning to get 2 switches and plan for more clients and Ceph
nodes down the road?
Sorry I wasn't more clear. Yes, two 48-port switches. And yes, I am
planning to add more Ceph nodes. The backend network also runs on
only one failover Gigabit interface right now, and I'm planning to
utilize the 2 remaining interfaces as well.
If I were in your shoes, I'd look at 2 switches running MC-LAG (in any of
the happy variations there are)
https://en.wikipedia.org/wiki/MC-LAG
And since you're on a budget, something like the Cumulus-based offerings
(Penguin Computing, etc.).
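In case it helps while you read up on it: with MC-LAG the host-side
configuration stays an ordinary 802.3ad bond like the sketch above; what
changes is the cabling and the switch side, along these lines (cabling and
port numbers are hypothetical):

    # Same hypothetical bond0 as above, but:
    #   enp3s0 -> switch A, port 12
    #   enp4s0 -> switch B, port 12
    # Switch A and B are paired via MC-LAG (CLAG on Cumulus, MLAG/vPC on
    # other vendors) and advertise a single LACP system ID, so bond0
    # negotiates as if both links went to one switch. You keep the
    # aggregation and lose the single switch as a SPoF.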
Thanks, I'll look into it. Never heard of that protocol before.
Regards
David