Hi,
regarding clustered VyOS on KVM: in theory this sounds like a safe plan,
but it will come with a significant performance penalty because of all the
context switches. Even with PCI passthrough you will still see increased
latency.
Docker/LXC/LXD, on the other hand, don't share the context-switch dilemma.
Not sure whether VyOS likes to run in a Docker container, though.
I haven't had a chance to play with VPP [1] yet, but it sounds like it
could be quite useful for high-performance routing/switching inside a
container.
[1]: https://wiki.fd.io/view/VPP
Cheers, Bastian
On 2016-06-08 09:04, Josef Johansson wrote:
Hi,
Regarding single points of failure with the daemon on the host, I was
thinking about doing a cluster setup with e.g. VyOS in KVM machines on the
host, and letting them handle all the OSPF stuff as well. I haven't done
any performance benchmarks, but it should at least be possible. Maybe it's
even possible to do it in Docker or straight in LXC, since it's mostly
route management in the kernel.
Regards,
Josef
On Mon, 6 Jun 2016, 18:54 Jeremy Hanmer, <jeremy.hanmer@xxxxxxxxxxxxx>
wrote:
We do the same thing. OSPF between ToR switches, BGP to all of the hosts,
with each one advertising its own /32 (each has 2 NICs).
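For illustration, here is a rough sketch of what such a per-host setup
could look like, rendered as a Quagga/FRR-style bgpd fragment from a small
Python helper. All ASNs, peer addresses and the loopback are invented for
the example, not anyone's real configuration:

# Hypothetical addressing: host loopback 10.255.0.11/32, one /31 per NIC
# towards each ToR, and a private ASN per host. Adjust everything to your plan.
HOST_LOOPBACK = "10.255.0.11"
HOST_ASN = 65011
TOR_PEERS = [("10.1.11.0", 65001),   # ToR-A end of the NIC1 /31
             ("10.2.11.0", 65002)]   # ToR-B end of the NIC2 /31

lines = [f"router bgp {HOST_ASN}",
         f" bgp router-id {HOST_LOOPBACK}"]
for peer_ip, peer_asn in TOR_PEERS:
    lines.append(f" neighbor {peer_ip} remote-as {peer_asn}")
lines += [" address-family ipv4 unicast",
          f"  network {HOST_LOOPBACK}/32",
          " exit-address-family"]
print("\n".join(lines))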
On Mon, Jun 6, 2016 at 6:29 AM, Luis Periquito <periquito@xxxxxxxxx>
wrote:
Nick,
TL;DR: works brilliantly :)
Where I work we have all of the Ceph nodes (and a lot of other stuff)
attached to the network via OSPF and BGP. With that we're able to
implement things like anycast addresses for the radosgw service, removing
the need to add load balancers.
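As a minimal sketch of the anycast idea (all addresses invented): every
radosgw host announces its own loopback plus one shared service /32, so
the routers see a single prefix with several live next hops and no
separate load balancer is needed:

import ipaddress

ANYCAST_VIP = ipaddress.ip_interface("10.255.100.1/32")  # shared radosgw address

rgw_hosts = {
    "rgw-a": ipaddress.ip_interface("10.255.0.21/32"),   # per-host loopbacks
    "rgw-b": ipaddress.ip_interface("10.255.0.22/32"),
    "rgw-c": ipaddress.ip_interface("10.255.0.23/32"),
}

# Each host announces its loopback plus the shared VIP; a host that dies (or
# whose radosgw is stopped) simply withdraws the VIP and traffic re-converges
# onto the remaining announcers.
for host, loopback in rgw_hosts.items():
    print(host, "announces", loopback, "and", ANYCAST_VIP)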
The biggest issues we've had were around per-flow vs per-packet traffic
load balancing, but as long as you keep it simple you shouldn't have any
problems.
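To make the per-flow point concrete: ECMP normally hashes the 5-tuple, so
a single TCP stream always sticks to one uplink (and can never exceed that
one link), while per-packet spraying uses both links but reorders packets.
A toy sketch of the flow-hashing behaviour (real switches and the Linux
kernel use their own hash functions):

import hashlib

def ecmp_pick(src, dst, sport, dport, proto, n_links):
    """Pick an uplink for a flow by hashing its 5-tuple (toy version)."""
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    return int(hashlib.sha256(key).hexdigest(), 16) % n_links

# One flow always maps to the same link; many parallel flows spread out,
# which is why multi-stream iperf can fill both links but one stream can't.
print(ecmp_pick("10.255.0.11", "10.255.0.12", 51515, 6800, "tcp", 2))
print(ecmp_pick("10.255.0.11", "10.255.0.12", 51516, 6800, "tcp", 2))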
Currently we have P2P links between the servers and the ToR switches, each
on a /31 subnet, and then create a virtual loopback address, which is the
interface we use for all communications. Running tests like iperf we're
able to reach 19 Gbps (on a 2x10 Gbps network). OTOH we no longer have the
ability to separate traffic between the public and OSD networks, but we've
never really felt the need for it.
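For concreteness, a sketch of that addressing scheme with the Python
ipaddress module; the ranges are invented, the point is just one /31 per
server-to-ToR link plus a /32 loopback per host that the cluster actually
talks to:

import ipaddress

def host_plan(host_index):
    """Invented per-host addressing: two /31 uplinks and one /32 loopback."""
    link_a = ipaddress.ip_network(f"10.1.{host_index}.0/31")   # NIC1 <-> ToR-A
    link_b = ipaddress.ip_network(f"10.2.{host_index}.0/31")   # NIC2 <-> ToR-B
    return {
        "nic1": list(link_a),   # [ToR end, host end] by our convention
        "nic2": list(link_b),
        "lo": ipaddress.ip_interface(f"10.255.0.{host_index}/32"),
    }

print(host_plan(11))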
Also, spend a bit of time planning what the network and its topology will
look like. If done properly (think of details like route summarization),
it's really worth the extra effort.
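A quick example of why that planning pays off: if each rack's loopbacks
come out of one contiguous block, the ToR can announce a single summary
upstream instead of hundreds of /32s (ranges invented):

import ipaddress

# All loopbacks in rack 1 drawn from 10.255.1.0/24 (invented plan).
rack1_loopbacks = [ipaddress.ip_network(f"10.255.1.{i}/32") for i in range(256)]
print(list(ipaddress.collapse_addresses(rack1_loopbacks)))
# -> [IPv4Network('10.255.1.0/24')]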
On Mon, Jun 6, 2016 at 11:57 AM, Nick Fisk <nick@xxxxxxxxxx> wrote:
Hi All,
Has anybody had any experience with running the network routed all the way
down to the host?
I know the standard way most people configure their OSD nodes is to bond
the two NICs, which then talk via a VRRP gateway, and from then on the
networking is all Layer 3. The main disadvantage I see here is that you
need a beefy inter-switch link to cope with the amount of traffic flowing
between switches to the VRRP address. I've been trying to design around
this by splitting hosts into groups with different VRRP gateways on either
switch, but this relies on using active/passive bonding on the OSD hosts
to make sure traffic goes from the correct NIC to the directly connected
switch.
What I was thinking is: instead of terminating the Layer 3 part of the
network at the access switches, terminate it at the hosts. If each NIC of
the OSD host has a different subnet and the actual “OSD server” address is
bound to a loopback adapter, OSPF should advertise this loopback address
as reachable via the two L3 links on the physically attached NICs. This
should give you a redundant topology which also respects your physical
layout and potentially gives you higher performance due to ECMP.
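A rough sketch of the end state on one host (addresses and interface names
invented): with equal OSPF costs on both links, the routing daemon should
end up installing something equivalent to a kernel multipath route towards
each peer loopback, along the lines of the iproute2 command printed below:

peer_loopback = "10.255.0.12"            # another host's "OSD server" /32
nexthops = [("10.1.11.0", "eth0"),       # switch end of this host's first /31
            ("10.2.11.0", "eth1")]       # switch end of the second /31

# What the equivalent static multipath (ECMP) route would look like; in
# practice OSPF (quagga/bird/frr) would install it for you.
cmd = f"ip route add {peer_loopback}/32 " + " ".join(
    f"nexthop via {via} dev {dev} weight 1" for via, dev in nexthops)
print(cmd)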
Any thoughts, any pitfalls?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com