Re: OSPF to the host

Separate OSPF areas would make this unnecessarily complex.  In a world where (some) routers are built to carry an Internet routing table of over half a million prefixes, your few hundred or few thousand /32s represent very little load to a modern network element.
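
To put a rough number on that, here is a back-of-envelope sketch (not a vendor figure; the ~200 bytes of per-route state is purely an assumption) showing that a few thousand host routes are a rounding error next to a full Internet table:

    # Rough sense of scale for carrying every host /32 in one area.
    # BYTES_PER_ROUTE is an assumed, generous figure for LSA + RIB/FIB state.
    BYTES_PER_ROUTE = 200
    host_routes     = 3_000      # "a few thousand" /32 loopbacks
    internet_table  = 500_000    # "over a half million" Internet prefixes

    print(f"host /32 state: ~{host_routes * BYTES_PER_ROUTE / 1e6:.1f} MB")
    print(f"fraction of a full Internet table: {host_routes / internet_table:.1%}")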

The number of links will have more impact on performance, as each flap will cause an SPF recalculation.  Again, this shouldn't be an issue for modern gear, but you'll want to test and tune for convergence time.  I would only really consider multiple areas in a single data center if I couldn't get reconvergence down to something acceptable: generally less than 1 s, and preferably much quicker than that.
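
As a rough illustration of why the tuning matters (a minimal sketch with assumed timer values; check your platform's actual defaults and knobs such as BFD intervals and SPF throttle timers), failure detection usually dominates the convergence budget:

    def convergence_estimate(detection_ms, spf_delay_ms, spf_run_ms, fib_install_ms):
        """Worst-case time from link failure to traffic flowing on the new path."""
        return detection_ms + spf_delay_ms + spf_run_ms + fib_install_ms

    # Classic OSPF defaults: the 40 s dead interval (4 x 10 s hellos) dominates.
    slow = convergence_estimate(detection_ms=40_000, spf_delay_ms=200,
                                spf_run_ms=10, fib_install_ms=50)

    # Tuned: fast failure detection (e.g. BFD at 3 x 100 ms, or link-down on
    # point-to-point links) plus an aggressive initial SPF delay.
    fast = convergence_estimate(detection_ms=300, spf_delay_ms=50,
                                spf_run_ms=10, fib_install_ms=50)

    print(f"default timers: ~{slow / 1000:.1f} s")   # ~40.3 s, nowhere near 1 s
    print(f"tuned timers:   ~{fast / 1000:.2f} s")   # ~0.41 s, under the 1 s target

In practice that means fast failure detection (BFD, or relying on physical link-down on point-to-point links) plus a conservative SPF throttle is what gets you under a one-second target; area design barely moves the needle.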

-H



> On Jun 6, 2016, at 07:02, Nick Fisk <nick@xxxxxxxxxx> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: Luis Periquito [mailto:periquito@xxxxxxxxx]
>> Sent: 06 June 2016 14:30
>> To: Nick Fisk <nick@xxxxxxxxxx>
>> Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
>> Subject: Re:  OSPF to the host
>> 
>> Nick,
>> 
>> TL;DR: works brilliantly :)
> 
> Excellent, just what I wanted to hear!!!
> 
>> 
>> Where I work we have all of the ceph nodes (and a lot of other stuff) using
>> OSPF and BGP server attachment. With that we're able to implement
>> solutions like Anycast addresses, removing the need to add load balancers,
>> for the radosgw solution.
>> 
>> The biggest issues we've had were around the per-flow vs per-packets traffic
>> load balancing, but as long as you keep it simple you shouldn't have any
>> issues.
>> 
>> Currently we have a P2P network between the servers and the ToR switches
>> on a /31 subnet, and then create a virtual loopback address, which is the
>> interface we use for all communications. Running tests like iperf we're able
>> to reach 19Gbps (on a 2x10Gbps network). OTOH we no longer have the
>> ability to separate traffic between public and osd network, but never really
>> felt the need for it.
> 
> Yeah, I've come to pretty much the same conclusion.
> 
>> 
>> Also spend a bit of time planning how the network will look like and it's
>> topology. If done properly (think details like route summarization) then it's
>> really worth the extra effort.
> 
> How are you doing the route summarization with OSPF? Is each rack (for example) a separate OSPF area, which you then summarize and send up to area 0?
> 
>> 
>> 
>> 
>> On Mon, Jun 6, 2016 at 11:57 AM, Nick Fisk <mailto:nick@xxxxxxxxxx> wrote:
>> Hi All,
>> 
>> Has anybody had any experience with running the network routed down all
>> the way to the host?
>> 
>> I know the standard way most people configured their OSD nodes is to bond
>> the two nics which will then talk via a VRRP gateway and then probably from
>> then on the networking is all Layer3. The main disadvantage I see here is that
>> you need a beefy inter switch link to cope with the amount of traffic flowing
>> between switches to the VRRP address. I’ve been trying to design around
>> this by splitting hosts into groups with different VRRP gateways on either
>> switch, but this relies on using active/passive bonding on the OSD hosts to
>> make sure traffic goes from the correct Nic to the directly connected switch.
>> 
>> What I was thinking, instead of terminating the Layer3 part of the network at
>> the access switches, terminate it at the hosts. If each Nic of the OSD host had
>> a different subnet and the actual “OSD Server” address bound to a loopback
>> adapter, OSPF should advertise this loopback adapter address as reachable
>> via the two L3 links on the physically attached Nic’s. This should give you a
>> redundant topology which also will respect your physically layout and
>> potentially give you higher performance due to ECMP.
>> 
>> Any thoughts, any pitfalls?
>> 
>> 
>> 
>> _______________________________________________
>> ceph-users mailing list
>> mailto:ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



