Sage, thanks for your feedback, please see below:

On Thu, 2015-12-03 at 13:30 -0800, Sage Weil wrote:
> On Thu, 3 Dec 2015, wido@xxxxxxxx wrote:
> > Why all the trouble and complexity? I personally always try to avoid the
> > two networks and run with one. Also in large L3 envs.
> >
> > I like the idea that one machine has one IP I have to monitor.
> >
> > I would rethink about what a cluster network really adds. Imho it only
> > adds complexity.
>
> FWIW I tend to agree. There are probably some network deployments where
> it makes sense, but for most people I think it just adds complexity.
> Maybe it makes it easy to utilize dual interfaces, but my guess is you're
> better off bonding them if you can.

Adding to my response to Wido: in our case it is not separate physical
interfaces either. (Separate interfaces would give good QoS, but you would
lose statistical-multiplexing gains and redundancy.) ToR <-> host in our
case is bonded or equivalent.

> Note that on a largish cluster the public/client traffic is all
> north-south, while the backend traffic is also mostly north-south to the
> top-of-rack and then east-west. I.e., within the rack, almost everything
> is north-south, and client and replication traffic don't look that
> different.

This is one of the bigger challenges in this problem domain. In one of our
clusters the hosts only have 2x1GbE, so I worry about network timeouts for
critical cluster traffic.

In other words, I want to prioritize/guarantee/reserve a minimum amount of
bandwidth for cluster health traffic first and for cluster replication
second; client write replication should have the lowest priority.

To support this I need our network equipment to do the CoS job, and for
that to work, traffic has to be classifiable somewhere in the stack. And
I'd like to do this with as little added state as possible.
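
To make the classification point concrete, here is a minimal sketch
(illustrative only, not Ceph code; the CS6 codepoint and the plain TCP
socket are just example choices) of how a daemon could mark its
cluster-facing sockets with a DSCP value via setsockopt(IP_TOS). The
switches can then schedule on the DSCP field alone, so no per-flow state
has to be added in the network:

    /* Illustrative sketch: mark a socket's outgoing packets with a DSCP
     * codepoint so network equipment can apply CoS to them. */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/ip.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) {
            perror("socket");
            return 1;
        }

        /* DSCP is the upper 6 bits of the TOS byte; CS6 (48) << 2 = 0xC0.
         * The real codepoint would be whatever the network side agrees
         * to trust for cluster-health traffic. */
        int tos = 48 << 2;
        if (setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) < 0) {
            perror("setsockopt IP_TOS");
            return 1;
        }

        printf("socket marked with DSCP CS6\n");
        return 0;
    }

/Martin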