Re: OSD public / cluster network isolation using VRF:s

Kyle Bader <kyle.bader@xxxxxxxxx> · Mon, 14 Dec 2015 10:31:34 -0800

On Mon, Dec 7, 2015 at 6:10 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Mon, 7 Dec 2015, Martin Millnert wrote:
>> > Note that on a largish cluster the public/client traffic is all
>> > north-south, while the backend traffic is also mostly north-south to the
>> > top-of-rack and then east-west.  I.e., within the rack, almost everything
>> > is north-south, and client and replication traffic don't look that
>> > different.
>>
>> This problem domain is one of the larger challenges. I worry about
>> network timeouts for critical cluster traffic in one of the clusters due
>> to hosts having 2x1GbE. I.e. in our case I want to
>> prioritize/guarantee/reserve a minimum amount of bandwidth for cluster
>> health traffic primarily, and secondarily cluster replication. Client
>> write replication should then be least prioritized.
>
> One word of caution here: the health traffic should really be the
> same path and class of service as the inter-osd traffic, or else it
> will not identify failures.  e.g., if the health traffic is prioritized,
> and lower-priority traffic is starved/dropped, we won't notice.
>
>> To support this I need our network equipment to perform the CoS job, and
>> in order to do that at some level in the stack I need to be able to
>> classify traffic. And furthermore, I'd like to do this with as little
>> added state as possible.
>
> I seem to recall a conversation a year or so ago about tagging
> streams/sockets so that the network layer could do this.  I don't think
> we got anywhere, though...

We talked about it; I think this is the issue that was opened as a result:

http://tracker.ceph.com/issues/12260
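
For anyone wondering what "tagging sockets" could look like in practice,
here is a minimal sketch, assuming Linux and IPv4, of marking a socket with
a DSCP code point through the standard IP_TOS socket option. It is not how
Ceph does it, and it is not taken from the tracker issue. The class names,
the DSCP values, and the helpers dscp_to_tos() and tag_socket() are all
hypothetical. Heartbeats are deliberately put in the same class as
replication, following Sage's caution above.

```python
import socket

# Hypothetical traffic classes and DSCP code points (assumptions for
# illustration only, not Ceph settings). Heartbeats share the replication
# class so that a starved replication path is still detected.
DSCP_BY_CLASS = {
    "client": 0,        # best effort
    "replication": 18,  # AF21
    "heartbeat": 18,    # same class as inter-OSD traffic
}


def dscp_to_tos(dscp):
    """Return the TOS byte for a DSCP value (DSCP is the upper six bits)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def tag_socket(sock, traffic_class):
    """Mark an IPv4 socket with the DSCP value of the given traffic class."""
    tos = dscp_to_tos(DSCP_BY_CLASS[traffic_class])
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return tos


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tag_socket(s, "replication")
    print(s.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))
    s.close()
```

With a marking like this the switches could classify on the DSCP field
alone, without keeping per-flow state, which is the property Martin asked
for.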

-- 

Kyle Bader