Hi Stefan - Thanks for the response; my answers are inline below.
Regards
Radha Krishnan S
TCS Enterprise Cloud Practice
Tata Consultancy Services
Cell:- +1 848 466 4870
Mailto: radhakrishnan2.s@xxxxxxx
Website: http://www.tcs.com
____________________________________________
Experience certainty. IT Services
Business Solutions
Consulting
____________________________________________
-----"Stefan Kooman" <stefan@xxxxxx> wrote: -----
To: "Radhakrishnan2 S" <radhakrishnan2.s@xxxxxxx>
From: "Stefan Kooman" <stefan@xxxxxx>
Date: 12/27/2019 02:28PM
Cc: ceph-users@xxxxxxxxxxxxxx, "ceph-users" <ceph-users-bounces@xxxxxxxxxxxxxx>
Subject: Re: Architecture - Recommendations
From: "Stefan Kooman" <stefan@xxxxxx>
Date: 12/27/2019 02:28PM
Cc: ceph-users@xxxxxxxxxxxxxx, "ceph-users" <ceph-users-bounces@xxxxxxxxxxxxxx>
Subject: Re: Architecture - Recommendations
"External email. Open with Caution"
Quoting Radhakrishnan2 S (radhakrishnan2.s@xxxxxxx):
> Hello Everyone,
>
> We have a pre-prod Ceph cluster and working towards a production cluster
> deployment. I have the following queries and request all your expert tips,
>
>
> 1. Network architecture - We are looking for a private and public network,
> plan is to have L2 at both the networks. I understand that Object / S3
> needs L3 for tenants / users to access outside the network / overlay. What
> would be your recommendations to avoid any network related latencies, like
> should we have a tiered network ? We are intending to go with the standard
> Spine leaf model, with dedicated TOR for Storage and dedicated leafs for
> Clients/ Hypervisors / Compute nodes.
leaf-spine is fine. Are you planning on a big setup? How many nodes?
Radha: The Ceph cluster would start with 20 OSDs per availability zone.
Leaf-spine can scale well so this shouldn't be a problem. Network
latency won't be the bottleneck of your Ceph cluster, Ceph will be.
I would advise against a public / private network. It makes things more
complicated than needed (and some issues can be hard to debug when
network is partially up).
Radha: All community recommendations are to segregate the networks; are there any specific issues you have experienced with separate networks? If we have to make it one common network, aren't there performance issues?
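For reference, the split is only a couple of lines in ceph.conf; a minimal sketch (subnets are placeholders), with the cluster_network line only present if you do split the networks:

  [global]
      public_network  = 10.10.10.0/24
      # cluster_network = 10.10.20.0/24   # omit this line for one common network

With a single common network, replication and client traffic share the same links, so the links simply need to be sized for both.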
> 2. Node Design - We are planning to host nodes with mixed set of drives
> like NVMe, SSD and NL-SAS all in one node in a specific ratio. This is
> only to avoid any choking of CPU due to the high performance nodes. Please
> suggest your opinion.
Don't mix if you don't need to. You can optimize hardware according to
your needs: Less heavy CPU for spinners, beefier CPU for NVME. Why would
you want to put it all in one box? If you are planning to use "NVMe" ..
why bother with SSD? NVMe drives are sometimes even cheaper than SSD nowadays.
You might use NVMe for a couple of spinners to put their WAL / DB on.
It's generally better to have more smaller nodes than a few big nodes.
Ideally you don't want to lose more than 10% of your cluster when a node
goes down (12 nodes and up, more is better).
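As a rough illustration of that WAL/DB layout, something like the following ceph-volume invocation could be used (device paths are placeholders; assumes a ceph-volume recent enough to support --db-devices):

  ceph-volume lvm batch --bluestore /dev/sdb /dev/sdc /dev/sdd /dev/sde --db-devices /dev/nvme0n1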
Radha: We are a CSP and want to offer three different block tiers through RBD, presented via Cinder. As you know, applications and the business will need options to accommodate various types of workloads.
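A sketch of how such tiers could be mapped onto CRUSH device classes, assuming the OSDs report the nvme / ssd / hdd classes (rule names, pool name and PG counts below are made up for illustration):

  ceph osd crush rule create-replicated rule-nvme default host nvme
  ceph osd crush rule create-replicated rule-ssd  default host ssd
  ceph osd crush rule create-replicated rule-hdd  default host hdd
  ceph osd pool create rbd-tier1 128 128 replicated rule-nvme
  rbd pool init rbd-tier1

Each pool can then be exposed as a separate Cinder backend / volume type.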
> 3, S3 Traffic - What is the secured way to provide object storage in a
> multi tenant environment since LB/ RGW-HA'd, is going to be in an underlay
> that can't be exposed to clients/ users in the tenant network. Is there a
> way to add an external IP as VIP to LB/RGW that could be commonly used by
> all tenants ?
Underlay / overlay ... are you going to use BGP EVPN (over VXLAN)? In
that case you would have the ceph nodes in the overlay ... You can put a
LB / Proxy up front (Varnish, ha-proxy, nginx, relayd, etc.)... (outside
of Ceph network) and connect over HTTP to the RGW nodes ... which can
reach the Ceph network (or are even part of it) on the backend.
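For example, a minimal haproxy sketch along those lines (the VIP, certificate path and RGW addresses are placeholders; RGW assumed on its default port 7480):

  frontend s3
      bind 203.0.113.10:443 ssl crt /etc/haproxy/s3.pem
      default_backend rgw

  backend rgw
      balance roundrobin
      server rgw1 10.10.10.21:7480 check
      server rgw2 10.10.10.22:7480 check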
Radha: I'm sure we are using BGP EVPN over VXLAN, but all deployments are through the infrastructure management network. We are a CSP, and overlay means the tenant network; if the Ceph nodes are in the overlay, then multiple tenants will need to be able to communicate with the Ceph nodes. If the LB is outside the Ceph network, let's say XaaS, won't routing across networks create a bottleneck? I'm a novice in networking, so a reference architecture would be of great help.
Gr. Stefan
--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info@xxxxxx