Re: Latency for the Public Network

Christian Balzer <chibi@xxxxxxx> · Tue, 6 Feb 2018 18:02:04 +0900

Hello,

On Tue, 6 Feb 2018 09:21:22 +0100 Tobias Kropf wrote:

> On 02/06/2018 04:03 AM, Christian Balzer wrote:
> > Hello,
> >
> > On Mon, 5 Feb 2018 22:04:00 +0100 Tobias Kropf wrote:
> >  
> >> Hi ceph list,
> >>
> >> we have a hyperconverged Ceph cluster with KVM on 8 nodes, running Ceph
> >> Hammer 0.94.10.   
> > Do I smell Proxmox?  
> Yes, we are using Proxmox at the moment.
> >  
> >> The cluster is now 3 years old and we are planning a new
> >> cluster for a high-IOPS project. We use replicated pools (3/2) and the
> >> latency on our switch backend is not the best.
> >>
> >>
> >> ping -s 8192 10.10.10.40 
> >>
> >> 8200 bytes from 10.10.10.40: icmp_seq=1 ttl=64 time=0.153 ms
> >>  
> > Not particularly great, yes.
> > However, your network latency is only one factor; the Ceph OSDs add
> > another layer of latency and usually affect IOPS even more.
> > For high IOPS you of course need fast storage, network AND CPUs.   
> Yes, we know that... the network is our first job. We are planning new
> hardware for the mon and OSD services, with a lot of NVMe flash disks and
> high-GHz CPUs.
> >  
> >> We plan to split the hyperconverged setup into storage and compute nodes
> >> and want to separate the Ceph cluster and public networks: the cluster
> >> network on 40 Gbit Mellanox switches and the public network on the
> >> existing 10 Gbit switches.
> >>  
> > You'd do a lot better if you were to go all 40Gb/s and forget about
> > splitting networks.   
> So use the public and cluster networks over the same NICs and the same subnet?

Yes, at least for NICs. 
If for some reason your compute nodes have no dedicated links/NICs for the
Ceph cluster and it makes you feel warm and fuzzy, you can segregate
traffic with VLANs. 
But in most cases that really comes down to "security theater": if a
compute node gets compromised, the attacker has access to your Ceph cluster
network anyway.

When looking at the ML archives you'll find a number of people suggesting
to keep things simple unless there is a specific need to do otherwise. 

> >
> > The faster replication network will:
> > a) be underutilized all of the time in terms of bandwidth 
> > b) not help with read IOPS at all
> > c) still be hobbled by the public network latency when it comes to write
> > IOPS (but of course help in regards to replication latency). 
> >  
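
To put rough numbers on b) and c), here is a minimal sketch. It assumes a
simplified model in which a read costs one public-network round trip plus OSD
service time, and a replicated write additionally waits for one
cluster-network round trip to the replicas. The 0.153 ms figure is the ping
result quoted above; the cluster RTT and OSD service time are made-up
placeholders, not measurements, and the function names are purely
illustrative.

```python
# Simplified latency model. The function names and the example values for
# cluster RTT and OSD service time are illustrative assumptions.


def read_latency_ms(public_rtt_ms: float, osd_ms: float) -> float:
    """Read: client <-> primary OSD over the public network only."""
    return public_rtt_ms + osd_ms


def write_latency_ms(public_rtt_ms: float, cluster_rtt_ms: float,
                     osd_ms: float) -> float:
    """Replicated write: public round trip, plus the primary waiting for
    its replicas (contacted in parallel) over the cluster network."""
    return public_rtt_ms + cluster_rtt_ms + osd_ms


def max_serial_iops(latency_ms: float) -> float:
    """Upper bound on IOPS for a single client at queue depth 1."""
    return 1000.0 / latency_ms


PUBLIC_RTT_MS = 0.153   # from the ping output in this thread
CLUSTER_RTT_MS = 0.05   # hypothetical faster 40Gb/s link
OSD_MS = 0.5            # hypothetical OSD service time

if __name__ == "__main__":
    cases = (
        ("read", read_latency_ms(PUBLIC_RTT_MS, OSD_MS)),
        ("write, split networks",
         write_latency_ms(PUBLIC_RTT_MS, CLUSTER_RTT_MS, OSD_MS)),
        ("write, everything on the public network",
         write_latency_ms(PUBLIC_RTT_MS, PUBLIC_RTT_MS, OSD_MS)),
    )
    for name, latency in cases:
        print(f"{name}: {latency:.3f} ms, "
              f"at most {max_serial_iops(latency):.0f} IOPS at QD1")
```

In this model the public RTT shows up in every case, and the cluster RTT
does not appear in the read path at all. The faster cluster network only
shortens the replication leg of a write.
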
> >> Now my question... are 0.153 ms - 0.170 ms fast enough for the public
> >> network? We must deploy a setup with 1500 - 2000 terminal servers....
> >>  
> > Define terminal server, are we talking Windows Virtual Desktops with RDP?
> > Windows is quite the hog when it comes to I/O.  
> Yes, we are talking about Windows virtual desktops with RDP....
> Our calculation is: 1x DC = 60-80 IOPS, 1x TS = 60-80 IOPS, N users * 10
> IOPS ...
> 
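
A minimal sketch of that calculation, assuming it is meant as a base load
of 60-80 IOPS per domain controller and per terminal server, plus 10 IOPS
per user. The function names, the DC count and the users-per-server figure
in the example are hypothetical; the thread does not state them.

```python
# Rough aggregate IOPS estimate. This is one possible reading of the
# formula quoted above, not a confirmed sizing method.


def estimate_iops(n_dc: int, n_ts: int, n_users: int,
                  base_iops: int, per_user_iops: int = 10) -> int:
    """Base load for every DC and TS, plus a per-user load."""
    return (n_dc + n_ts) * base_iops + n_users * per_user_iops


def estimate_range(n_dc: int, n_ts: int, n_users: int) -> tuple:
    """Low and high estimate using the 60-80 IOPS base range."""
    return (estimate_iops(n_dc, n_ts, n_users, base_iops=60),
            estimate_iops(n_dc, n_ts, n_users, base_iops=80))


if __name__ == "__main__":
    # Hypothetical inputs: 2 DCs, 1500 terminal servers, 20 users each.
    low, high = estimate_range(n_dc=2, n_ts=1500, n_users=1500 * 20)
    print(f"estimated load: {low} - {high} IOPS")
```

The result is an average steady-state figure only. Boot or login storms
would need to be sized separately.
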
> For this system we want to work with cache tiering, with NVMe disks in
> front and SATA disks in an EC pool. Is it a good idea to use cache
> tiering in this setup?
> 
Depends on the size of your cache-tier, really.
I have done no analysis of Windows I/O behavior, other than noticing that it
is insanely swap happy without need, so if you can, eliminate the pagefile. 

If all your typical writes can be satisfied from the cache-tier, good.
Reads (like OS boot, etc.) should be fine from the EC pool, so run the
cache-tier in read-forward mode. 

But you _really_ need to test this; a cache-tier that does not fit your
workload can be worse than no cache at all.
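
As a first sanity check before real testing, here is a rough sketch of the
"does it fit" question. It assumes the cache pool's target_max_bytes and
cache_target_full_ratio settings bound how much data the tier holds before
the tiering agent starts evicting clean objects. The working-set size is
something you would have to measure yourself; the function names and all
numbers below are hypothetical.

```python
# Back-of-the-envelope cache-tier fit check. The 0.8 ratio and the sizes
# used in the example are assumptions, not recommendations.

TIB = 2 ** 40


def usable_cache_bytes(target_max_bytes: int,
                       cache_target_full_ratio: float = 0.8) -> int:
    """Bytes the tier can hold before eviction of clean objects begins."""
    return int(target_max_bytes * cache_target_full_ratio)


def fits_in_cache(working_set_bytes: int, target_max_bytes: int,
                  cache_target_full_ratio: float = 0.8) -> bool:
    """True if the hot working set stays below the eviction threshold."""
    return working_set_bytes <= usable_cache_bytes(
        target_max_bytes, cache_target_full_ratio)


if __name__ == "__main__":
    # Hypothetical: 3 TiB of hot data against a 4 TiB target_max_bytes.
    print(fits_in_cache(working_set_bytes=3 * TIB, target_max_bytes=4 * TIB))
```

A "fits" result here is necessary but far from sufficient. It says nothing
about promotion and flush traffic under your actual I/O pattern, which is
why testing with a realistic workload is still needed.
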

Christian

> 
> >
> > Regards,
> >
> > Christian  
> 

-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Rakuten Communications
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com