> On 5 July 2017 at 12:39, ceph@xxxxxxxxxxxxxx wrote:
>
>
> Beware, a single 10G NIC is easily saturated by a single NVMe device

Yes, it is. But that is exactly what I'm pointing at: bandwidth is usually not the problem, latency is.

Take a look at any Ceph cluster running out there: it is probably doing a lot of IOps, but not that much bandwidth.

A production cluster I took a look at:

"client io 405 MB/s rd, 116 MB/s wr, 12211 op/s rd, 13272 op/s wr"

This cluster is 15 machines with 10 OSDs (SSD, PM863a) each, so 405/15 = 27 MB/sec of read traffic per machine.

It's doing 13k IOps now; that increases to 25k during higher load, but the bandwidth stays below 500 MB/sec in TOTAL.

So yes, you are right, an NVMe device can saturate a single NIC, but most of the time latency and IOps are what count, not bandwidth.

Wido

> On 05/07/2017 11:54, Wido den Hollander wrote:
> >
> >> On 5 July 2017 at 11:41, "Van Leeuwen, Robert" <rovanleeuwen@xxxxxxxx> wrote:
> >>
> >>
> >> Hi Max,
> >>
> >> You might also want to look at the PCIe lanes.
> >> I am not an expert on the matter, but my guess would be that 8 NVMe drives + 2x 100Gbit would be too much for
> >> the current Xeon generation (40 PCIe lanes) to fully utilize.
> >>
> >
> > Fair enough, but you might want to think about whether you really, really need 100Gbit. Those cards are expensive, and the same goes for the transceivers and switches.
> >
> > Storage is usually latency bound, not so much bandwidth bound. IMHO a lot of people focus on raw TBs and bandwidth, but in the end IOps and latency are what usually matter.
> >
> > I'd probably stick with 2x 10Gbit for now and spend the money saved on more memory and faster CPUs.
> >
> > Wido
> >
> >> I think the upcoming AMD/Intel offerings will improve that quite a bit, so you may want to wait for those.
> >> As mentioned earlier, single-core CPU speed matters for latency, so you probably want to up that.
> >>
> >> You can also look at the DIMM configuration.
> >> TBH I am not sure how much it impacts Ceph performance, but having just 2 DIMM slots populated will not give you maximum memory bandwidth.
> >> Having some extra memory for read cache probably won't hurt either (unless you know your workload won't include any cacheable reads).
> >>
> >> Cheers,
> >> Robert van Leeuwen
> >>
> >> From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Massimiliano Cuttini <max@xxxxxxxxxxxxx>
> >> Organization: PhoenixWeb Srl
> >> Date: Wednesday, July 5, 2017 at 10:54 AM
> >> To: "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
> >> Subject: New cluster - configuration tips and recommendation - NVMe
> >>
> >>
> >> Dear all,
> >>
> >> Luminous is coming, and soon we should be able to avoid double writes.
> >> That means using 100% of the speed of SSDs and NVMe drives.
> >> Clusters made entirely of SSDs and NVMe will no longer be penalized and will start to make sense.
> >>
> >> Looking ahead, I'm building the next storage pool, which we'll set up next term.
> >> We are considering a pool of 4 nodes with the following single-node configuration:
> >>
> >> * 2x E5-2603 v4 - 6 cores - 1.70GHz
> >> * 2x 32GB of RAM
> >> * 2x NVMe M.2 for the OS
> >> * 6x NVMe U.2 for OSDs
> >> * 2x 100Gbit Ethernet cards
> >>
> >> We are not yet sure which Intel CPUs and how much RAM we should put in to avoid a CPU bottleneck.
> >> Can you help me choose the right pair of CPUs?
> >> Do you see any issues with the proposed configuration?
> >>
> >> Thanks,
> >> Max
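
To make the bandwidth-vs-latency point above concrete, here is a minimal back-of-the-envelope sketch in Python. It only uses the cluster numbers quoted in the thread; the usable 10GbE throughput figure is an assumption, and replication traffic between OSDs is deliberately left out:

    # Back-of-the-envelope check of the client-io numbers quoted above.
    # ASSUMPTION (not from the thread): a 10GbE link delivers roughly
    # 1100 MB/s of usable throughput after protocol overhead.
    # Replication traffic between OSDs is not counted here.

    nodes = 15                # machines in the quoted production cluster
    client_rd_mb_s = 405      # "client io 405 MB/s rd"
    client_wr_mb_s = 116      # "116 MB/s wr"
    nic_usable_mb_s = 1100    # assumed usable 10GbE throughput

    per_node = (client_rd_mb_s + client_wr_mb_s) / nodes
    print(f"client traffic per node: {per_node:.0f} MB/s")               # ~35 MB/s
    print(f"share of one 10GbE link: {per_node / nic_usable_mb_s:.1%}")  # ~3%

Even counting reads and writes together, client traffic uses only a few percent of a single 10GbE link per node, which is why the discussion keeps coming back to IOps and latency rather than bandwidth.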
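
Robert's PCIe-lane concern can be sanity-checked the same way. The per-device lane counts below are typical values assumed for illustration, not figures taken from the thread:

    # Rough PCIe lane budget for the proposed node, following Robert's remark.
    # ASSUMPTIONS (typical values, not confirmed in the thread): each NVMe
    # M.2/U.2 drive uses a PCIe 3.0 x4 link, each 100GbE NIC uses x16, and an
    # E5 v4 Xeon provides 40 lanes per socket.

    lanes_per_cpu = 40
    cpus = 2
    nvme_drives = 6 + 2        # 6x U.2 for OSDs + 2x M.2 for the OS
    nics_100g = 2

    lanes_needed = nvme_drives * 4 + nics_100g * 16
    lanes_available = lanes_per_cpu * cpus

    print(f"lanes needed:    {lanes_needed}")      # 64
    print(f"lanes available: {lanes_available}")   # 80, split across two sockets
    # In aggregate the lanes fit, but 64 exceeds a single socket's 40 lanes,
    # so device placement across the two CPUs (and any chipset-attached M.2
    # slots) still matters for sustained throughput.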