Re: Performance impact of Heterogeneous environment



On 17.01.24 11:13, Tino Todino wrote:
> Hi folks.
> I had a quick search but found nothing concrete on this so thought I would ask.
> We currently have a 4-host Ceph cluster with an NVMe pool (1 OSD per host) and an HDD pool (1 OSD per host). Both OSDs use a separate NVMe for DB/WAL. These machines are identical (homogeneous): Ryzen 7 5800X with 64GB DDR-3200 RAM. The NVMes are 1TB Seagate IronWolf and the HDDs are 16TB Seagate IronWolf.
> We are wanting to add more nodes, mainly for capacity and resilience reasons. We have an old 3-node cluster of Dell R740 servers that could be added to this Ceph cluster. Instead of DDR4 they use DDR3 (although 1.5TB each!!), and instead of Ryzen 7 5800X CPUs they use old Intel Xeon E5-4657L v2 CPUs (96 cores at 2.4GHz).
> What would be the performance impact of adding these three nodes with the same OSD layout (i.e. 1 NVMe OSD and 1 HDD OSD per host, with a separate NVMe for DB/WAL)?
> Would we get overall better performance or worse?  Can weighting be used to mitigate performance penalties and if so is this easy to configure?

What will happen is that Ceph will distribute PGs across your cluster uniformly by default, so some requests will be answered by PGs on the Ryzen nodes and others by the Xeons. Presumably the Xeons will be slower, given their lower clock speed. The net effect will be jitter in completion latencies -- whether that's acceptable depends on your workloads. Note that latencies are sensitive to clock speed; the huge amount of RAM on the Xeons won't help much for OSDs, and since your cluster is small, MONs/MGRs won't need that much either. See the memory section of the Ceph hardware recommendations documentation for tuning advice.
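On the memory point: the main per-OSD knob is `osd_memory_target` (default 4 GiB). A minimal config sketch, assuming you wanted to give the OSDs on the big-RAM Xeon nodes a larger cache -- the value here is illustrative, not a recommendation:

```
# ceph.conf fragment (or apply at runtime with `ceph config set osd.<id> ...`)
[osd]
# Target total memory per OSD daemon; default is 4 GiB (4294967296).
osd_memory_target = 8589934592   # 8 GiB, illustrative value only
```

Raising this mostly grows the BlueStore cache; it won't compensate for slower CPU clocks on the latency path.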

You can influence the distribution by weighting, but this only changes how often requests land on the slow nodes, not how slow those requests are -- i.e. the frequency of the jitter, not its magnitude. (If you weight the Xeons down to 0 the jitter disappears entirely, but that's not very useful :-)).
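For concreteness, weighting is done per OSD via CRUSH. A sketch, assuming a hypothetical osd.4 on a Xeon node whose CRUSH weight (conventionally the capacity in TiB) you want to halve so it receives roughly half as many PGs -- run against a live cluster, so the OSD id and weight are placeholders:

```shell
# Show current CRUSH weights and PG counts per OSD/host
ceph osd df tree

# Halve the CRUSH weight of a (hypothetical) OSD on a slower node
ceph osd crush reweight osd.4 8.0

# Re-check the distribution after rebalancing settles
ceph osd df tree
```

Note that lowering the weight also lowers the usable capacity contributed by that OSD, which works against the stated goal of adding the nodes for capacity.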

One thing that I've heard people do, but haven't done personally, with fast NVMes (I'm not familiar with the IronWolf, so not sure if they qualify) is to partition them so that more than one OSD (say 2 to 4) runs on a single NVMe, to better utilize the NVMe bandwidth.
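For reference, `ceph-volume` supports this directly via its batch mode. A sketch with a placeholder device path -- this provisions OSDs and wipes the device, so try it on spare hardware first:

```shell
# Dry-run first to see what would be created (no changes made)
ceph-volume lvm batch --report --osds-per-device 2 /dev/nvme0n1

# Actually create two OSDs sharing the one NVMe device
ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1
```

Whether this helps depends on whether a single OSD daemon is actually CPU-bound before the drive saturates; on slower CPUs like those Xeons the extra daemons may just contend.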

> On performance, I would deem it OK for our use case currently (VM disks), as we are running on a 10GbE network (with dedicated NICs for the public and cluster networks).
> Many thanks in advance
> Tino
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
