Re: which of cpu frequency and number of threads servers osd better?

We have 12TB HDD OSDs with 32GiB of (Optane) NVMe for block.db, used
for cephfs_data pools, and NVMe-only OSDs used for cephfs_data pools.
The NVMe DB roughly doubled our random IO performance (a great
investment), but it also doubled max CPU load. We had to raise
osd_op_num_threads_per_shard_hdd from 1 to 2 (2 is the default for
SSDs). This didn't noticeably improve performance, but without it,
OSDs under max load would sometimes fail to respond to heartbeats. So
with the load that we have (millions of mostly small files on CephFS)
I wouldn't go below 2 real cores per OSD. But this may be a fringe
workload.

On Fri, Nov 13, 2020 at 3:36 AM Frank Schilder <frans@xxxxxx> wrote:
>
> > If each OSD requires 4T
>
> Nobody said that. What was said is HDD=1T,  SSD=3T. It depends on the drive type!
>
> The %-utilisation figures are simply what I observed in top during heavy load. They do not show how the kernel schedules things on physical Ts, so 2x50% utilisation could run on the same HT. I don't know how the OSDs are organised into threads; I'm just stating observations from real life (mimic cluster). For an SSD OSD I have seen a maximum of 4 threads in R state, two at 100% and two at 50% CPU, a load that fits on 3HT.
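
For anyone who wants to repeat this kind of observation without watching top, here is a minimal sketch that counts the threads of one process that are in R state at a single instant. It assumes a Linux /proc filesystem; the function names are made up for illustration, and a one-off snapshot is much cruder than the long-term peak monitoring described above.

```python
from pathlib import Path


def thread_state(stat_line):
    """Extract the state letter (R, S, D, ...) from a /proc/<pid>/task/<tid>/stat line.

    The thread name sits in parentheses and may itself contain spaces or ')',
    so split on the last ')' and take the first field after it.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def running_threads(pid):
    """Count threads of `pid` that are in R state right now (one snapshot)."""
    count = 0
    for stat in Path(f"/proc/{pid}/task").glob("*/stat"):
        try:
            if thread_state(stat.read_text()) == "R":
                count += 1
        except OSError:
            # The thread exited between listing the directory and reading it.
            pass
    return count
```

Calling running_threads() with the PID of a ceph-osd process repeatedly during heavy load, and keeping the maximum, gives a rough equivalent of the "threads in R state" figure. It says nothing about the CPU percentage of each thread.
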
>
> So, real life says 1HT per HDD and 3HT per SSD plus a bit for kernel and networking and you are set - based on worst-case performance monitoring I have seen in 2 years. Note that this is worst-case load. The average load is much lower.
>
> A 16 core machine is totally overpowered. Assuming 1C=2HT, I count (2*3+8*1)/2=7 or (1*3+10*1)/2=6.5. So an 8 core CPU should do in either case. A 10 core CPU might be better, but 16C is a waste of money.
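
The arithmetic above can be written down as a small sketch. The per-OSD thread figures (1 HT per HDD OSD, 3 HT per SSD OSD) and the 1C=2HT assumption are the worst-case observations quoted in this thread, not official Ceph sizing guidance; the names HT_PER_OSD and cores_needed are made up for illustration.

```python
# Worst-case hyper-threads per OSD, as observed in this thread (an assumption,
# not a Ceph default): HDD = 1 HT, SSD = 3 HT.
HT_PER_OSD = {"hdd": 1, "ssd": 3}


def cores_needed(osd_counts, ht_per_core=2):
    """Return the worst-case number of physical cores for the OSDs alone.

    osd_counts maps a drive type ("hdd" or "ssd") to the number of OSDs.
    Kernel and networking overhead is NOT included; add headroom on top.
    """
    threads = sum(HT_PER_OSD[kind] * count for kind, count in osd_counts.items())
    return threads / ht_per_core


print(cores_needed({"ssd": 2, "hdd": 8}))   # 7.0
print(cores_needed({"ssd": 1, "hdd": 10}))  # 6.5
```

Both layouts land at or below 7 cores for the OSD processes, which is where the "8 cores should do, 10 might be better" conclusion comes from once a little room is left for kernel and networking.
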
>
> I should mention that these estimates apply to Intel CPUs (x86_64 architectures). Other architectures might not provide the same cycle efficiency.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Tony Liu <tonyliu0592@xxxxxxxxxxx>
> Sent: 13 November 2020 08:32:55
> To: Frank Schilder; Nathan Fish
> Cc: ceph-users@xxxxxxx
> Subject: RE:  Re: which of cpu frequency and number of threads servers osd better?
>
> You all mentioned a first 2T and another 2T. Could you give more
> details on how an OSD works with multiple threads, or share the link
> if it's already documented somewhere?
>
> Is it always 4T, or does it start with 1T and grow up to 4T? Is 4T the max?
> Does each T run a different job, or are they multiple instances of the
> same job? Does the disk type affect how the Ts work, e.g. 1T is good
> enough for HDD while 4T is required for SSD?
>
> Suppose I change my plan to 2 SSD OSDs and 8 HDD OSDs (with 1 SSD for
> WAL and DB). If each OSD requires 4T, would 16C/32T 3.0GHz
> be the better choice, because it provides sufficient Ts?
> If an SSD OSD requires 4T and an HDD OSD only requires 1T, would 8C/16T
> 3.2GHz be better, because it provides sufficient Ts as well
> as faster cores?
>
> Thanks!
> Tony
> > -----Original Message-----
> > From: Frank Schilder <frans@xxxxxx>
> > Sent: Thursday, November 12, 2020 10:59 PM
> > To: Tony Liu <tonyliu0592@xxxxxxxxxxx>; Nathan Fish <lordcirth@xxxxxxxxx>
> > Cc: ceph-users@xxxxxxx
> > Subject: Re:  Re: which of cpu frequency and number of
> > threads servers osd better?
> >
> > I think this depends on the type of backing disk. We use the following
> > CPUs:
> >
> > Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz
> > Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
> > Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz
> >
> > My experience is that an HDD OSD hardly reaches 100% load on 1 hyper-thread,
> > even under heavy recovery/rebalance operations on 8+2 and 6+2 EC
> > pools with compression set to aggressive. The CPU is mostly in IO wait;
> > that is, the disk is the real bottleneck, not the processor.
> > With SSDs I have seen 2HT at 100% and 2 more at 50% each. I guess NVMe
> > might be more demanding.
> >
> > A server with 12 HDD and 1 SSD should be fine with a modern CPU with 8
> > cores. 16 threads sounds like an 8 core CPU. The 2nd generation Intel®
> > Xeon® Silver 4209T with 8 cores should easily handle that (single socket
> > system). We have the 16-core Intel Silver in a dual-socket system
> > currently connected to 5 HDDs and 7 SSDs, and I did a rebalance operation
> > yesterday. The CPU user load did not exceed 2%, so it can handle the OSD
> > processes easily. The server is dimensioned to run up to 12 HDD and 14 SSD
> > OSDs (Dell R740xd2). As far as I can tell, the CPU configuration is
> > overpowered for that.
> >
> > Just for info, we use ganglia to record node utilisation. I use 1-year
> > records and pick peak loads I observed for dimensioning the CPUs. These
> > records include some very heavy recovery periods.
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Tony Liu <tonyliu0592@xxxxxxxxxxx>
> > Sent: 13 November 2020 04:57:53
> > To: Nathan Fish
> > Cc: ceph-users@xxxxxxx
> > Subject:  Re: which of cpu frequency and number of threads
> > servers osd better?
> >
> > Thanks Nathan!
> > Tony
> > > -----Original Message-----
> > > From: Nathan Fish <lordcirth@xxxxxxxxx>
> > > Sent: Thursday, November 12, 2020 7:43 PM
> > > To: Tony Liu <tonyliu0592@xxxxxxxxxxx>
> > > Cc: ceph-users@xxxxxxx
> > > Subject: Re:  which of cpu frequency and number of threads
> > > servers osd better?
> > >
> > > From what I've seen, OSD daemons tend to bottleneck on the first 2
> > > threads, while getting some use out of another 2. So 32 threads at
> > > 3.0GHz would be a lot better. Note that you may get better performance
> > > by splitting off some of that SSD for block.db partitions, or at least
> > > block.wal, for the HDDs.
> > >
> > > On Thu, Nov 12, 2020 at 9:57 PM Tony Liu <tonyliu0592@xxxxxxxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > For example, between 16 threads at 3.2GHz and 32 threads at 3.0GHz,
> > > > which gives 11 OSDs (10x12TB HDD and 1x960GB SSD) better
> > > > performance?
> > > >
> > > >
> > > > Thanks!
> > > > Tony
> > > > _______________________________________________
> > > > ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an
> > > > email to ceph-users-leave@xxxxxxx



