Re: Question about speeding hdd based cluster

Thank you for that email, and the rest of the chain. I have a somewhat similar setup, and the suggestions seem to have improved performance already. My first instinct was to let Ceph just use the aggregated IO of my motley mix of drives. However, I now have a bunch of found SSDs in the mix, and moving the index and metadata onto an SSD-only pool gains me more than I lost by moving most of the other random IO off of them. I also created fast-HDD and slow-HDD pools, since I have mostly NetApp 600GB 10Ks and a few random 15Ks from old Dells, but also a lot of 4TB 7.2Ks with a few random 5.4Ks off the recycle pile. The NAS-like data goes on the fast disks, the dev/test/backup images on the slow ones. Maybe I'll merge them back at some point, but the suggestions seem to be helping for now.
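In case it helps anyone else splitting HDDs into tiers, roughly what that looks like with custom CRUSH device classes (a sketch; the OSD IDs and pool names here are placeholders, substitute your own):

    # Tag a 10K/15K drive as "hdd-fast" (remove the default class first)
    ceph osd crush rm-device-class osd.12
    ceph osd crush set-device-class hdd-fast osd.12

    # Tag a 7.2K/5.4K drive as "hdd-slow"
    ceph osd crush rm-device-class osd.13
    ceph osd crush set-device-class hdd-slow osd.13

    # One replicated rule per class, then point each pool at its rule
    ceph osd crush rule create-replicated fast-hdd-rule default host hdd-fast
    ceph osd crush rule create-replicated slow-hdd-rule default host hdd-slow
    ceph osd pool set nas-pool crush_rule fast-hdd-rule
    ceph osd pool set backup-pool crush_rule slow-hdd-rule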

 

Thanks!

Chris

 

From: quaglio@xxxxxxxxxx
Sent: Wednesday, October 2, 2024 1:19 PM
To: ceph-users@xxxxxxx; george.kyriazis@xxxxxxxxx
Subject: Re: Question about speeding hdd based cluster

 

  

Hi Kyriazis,
     I work with a cluster similar to yours: 142 HDDs and 18 SSDs.
     I saw significant performance gains after making the following changes:

1-) For the pool that lives on the HDDs (here, home directories are on HDDs), reduce the replica settings (I don't know what your resilience requirement is):
* size=2
* min_size=1

      I have run this way for at least 4 years with no problems (even when a disk had to be replaced or a server rebooted, this config never got me into trouble).
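      For reference, these are just pool properties (a sketch; "cephfs_data" is a placeholder for your HDD-backed pool name):

    ceph osd pool set cephfs_data size 2
    ceph osd pool set cephfs_data min_size 1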

2-) Move the filesystem metadata pools onto SSD-only storage (at least SSD, if not NVMe).
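     One way to do that with a device-class rule (a sketch, assuming your metadata pool is named "cephfs_metadata"; Ceph migrates the PGs automatically once the rule changes):

    # Replicated rule restricted to SSD-class OSDs
    ceph osd crush rule create-replicated replicated-ssd default host ssd
    # Point the metadata pool at it
    ceph osd pool set cephfs_metadata crush_rule replicated-ssd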

3-) Increase the server and client caches.
Here I left it like this:
osd_memory_target_autotune=true (each OSD always has more than 12G).
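On a cephadm-managed cluster that can be set through the central config store (a sketch):

    ceph config set osd osd_memory_target_autotune true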

For clients:
client_cache_size=163840
client_oc_max_dirty=1048576000
client_oc_max_dirty_age=50
client_oc_max_objects=10000
client_oc_size=2097152000
client_oc_target_dirty=838860800
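     These can go in the [client] section of ceph.conf on each client, or be pushed from the monitors (a sketch using the central config store; note the client_oc_* options apply to ceph-fuse/libcephfs clients, not kernel mounts):

    ceph config set client client_cache_size 163840
    ceph config set client client_oc_max_dirty 1048576000
    ceph config set client client_oc_max_dirty_age 50
    ceph config set client client_oc_max_objects 10000
    ceph config set client client_oc_size 2097152000
    ceph config set client client_oc_target_dirty 838860800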

     Check the documentation and evaluate which of these variables make sense for your cluster.

     For the backup scenario, I imagine that decreasing the size and min_size values will make a noticeable difference. However, you must evaluate your resilience needs before changing these settings.


Rafael.

  



De: "Kyriazis, George" <george.kyriazis@xxxxxxxxx <mailto:george.kyriazis@xxxxxxxxx> >
Enviada: 2024/10/02 13:06:09
Para: eblock@xxxxxx <mailto:eblock@xxxxxx> , ceph-users@xxxxxxx <mailto:ceph-users@xxxxxxx> 
Assunto:  Re: Question about speeding hdd based cluster
 

Thank you all.

The cluster is used mostly for backup of large files currently, but we are hoping to use it for home directories (compiles, etc.) soon. Most usage would be for large files, though.

What I've observed with its current usage is that Ceph rebalances and Proxmox-initiated VM backups bring the storage to its knees.

Would a safe approach be to move the metadata pool to SSD first, see how it goes (since it would be cheaper), and then add DB/WAL disks? How would Ceph behave if we added DB/WAL disks "slowly" (i.e., one node at a time)? We have about 100 OSDs (mix of HDD/SSD) spread across about 25 hosts. Hosts are server-grade with plenty of memory and processing power.
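(For what it's worth, a sketch of migrating one OSD at a time to an external DB/WAL, assuming a ceph-volume/non-cephadm deployment; the OSD id and device paths below are placeholders:

    # Take the OSD out and wait for the cluster to recover
    ceph osd out 42
    ceph osd safe-to-destroy osd.42   # repeat until it reports safe
    ceph osd destroy 42 --yes-i-really-mean-it
    # Recreate it with data on the HDD and DB/WAL on flash
    ceph-volume lvm create --osd-id 42 --data /dev/sdc --block.db /dev/nvme0n1p3

An alternative that avoids rebuilding the OSD is ceph-bluestore-tool's bluefs-bdev-new-db, run while the OSD is stopped.)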

Thank you!

George


> -----Original Message-----
> From: Eugen Block <eblock@xxxxxx>
> Sent: Wednesday, October 2, 2024 2:18 AM
> To: ceph-users@xxxxxxx
> Subject: Re: Question about speeding hdd based cluster
>
> Hi George,
>
> the docs [0] strongly recommend having dedicated SSD or NVMe OSDs for
> the metadata pool. You'll also benefit from dedicated DB/WAL devices.
> But as Joachim already stated, it depends on a couple of factors, like the
> number of clients, the load they produce, file sizes, etc. There's no easy answer.
>
> Regards,
> Eugen
>
> [0] https://docs.ceph.com/en/latest/cephfs/createfs/#creating-pools
>
> Quoting Joachim Kraftmayer <joachim.kraftmayer@xxxxxxxxx>:
>
> > Hi Kyriazis,
> >
> > It depends on the workload.
> > I would recommend adding SSD/NVMe DB/WAL devices to each OSD.
> >
> >
> >
> > Joachim Kraftmayer
> >
> > www.clyso.com
> >
> > Hohenzollernstr. 27, 80801 Munich
> >
> > Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306
> >
> > Kyriazis, George <george.kyriazis@xxxxxxxxx> wrote on Wed., Oct. 2,
> > 2024,
> > 07:37:
> >
> >> Hello ceph-users,
> >>
> >> I’ve been wondering… I have a Proxmox HDD-based CephFS pool with no
> >> DB/WAL drives. I also have SSD drives in this setup used for other pools.
> >>
> >> What would increase the speed of the hdd-based cephfs more, and in
> >> what usage scenarios:
> >>
> >> 1. Adding SSD/NVMe DB/WAL drives to each node
> >> 2. Moving the metadata pool for my CephFS to SSD
> >> 3. Increasing the performance of the network. I currently have 10GbE links.
> >>
> >> It doesn’t look like the network is currently saturated, so I’m thinking
> >> (3) is not a solution. However, if I choose any of the other options,
> >> would I need to also upgrade the network so that the network does not
> >> become a bottleneck?
> >>
> >> Thank you!
> >>
> >> George
> >>

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



