Re: Question about speeding hdd based cluster

>> It is nonetheless risky.  The wrong sequence of cascading, overlapping failures and you may lose data.
> 
> Our setup uses 3/2.  size=3 seems much safer than 2.

Indeed, 3/2 is the default for replicated pools.  Additional replicas exhibit diminishing returns in most cases, at high cost.
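
Since size/min_size came up, a minimal sketch of checking and setting them on an
existing replicated pool (the pool name "cephfs_data" below is only a placeholder
for your own pool):

   # inspect current values
   ceph osd pool get cephfs_data size
   ceph osd pool get cephfs_data min_size
   # set the 3/2 defaults explicitly
   ceph osd pool set cephfs_data size 3
   ceph osd pool set cephfs_data min_size 2

Note that raising size triggers backfill to create the extra replicas, so on an
HDD-backed pool it is worth scheduling during a quiet window.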

> 
>>> 2-) Move the filesystem metadata pools to use at least SSD only.
>>> 
>> Absolutely.  The CephFS docs suggest using size=4 for the MD pool.
>> 
> 
> Hmm..  I don’t remember reading that anywhere, but it makes sense.  

https://docs.ceph.com/en/quincy/cephfs/createfs/#creating-pools

We recommend configuring at least 3 replicas for the metadata pool, as data loss in this pool can render the entire file system inaccessible. Configuring 4 would not be extreme, especially since the metadata pool’s capacity requirements are quite modest.
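
In case it helps, a rough sketch of pinning the metadata pool to SSD-only OSDs and
raising its replica count; the rule and pool names (replicated_ssd, cephfs_metadata)
are assumptions, substitute your own:

   # replicated CRUSH rule restricted to the ssd device class, host failure domain
   ceph osd crush rule create-replicated replicated_ssd default host ssd
   # point the metadata pool at the new rule and bump its replica count
   ceph osd pool set cephfs_metadata crush_rule replicated_ssd
   ceph osd pool set cephfs_metadata size 4

Switching the CRUSH rule migrates the pool's PGs, but metadata pools are small, so
the resulting data movement is usually modest.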


> 
> Thanks!
> 
> George
> 
> 
>> 
>>> 
>>> 3-) Increase server and client cache.
>>> Here I left it like this:
>>> osd_memory_target_autotune=true (each OSD always has more than 12G).
>>> 
>>> For clients:
>>> client_cache_size=163840
>>> client_oc_max_dirty=1048576000
>>> client_oc_max_dirty_age=50
>>> client_oc_max_objects=10000
>>> client_oc_size=2097152000
>>> client_oc_target_dirty=838860800
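
(Inline note: if any of the above turn out to be useful, they can also be pushed
through the cluster's config database rather than edited into ceph.conf on every
client; a hedged example reusing the same values:

   ceph config set client client_oc_size 2097152000
   ceph config set client client_oc_max_dirty 1048576000
   ceph config set client client_oc_target_dirty 838860800
   ceph config dump | grep client_oc

Keep in mind the client_oc_* options only apply to ceph-fuse/libcephfs mounts; the
kernel CephFS client does not use the ObjectCacher and ignores them.)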
>>> 
>>> Evaluate, following the documentation, which of these settings make sense for your cluster.
>>>
>>> For the backup scenario, I imagine that decreasing the size and min_size values would lessen the impact; however, you must evaluate your own needs before changing these settings.
>>> 
>>> 
>>> Rafael.
>>> 
>>>  
>>> 
>>> From: "Kyriazis, George" <george.kyriazis@xxxxxxxxx>
>>> Sent: 2024/10/02 13:06:09
>>> To: eblock@xxxxxx, ceph-users@xxxxxxx
>>> Subject: Re: Question about speeding hdd based cluster
>>>  
>>> Thank you all.
>>> 
>>> The cluster is used mostly for backup of large files currently, but we are hoping to use it for home directories (compiles, etc.) soon. Most usage would be for large files, though.
>>> 
>>> What I've observed with its current usage is that Ceph rebalances and Proxmox-initiated VM backups bring the storage to its knees.
>>> 
>>> Would a safe approach be to move the metadata pool to ssd first, see how it goes (since it would be cheaper), and then add DB/WAL disks? How would Ceph behave if we add DB/WAL disks "slowly" (i.e. one node at a time)? We have about 100 OSDs (mix hdd/ssd) spread across about 25 hosts. Hosts are server-grade with plenty of memory and processing power.
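
(Inline note on the "one node at a time" question: a rebuilt OSD simply backfills
like any other replacement, so going host by host is fine as long as you wait for
the cluster to return to HEALTH_OK in between. One common per-OSD sequence, sketched
here with placeholder IDs and device paths, looks roughly like this:

   ceph osd out 12                              # drain the OSD, wait for backfill
   systemctl stop ceph-osd@12                   # then stop the daemon on its host
   ceph osd destroy 12 --yes-i-really-mean-it
   ceph-volume lvm zap /dev/sdX --destroy
   ceph-volume lvm create --osd-id 12 --data /dev/sdX --block.db /dev/nvme0n1p3

There are also in-place options such as ceph-volume lvm new-db, but the rebuild path
above is the more conservative one.)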
>>> 
>>> Thank you!
>>> 
>>> George
>>> 
>>> 
>>> > -----Original Message-----
>>> > From: Eugen Block <eblock@xxxxxx>
>>> > Sent: Wednesday, October 2, 2024 2:18 AM
>>> > To: ceph-users@xxxxxxx
>>> > Subject:  Re: Question about speeding hdd based cluster
>>> >
>>> > Hi George,
>>> >
>>> > the docs [0] strongly recommend having dedicated SSD or NVMe OSDs for
>>> > the metadata pool. You'll also benefit from dedicated DB/WAL devices.
>>> > But as Joachim already stated, it depends on a couple of factors like the
>>> > number of clients, the load they produce, file sizes etc. There's no easy answer.
>>> >
>>> > Regards,
>>> > Eugen
>>> >
>>> > [0] https://docs.ceph.com/en/latest/cephfs/createfs/#creating-pools
>>> >
>>> > Quoting Joachim Kraftmayer <joachim.kraftmayer@xxxxxxxxx>:
>>> >
>>> > > Hi Kyriazis,
>>> > >
>>> > > depends on the workload.
>>> > > I would recommend adding SSD/NVMe DB/WAL devices to each OSD.
>>> > >
>>> > >
>>> > >
>>> > > Joachim Kraftmayer
>>> > >
>>> > > www.clyso.com
>>> > >
>>> > > Hohenzollernstr. 27, 80801 Munich
>>> > >
>>> > > Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306
>>> > >
>>> > > Kyriazis, George <george.kyriazis@xxxxxxxxx> wrote on Wed., Oct. 2,
>>> > > 2024, 07:37:
>>> > >
>>> > >> Hello ceph-users,
>>> > >>
>>> > >> I’ve been wondering…. I have a proxmox hdd-based cephfs pool with no
>>> > >> DB/WAL drives. I also have ssd drives in this setup used for other pools.
>>> > >>
>>> > >> What would increase the speed of the hdd-based cephfs more, and in
>>> > >> what usage scenarios:
>>> > >>
>>> > >> 1. Adding ssd/nvme DB/WAL drives for each node
>>> > >> 2. Moving the metadata pool for my cephfs to ssd
>>> > >> 3. Increasing the performance of the network. I currently have 10gbe links.
>>> > >>
>>> > >> It doesn’t look like the network is currently saturated, so I’m thinking
>>> > >> (3) is not a solution. However, if I choose any of the other options,
>>> > >> would I need to also upgrade the network so that the network does not
>>> > >> become a bottleneck?
>>> > >>
>>> > >> Thank you!
>>> > >>
>>> > >> George
>>> > >>

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



