Short comment on replication size 2: it is not a question of if you will lose data, only when.

Joachim Kraftmayer
joachim.kraftmayer@xxxxxxxxx
www.clyso.com
Hohenzollernstr. 27, 80801 Munich
Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306

<christopher.colvin@xxxxxxxxxxxxxx> wrote on Sat., 16 Nov. 2024, 05:09:

> Thank you for that email, and the rest of the chain. I have a somewhat
> similar setup and the suggestions seem to have improved performance
> already. My first instinct was to let Ceph just use the aggregated IO of
> my motley mix of drives. However, now that I have a bunch of found SSDs
> in the mix, moving the index and metadata onto an SSD-only pool gains me
> more than I lost by moving most of the other random IO off of them. I
> also created fast-HDD and slow-HDD pools, since I have mostly NetApp
> 600GB 10Ks and a few random 15Ks from old Dells, but also a lot of 4TB
> 7.2Ks with a few random 5.4s off the recycle pile. The NAS-like data
> goes on the fast disks, the dev/test/backup images on the slow ones.
> Maybe I'll merge them back at some point, but the suggestions seem to be
> helping for now.
>
> Thanks!
> Chris
>
> From: quaglio@xxxxxxxxxx <quaglio@xxxxxxxxxx>
> Sent: Wednesday, October 2, 2024 1:19 PM
> To: ceph-users@xxxxxxx; george.kyriazis@xxxxxxxxx
> Subject: Re: Question about speeding hdd based cluster
>
> Hi Kyriazis,
> I work with a cluster similar to yours: 142 HDDs and 18 SSDs.
> I saw a large performance gain after making the following changes:
>
> 1-) For the pool that is configured on the HDDs (here, home directories
> are on HDDs), reduce the following replica settings (I don't know what
> your resilience requirement is):
> * size=2
> * min_size=1
>
> I have been running this way for at least 4 years with no problems (even
> when a disk had to be replaced or a server rebooted, this config never
> got me into trouble).
>
> 2-) Move the filesystem metadata pools to SSDs only, at a minimum.
>
> 3-) Increase server and client cache. Here I left it like this:
> osd_memory_target_autotune=true (each OSD always has more than 12G).
>
> For clients:
> client_cache_size=163840
> client_oc_max_dirty=1048576000
> client_oc_max_dirty_age=50
> client_oc_max_objects=10000
> client_oc_size=2097152000
> client_oc_target_dirty=838860800
>
> Evaluate, following the documentation, which of these variables makes
> sense for your cluster.
>
> For the backup scenario, I imagine that decreasing the size and min_size
> values will change the impact. However, you must evaluate your own needs
> for these settings.
>
> Rafael.
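
For anyone who wants to try the settings Rafael describes above, they map onto
plain ceph commands roughly as sketched below. The pool and rule names
(cephfs_hdd_data, cephfs_metadata, replicated_ssd) are placeholders, not names
taken from this thread; substitute whatever your cluster actually uses, and
weigh the size=2/min_size=1 trade-off against Joachim's warning at the top.

    # 1) Relax replication on the HDD data pool (accepting the reduced safety)
    ceph osd pool set cephfs_hdd_data size 2
    ceph osd pool set cephfs_hdd_data min_size 1

    # 2) Pin the CephFS metadata pool to SSD OSDs via a device-class CRUSH rule
    ceph osd crush rule create-replicated replicated_ssd default host ssd
    ceph osd pool set cephfs_metadata crush_rule replicated_ssd

    # 3) Cache tuning: let OSD memory be autotuned, raise the client caches
    ceph config set osd osd_memory_target_autotune true
    ceph config set client client_cache_size 163840
    ceph config set client client_oc_max_dirty 1048576000
    ceph config set client client_oc_max_dirty_age 50
    ceph config set client client_oc_max_objects 10000
    ceph config set client client_oc_size 2097152000
    ceph config set client client_oc_target_dirty 838860800

Note that the client_* options only affect userspace clients (ceph-fuse /
libcephfs); the kernel CephFS client has its own caching and ignores them.
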
> _____
>
> From: "Kyriazis, George" <george.kyriazis@xxxxxxxxx>
> Sent: 2024/10/02 13:06:09
> To: eblock@xxxxxx, ceph-users@xxxxxxx
> Subject: Re: Question about speeding hdd based cluster
>
> Thank you all.
>
> The cluster is used mostly for backup of large files currently, but we
> are hoping to use it for home directories (compiles, etc.) soon. Most
> usage would be for large files, though.
>
> What I've observed with its current usage is that Ceph rebalances and
> Proxmox-initiated VM backups bring the storage to its knees.
>
> Would a safe approach be to move the metadata pool to SSD first, see how
> it goes (since it would be cheaper), and then add DB/WAL disks? How would
> Ceph behave if we add DB/WAL disks "slowly" (i.e. one node at a time)?
> We have about 100 OSDs (a mix of HDD/SSD) spread across about 25 hosts.
> Hosts are server-grade with plenty of memory and processing power.
>
> Thank you!
>
> George
>
> > -----Original Message-----
> > From: Eugen Block <eblock@xxxxxx>
> > Sent: Wednesday, October 2, 2024 2:18 AM
> > To: ceph-users@xxxxxxx
> > Subject: Re: Question about speeding hdd based cluster
> >
> > Hi George,
> >
> > the docs [0] strongly recommend dedicated SSD or NVMe OSDs for the
> > metadata pool. You'll also benefit from dedicated DB/WAL devices. But
> > as Joachim already stated, it depends on a number of factors such as
> > the number of clients, the load they produce, file sizes, etc. There's
> > no easy answer.
> >
> > Regards,
> > Eugen
> >
> > [0] https://docs.ceph.com/en/latest/cephfs/createfs/#creating-pools
> >
> > Quoting Joachim Kraftmayer <joachim.kraftmayer@xxxxxxxxx>:
> >
> > > Hi Kyriazis,
> > >
> > > it depends on the workload. I would recommend adding an SSD/NVMe
> > > DB/WAL device to each OSD.
> > >
> > > Joachim Kraftmayer
> > > www.clyso.com
> > > Hohenzollernstr. 27, 80801 Munich
> > > Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306
> > >
> > > Kyriazis, George <george.kyriazis@xxxxxxxxx> wrote on Wed., 2 Oct.
> > > 2024, 07:37:
> > >
> > >> Hello ceph-users,
> > >>
> > >> I've been wondering... I have a Proxmox HDD-based CephFS pool with
> > >> no DB/WAL drives. I also have SSD drives in this setup used for
> > >> other pools.
> > >>
> > >> What would increase the speed of the HDD-based CephFS more, and in
> > >> what usage scenarios:
> > >>
> > >> 1. Adding SSD/NVMe DB/WAL drives for each node
> > >> 2. Moving the metadata pool for my CephFS to SSD
> > >> 3. Increasing the performance of the network. I currently have
> > >>    10GbE links.
> > >>
> > >> It doesn't look like the network is currently saturated, so I'm
> > >> thinking (3) is not a solution. However, if I choose either of the
> > >> other options, would I also need to upgrade the network so that it
> > >> does not become a bottleneck?
> > >>
> > >> Thank you!
> > >>
> > >> George
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
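
On the DB/WAL suggestion that comes up repeatedly above: on a cephadm-managed
cluster this is usually expressed as an OSD service spec rather than per-disk
commands. The sketch below is only illustrative; the service_id, hostname and
the rotational-based device filters are placeholders, not values taken from
this thread, and rolling it out host by host (as George asks about) is done by
limiting the placement section.

    # osd_spec.yaml -- HDDs for data, SSDs/NVMe shared for DB/WAL
    service_type: osd
    service_id: hdd_with_ssd_db
    placement:
      hosts:
        - node01          # add hosts one at a time to roll out gradually
    spec:
      data_devices:
        rotational: 1     # spinning disks carry the data
      db_devices:
        rotational: 0     # flash devices carry the DB/WAL

    # preview what the orchestrator would do, then apply
    # ceph orch apply -i osd_spec.yaml --dry-run
    # ceph orch apply -i osd_spec.yaml

Keep in mind that a spec like this only shapes newly created OSDs; existing
HDD-only OSDs have to be redeployed (or have a DB attached with
ceph-volume lvm new-db) before they benefit.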