Re: Question about speeding hdd based cluster

Sure:

https://docs.ceph.com/en/latest/ceph-volume/lvm/newdb/

In this case you'll have to prepare the DB LV beforehand. I haven't done that in a while, but here's an example from Clyso:

https://docs.clyso.com/blog/ceph-volume-create-wal-db-on-separate-device-for-existing-osd

Note that in a cephadm deployment you'll need to execute that in a shell, for example:

cephadm shell --name osd.6 --env CEPH_ARGS='--bluestore_block_db_size=1341967564' -- ceph-bluestore-tool bluefs-bdev-new-db --dev-target /dev/data_vg1/lv4 --path /var/lib/ceph/osd/ceph-6

Note that these are two different approaches to achieve the same goal. One is via 'ceph-volume lvm new-db', the other one with 'ceph-bluestore-tool bluefs-bdev-new-db'. I would assume they both work, so I can't tell which one to prefer. I feel like the docs could use some clarification on this topic.
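
For comparison with the ceph-bluestore-tool command above, here is a rough sketch of the 'ceph-volume lvm new-db' path. It assumes a package-based (non-cephadm) host where OSDs run as ceph-osd@N systemd units; in a cephadm deployment the ceph-volume call would have to run inside 'cephadm shell --name osd.N' instead. The VG/LV names, the size and the OSD fsid are placeholders, not values from this thread, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: attach a new DB LV to an existing BlueStore OSD via ceph-volume.
# Every default below is a hypothetical placeholder; adjust before real use.
OSD_ID="${OSD_ID:-6}"
OSD_FSID="${OSD_FSID:-REPLACE-WITH-OSD-FSID}"   # listed by 'ceph-volume lvm list'
DB_VG="${DB_VG:-db_vg1}"                        # VG on the SSD/NVMe, assumed to exist
DB_LV="${DB_LV:-db_osd${OSD_ID}}"
DB_SIZE="${DB_SIZE:-60G}"                       # assumption, size it for your OSDs
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

add_db() {
    # 1. Prepare the DB LV beforehand.
    run lvcreate -L "$DB_SIZE" -n "$DB_LV" "$DB_VG"
    # 2. Stop the OSD (assumption: device changes need the OSD offline).
    run systemctl stop "ceph-osd@$OSD_ID"
    # 3. Attach the LV to the OSD as its new DB device.
    run ceph-volume lvm new-db --osd-id "$OSD_ID" --osd-fsid "$OSD_FSID" --target "$DB_VG/$DB_LV"
    # 4. Start the OSD again.
    run systemctl start "ceph-osd@$OSD_ID"
}

add_db
```

Running it as-is just prints the four commands for review; something like 'OSD_ID=12 OSD_FSID=<fsid> DRY_RUN=0 sh add-db.sh' (the file name is made up) would execute them for one OSD.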

On a similar topic: Does it make sense to use compression on a metadata pool? Would it matter if the metadata pool is on hdd vs ssd?

As already stated, metadata should be on fast devices, independent of compression. The metadata pool doesn't hold much data, so I'd say there's little benefit in compressing it.

Quoting "Kyriazis, George" <george.kyriazis@xxxxxxxxx>:

On Oct 7, 2024, at 2:16 AM, Eugen Block <eblock@xxxxxx> wrote:

Hi, response inline.


Quoting "Kyriazis, George" <george.kyriazis@xxxxxxxxx>:

Thank you all.

The cluster is used mostly for backup of large files currently, but we are hoping to use it for home directories (compiles, etc.) soon. Most usage would be for large files, though.

What I've observed with its current usage is that both ceph rebalances and proxmox-initiated VM backups bring the storage to its knees.

Would a safe approach be to move the metadata pool to ssd first, see how it goes (since it would be cheaper), and then add DB/WAL disks?

Moving the metadata to SSDs first is absolutely reasonable and relatively cheap since it usually doesn't contain huge amounts of data.
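
A minimal sketch of what that move could look like, using a device-class CRUSH rule. The pool name, rule name, CRUSH root and failure domain are assumptions for illustration (check 'ceph osd pool ls' and 'ceph osd crush tree' for the real ones), and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: pin a CephFS metadata pool to SSD OSDs via a device-class rule.
# Every default below is a hypothetical placeholder.
POOL="${POOL:-cephfs_metadata}"            # assumed pool name
RULE="${RULE:-replicated_ssd}"             # assumed rule name
ROOT="${ROOT:-default}"                    # assumed CRUSH root
FAILURE_DOMAIN="${FAILURE_DOMAIN:-host}"   # assumed failure domain
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

move_pool_to_ssd() {
    # Create a replicated rule restricted to OSDs with device class 'ssd'.
    run ceph osd crush rule create-replicated "$RULE" "$ROOT" "$FAILURE_DOMAIN" ssd
    # Point the pool at the new rule; its data is then moved to the SSD OSDs.
    run ceph osd pool set "$POOL" crush_rule "$RULE"
}

move_pool_to_ssd
```

Switching the rule triggers data movement for that pool only, which should be modest if the metadata pool is as small as expected.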

How would ceph behave if we add DB/WAL disks "slowly" (i.e., one node at a time)? We have about 100 OSDs (mix hdd/ssd) spread across about 25 hosts. Hosts are server-grade with plenty of memory and processing power.

The answer is, as always, "it depends". If you rebuild the OSDs entirely (host-wise) instead of migrating the DB off to SSDs, you might encounter slow requests, as you already noticed yourself. But the whole process would be faster than migrating each DB individually. The migration approach would be less invasive: each OSD would just have to catch up after its restart, reducing the load drastically compared to a rebuild. But then again, it would take far more time to complete. How large are the OSDs and how much are they utilized? Do you have any history on how long a host rebuild usually takes?


I have no problem destroying and re-creating the OSDs (in place) if that’s what it takes. It will take time to do them all, but if “eventually” it works better, then so be it. Do you happen to have a documentation pointer on how to migrate DB to SSDs?

On a similar topic: Does it make sense to use compression on a metadata pool? Would it matter if the metadata pool is on hdd vs ssd?

Thank you!

George


-----Original Message-----
From: Eugen Block <eblock@xxxxxx>
Sent: Wednesday, October 2, 2024 2:18 AM
To: ceph-users@xxxxxxx
Subject:  Re: Question about speeding hdd based cluster

Hi George,

the docs [0] strongly recommend dedicated SSD or NVMe OSDs for
the metadata pool. You'll also benefit from dedicated DB/WAL devices.
But as Joachim already stated, it depends on a couple of factors like the
number of clients, the load they produce, file sizes, etc. There's no easy answer.

Regards,
Eugen

[0] https://docs.ceph.com/en/latest/cephfs/createfs/#creating-pools

Quoting Joachim Kraftmayer <joachim.kraftmayer@xxxxxxxxx>:

> Hi Kyriazis,
>
> it depends on the workload.
> I would recommend adding ssd/nvme DB/WAL to each osd.
>
>
>
> Joachim Kraftmayer
>
> www.clyso.com
>
> Hohenzollernstr. 27, 80801 Munich
>
> Utting a. A. | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE2754306
>
> Kyriazis, George <george.kyriazis@xxxxxxxxx> wrote on Wed, Oct 2, 2024,
> 07:37:
>
>> Hello ceph-users,
>>
>> I’ve been wondering…. I have a proxmox hdd-based cephfs pool with no
>> DB/WAL drives. I also have ssd drives in this setup used for other pools.
>>
>> What would increase the speed of the hdd-based cephfs more, and in
>> what usage scenarios:
>>
>> 1. Adding ssd/nvme DB/WAL drives for each node
>> 2. Moving the metadata pool for my cephfs to ssd
>> 3. Increasing the performance of the network.  I currently have 10gbe
>>    links.
>>
>> It doesn’t look like the network is currently saturated, so I’m thinking
>> (3) is not a solution.  However, if I choose any of the other options,
>> would I need to also upgrade the network so that the network does not
>> become a bottleneck?
>>
>> Thank you!
>>
>> George
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an
>> email to ceph-users-leave@xxxxxxx
>>