CephFS subdatapool in practice?

Hello everyone,

I am maintaining a cost-optimized Ceph cluster at Codeberg.org that serves
a lot of small random I/O (Git operations).

Currently, the data is stored in a single CephFS pool on a mixed SSD and HDD
cluster, where the HDDs have a primary affinity of 0 to shift reads to the
SSDs.
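
For reference, the read shifting boils down to something like this (osd.12
stands in for each HDD OSD in our tree):

    # Stop CRUSH from picking this HDD OSD as a primary, so reads are
    # served by the SSD replicas instead; repeat for every HDD OSD.
    ceph osd primary-affinity osd.12 0
    # The PRI-AFF column confirms the setting:
    ceph osd tree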

In a quest to further optimize our setup, I discovered the subdatapool feature
described at https://github.com/TheJJ/ceph-cheatsheet#subdatapool (attaching
multiple data pools to a filesystem and using setfattr to direct writes in a
given directory to a separate pool).
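
As I understand the cheatsheet, the mechanics would look roughly like the
following, with placeholder names for the pool ("hdd-bulk"), the filesystem
("cephfs"), and the mount path:

    # Create a cheaper pool and attach it to the existing filesystem.
    ceph osd pool create hdd-bulk
    ceph fs add_data_pool cephfs hdd-bulk
    # New files under this directory will be written to hdd-bulk.
    setfattr -n ceph.dir.layout.pool -v hdd-bulk /mnt/cephfs/releases
    # Inspect the effective layout.
    getfattr -n ceph.dir.layout /mnt/cephfs/releases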

It feels like this would let us cut costs by easily moving certain directories
with large files to a cheaper pool than the one used for Git operations,
without data migrations such as splitting the data across multiple
filesystems.
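
My understanding is that the layout attribute only applies to files created
after it is set, so existing data in such a directory would still need a
one-time rewrite, roughly (paths as in the sketch above):

    # Rewriting the files moves their data into the new pool; the
    # contents of already existing files stay in the old pool otherwise.
    cp -a /mnt/cephfs/releases /mnt/cephfs/releases.tmp
    rm -rf /mnt/cephfs/releases
    mv /mnt/cephfs/releases.tmp /mnt/cephfs/releases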

Is anyone using this feature in production who can comment on it?

Are there best practices regarding the usage of this feature?

Any caveats to be aware of, such as overhead, latency, long-term
maintainability?


Since certain "interesting" features, such as inline_data, have been
deprecated in the past, I thought it would be better to ask here for feedback
first.

Thank you in advance for any comments, and have a nice day!

Kind Regards
Otto


-- 
https://codeberg.org
Codeberg e.V.  –  Arminiusstraße 2-4  –  10551 Berlin  –  Germany
Registered at registration court Amtsgericht Charlottenburg VR36929.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



