On 10/6/22 08:35, Yoann Moulin wrote:
> Is 256 a good value in our case? We have 80 TB of data with more than 300M files.
You want at least as many PGs as it takes for each of the OSDs to host a portion of the OMAP data. You want to spread the OMAP out over as many _fast_ OSDs as possible.
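To see how the OMAP data is currently spread, something along these lines can help (an untested sketch on my side; it parses "ceph osd df --format json", and the JSON field names such as "kb_used_omap" may differ between releases):

import json
import subprocess

# Fetch per-OSD usage, including OMAP, from the standard `ceph osd df` command.
out = subprocess.run(
    ["ceph", "osd", "df", "--format", "json"],
    check=True, capture_output=True, text=True,
).stdout

nodes = json.loads(out)["nodes"]

# Print OSDs sorted by OMAP usage so skew (a few OSDs holding most of the
# OMAP data) stands out immediately.
for osd in sorted(nodes, key=lambda n: n.get("kb_used_omap", 0), reverse=True):
    omap_gib = osd.get("kb_used_omap", 0) / (1024 * 1024)
    print(f'{osd["name"]:>10}  class={osd.get("device_class", "?"):<6}'
          f'  pgs={osd.get("pgs", "?"):<5}  omap={omap_gib:.1f} GiB')

Ideally the OMAP usage is roughly even across all of your fast OSDs rather than concentrated on a few of them.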
I have tried to find an answer to your question: are more metadata PGs better? I haven't found a definitive answer. This would ideally be tested in a non-prod / pre-prod environment and tuned to the individual requirements (type of workload). For now, I would not blindly trust the PG autoscaler. I have seen it advise settings that would definitely not be OK. You can skew things in the autoscaler with the "bias" parameter to compensate for this. But as far as I know, the current heuristics for determining a good value do not take into account the importance of OMAP (RocksDB) spread across OSDs. See a blog post about autoscaler tuning [1].
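If you do decide to nudge the autoscaler, or to pin the PG count yourself, it comes down to the standard pool tunables. A sketch only; the pool name "cephfs_metadata" is just an example and the values are illustrations, not recommendations for your cluster:

import subprocess

POOL = "cephfs_metadata"  # example pool name, adjust to your cluster

def pool_set(key, value):
    # `ceph osd pool set <pool> <key> <value>` is the stock CLI for pool tunables.
    subprocess.run(["ceph", "osd", "pool", "set", POOL, key, str(value)], check=True)

# Hint to the autoscaler that this OMAP-heavy pool deserves more PGs than its
# raw byte usage alone would suggest.
pool_set("pg_autoscale_bias", 4)

# Or bypass the autoscaler and pin the PG count yourself:
# pool_set("pg_num", 256)

After either change, "ceph osd pool autoscale-status" shows what the autoscaler thinks of the pool.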
It would be great if tuning metadata PGs for CephFS / RGW could be included in the "large scale tests" the devs are planning to perform in the future, with use cases that contrast "a lot of small files / objects" versus "loads of large files / objects", to get a feel for how this tuning impacts performance for different workloads.
Gr. Stefan

[1]: https://ceph.io/en/news/blog/2022/autoscaler_tuning/