On Tue, Jul 27, 2021 at 1:46 PM Marcel Kuiper <ceph@xxxxxxxx> wrote:
>
> We recently upgraded one of our clusters from 14.2.21 to 15.2.13 and had
> some troubles with osd performance. After reading some threads we
> manually compacted the rocksdb's on all osds and hope that that will
> alleviate the problem
>
> I noted that in the 15.2.13 release bluefs_buffered_io is now default
> set to enabled. As I understand it this will make use of the linux
> buffer cache for some rocksdb operations. Previously under nautilus we
> had increased osd_memory_target (bluefs_buffered_io was default disabled
> at that time). I now wonder whether we should tune that down a bit to
> leave more memory for the linux buffer cache.
>
> Are there any recommendations on how to divide memory on a storage
> system?

I don't think that general recommendations are really feasible. The goal
is for the defaults to perform reasonably well, but if you want to tune
things, you should first pick a performance metric that is relevant for
your environment, then make small adjustments and study their impact.

Regarding bluefs_buffered_io = true -- the main issue it solved was
related to rocksdb metadata being re-read over and over from the
underlying device during PG removal. While debugging that issue, switching
it on / off dramatically impacted the IO on the block.db (i.e. 1MBps when
buffered_io is true, 300MBps when false). Since the IO responded so
quickly to that change, I would guess that only a small amount of page
cache is needed for that particular issue.

Cheers, dan
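
As a starting point for the "how to divide memory" question, the arithmetic can be sketched as below. This is only a rough illustration, not a recommendation: the host size, OSD count, target value and reserved amount are made-up numbers, and the function name is hypothetical. It also assumes every OSD sits exactly at its osd_memory_target, which is a best-effort target rather than a hard limit, so real usage can exceed it.

```python
GIB = 1024 ** 3


def page_cache_headroom(total_ram_bytes, num_osds, osd_memory_target_bytes,
                        other_reserved_bytes=0):
    """Estimate the RAM left for the Linux page cache on one OSD host.

    Assumes each OSD uses exactly osd_memory_target (a simplification;
    the target is best-effort and OSDs can overshoot it).
    """
    osd_total = num_osds * osd_memory_target_bytes
    return total_ram_bytes - osd_total - other_reserved_bytes


# Hypothetical host: 128 GiB RAM, 12 OSDs with an 8 GiB target each,
# and 4 GiB set aside for the OS and other daemons.
headroom = page_cache_headroom(128 * GIB, 12, 8 * GIB,
                               other_reserved_bytes=4 * GIB)
print(f"Estimated page cache headroom: {headroom / GIB:.1f} GiB")
```

Lowering osd_memory_target in small steps and re-checking the chosen performance metric after each step would show whether the extra page cache actually helps in a given environment.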