Yeah, there are cases where enabling it will improve performance as
rocksdb can then use the page cache as a (potentially large) secondary
cache beyond the block cache and avoid hitting the underlying devices
for reads. Do you have a lot of spare memory for page cache on your OSD
nodes? You may be able to improve the situation with
bluefs_buffered_io=false by increasing the osd_memory_target which
should give the rocksdb block cache more memory to work with directly.
One downside is that we currently double cache onodes in both the
rocksdb cache and bluestore onode cache, which hurts us when memory is
limited. We have some experimental work that might help in this area by
better balancing bluestore onode and rocksdb block caches but it needs
to be rebased after Adam's column family sharding work.
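Roughly something like this, as a sketch (the 8 GiB value below is just
an illustration, you would want to size it to the memory actually
available per OSD on your nodes):

    # keep bluefs_buffered_io off and give the OSDs more memory to
    # manage themselves (value is in bytes; ~8 GiB here only as an example)
    ceph config set osd bluefs_buffered_io false
    ceph config set osd osd_memory_target 8589934592

or the equivalent in ceph.conf under the [osd] section.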
The reason we had to disable bluefs_buffered_io again was that we had
users with certain RGW workloads where the kernel started swapping large
amounts of memory on the OSD nodes despite seemingly having free memory
available. This caused huge latency spikes and IO slowdowns (even
stalls). We never noticed it in our QA test suites and it doesn't
appear to happen with RBD workloads as far as I can tell, but when it
does happen it's really painful.
Mark
On 8/6/20 6:53 AM, Manuel Lausch wrote:
Hi,
I found the reason for this behavior change.
With 14.2.10 the default value of "bluefs_buffered_io" was changed from
true to false.
https://tracker.ceph.com/issues/44818
After configuring this to true, my problems seem to be solved.
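In case it helps others, setting it back to true can look roughly like
this (just a sketch; depending on the release you may need to restart
the OSDs for the new value to take effect):

    ceph config set osd bluefs_buffered_io true

or in ceph.conf:

    [osd]
    bluefs_buffered_io = true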
Regards
Manuel
On Wed, 5 Aug 2020 13:30:45 +0200
Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:
Hello Vladimir,
I just tested this on a single-node test cluster with 60 HDDs (3 of
them with bluestore, without a separate WAL and DB).
With 14.2.10, I see a lot of read IOPS on the bluestore OSDs while
snaptrimming. With 14.2.9 this was not an issue.
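For anyone who wants to check this themselves, something like the
following should work (just a sketch; osd.0 is only a placeholder for
one of your OSD ids, run on the node hosting that OSD):

    # confirm what the OSD is actually running with
    ceph daemon osd.0 config show | grep bluefs_buffered_io
    # watch per-device read IOPS on the OSD node while snaptrimming
    iostat -x 5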
I wonder if this would explain the huge number of slow ops on my big
test cluster (44 nodes, 1056 OSDs) while snaptrimming. I
cannot test a downgrade there, because there are no packages of older
releases available for CentOS 8.
Regards
Manuel
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx