My testing cluster is an all-HDD cluster with 12
OSDs (10 TB HDD each).
I monitor Luminous 12.2.2 write performance and
OSD memory usage with Grafana graphs for statistics logging.
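(In case it helps with reproducing the monitoring: the numbers are pulled
from the OSD admin sockets with commands roughly like the ones below;
osd.0 is just an example id.)

  ceph daemon osd.0 perf dump        # op latency / throughput counters
  ceph daemon osd.0 dump_mempools    # per-component memory usage, incl. bluestore caches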
The test is done by running fio on a
mounted RBD with the following parameters:

fio -directory=fiotest -direct=1 -thread -rw=write -ioengine=libaio \
    -size=200G -group_reporting -bs=1m -iodepth 4 -numjobs=200 -name=writetest
I found there is a noticeable
performance degradation over time.
[Graph: write throughput and IOPS]
[Graph: OSD memory usage (2 of 12 OSDs; the pattern is identical for all)]
[Graph: osd perf]
There are some interesting findings
from the graphs.
After 18:00 the write
throughput suddenly dropped and the OSD latency increased. TCMalloc started
reclaiming its page heap freelist much more frequently. All of this happened
very quickly, and every OSD showed the identical pattern.
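(The TCMalloc side can be checked directly through Ceph's heap commands,
for example:

  ceph tell osd.0 heap stats      # shows bytes in use and in the page heap freelist
  ceph tell osd.0 heap release    # asks tcmalloc to return free pages to the OS

Again, osd.0 is just an example id.)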
I have done this kind of test several
times with different BlueStore cache settings, and found that with more cache
the performance degradation happens later.
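(For reference, the knobs I was varying are roughly the ones below; the
values are only an example, not a recommendation. On an all-HDD cluster the
_hdd variant applies when bluestore_cache_size is left at its default of 0.)

  [osd]
  bluestore_cache_size_hdd = 3221225472   # 3 GiB total cache per osd
  bluestore_cache_kv_ratio = 0.5          # share for the rocksdb block cache
  bluestore_cache_meta_ratio = 0.5        # share for the onode/metadata cache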
I don't know whether this is a bug or
something I can fix by changing some of my
cluster's configuration. Any advice or direction to look into is
appreciated.
Thanks
2017-12-21
lin.yunfan