Hi list,
The Ceph version is jewel 10.2.10 and all OSDs are using filestore.
The cluster has 96 OSDs and one pool with size=2 replication and 4096 PGs
(based on the PG calculation method from the Ceph docs, targeting 100 PGs
per OSD).
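For reference, this is the arithmetic I followed (just my reading of the
pgcalc formula, rounding to the nearest power of two as the calculator does):

    import math

    osds = 96
    target_pgs_per_osd = 100
    pool_size = 2

    raw = osds * target_pgs_per_osd / pool_size   # 9600 / 2 = 4800
    pg_num = 2 ** int(round(math.log(raw, 2)))    # nearest power of two
    print(pg_num)                                 # 4096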
The OSD with the most PGs has 104, and there are 6 OSDs with more than
100 PGs. Most OSDs have somewhere in the 70s-90s, and the OSD with the
fewest has 58.
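In case it matters how I counted: I piped `ceph pg dump --format json` into
a small script like the one below. The field names (pg_stats, acting) are
what my jewel cluster emits; other versions may differ.

    import json
    import sys
    from collections import Counter

    # Count how many PGs each OSD appears in (acting set), reading
    # `ceph pg dump --format json` from stdin.
    dump = json.load(sys.stdin)
    counts = Counter()
    for pg in dump["pg_stats"]:
        for osd in pg["acting"]:
            counts[osd] += 1

    for osd, n in sorted(counts.items(), key=lambda kv: kv[1]):
        print("osd.%d: %d PGs" % (osd, n))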
During the write test, some of the OSDs show very high fs_apply_latency
(1000-4000 ms) while the normal ones are around 100-600 ms. The OSDs with
high latency are always the ones with more PGs on them.
iostat on the high-latency OSDs shows their HDDs at about 95%-96% %util,
while the normal ones are at 40%-60%.
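(The latency numbers come from `ceph osd perf`. I sorted them with roughly
the sketch below; the JSON field names are what I see on jewel, so treat
them as an assumption on other versions.)

    import json
    import sys

    # Print OSDs sorted by fs_apply_latency, reading
    # `ceph osd perf --format json` from stdin. The field names
    # (osd_perf_infos, perf_stats, ...) are from my jewel cluster.
    perf = json.load(sys.stdin)
    infos = sorted(perf["osd_perf_infos"],
                   key=lambda i: i["perf_stats"]["apply_latency_ms"],
                   reverse=True)
    for i in infos:
        print("osd.%d apply=%dms commit=%dms"
              % (i["id"],
                 i["perf_stats"]["apply_latency_ms"],
                 i["perf_stats"]["commit_latency_ms"]))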
I think the cause is that the OSDs with more PGs have to handle more write
requests. Is that right? But even though the PG distribution is not even,
the variation is not that much, so how can the performance be so sensitive
to it?
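To quantify "not that much", here is my back-of-envelope (it assumes client
writes are spread evenly across PGs, which may not hold):

    # Relative write load per OSD should be roughly proportional to PG count.
    pg_num = 4096
    pool_size = 2
    osds = 96

    mean = pg_num * pool_size / float(osds)   # ~85.3 PG replicas per OSD
    print(104 / mean)                         # busiest OSD: ~1.22x average
    print(58 / mean)                          # quietest OSD: ~0.68x average

    # Disk queueing delay blows up near saturation: in a simple M/M/1 model
    # latency scales like 1/(1 - util), so going from 60% to 95% util means
    # roughly (1 - 0.60) / (1 - 0.95) = 8x the wait.
    print((1 - 0.60) / (1 - 0.95))

If that queueing argument is right, a ~22% load imbalance might be enough to
push the busiest disks past the knee of the latency curve, but I would
appreciate a sanity check.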
Is there anything I can do to improve the performance and reduce the
latency? And how can I make the PG distribution more even?
Thanks
2018-03-07
shadowlin
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com