Hi Alex... thank you for the tips! Yesterday I did a lot of testing and
it seems that the network really is what is holding the speed down. I
would just like to confirm that this is not actually a problem or
misconfiguration in my cluster that would be masked by the network
upgrade. Caching makes things much better: the second time I read a
file, even if I drop the caches in the guest OS, the data is read at the
network speed limit.
I don't know if it is normal, but in the fio tests KRBD shows a big
performance boost over librbd (50 MB/s with KRBD vs. 28 MB/s with librbd).
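The fio invocations were along these lines (block size, runtime, and the
device/pool/image names here are illustrative, not necessarily exactly
what I used):

# read test against the KRBD-mapped device
fio --name=krbd-read --filename=/dev/rbd0 --rw=read --bs=4M \
    --direct=1 --ioengine=libaio --iodepth=1 --runtime=60 --time_based

# same test through librbd
fio --name=librbd-read --ioengine=rbd --clientname=admin --pool=rbd \
    --rbdname=testimg --rw=read --bs=4M --direct=1 --iodepth=1 \
    --runtime=60 --time_based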
To check how much the network latency is slowing things down, I created
a 4x-SSD-only pool with size 2 and set the OSDs on one host to
primary-affinity 0. When I ran the test with this config, data was read
at 900 MB/s and clat stayed under 1 ms. When I set primary-affinity back
to 1 and ran the same test, the bandwidth dropped to only 100 MB/s, with
a maximum clat of 250 ms and an average of 70 ms.
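The primary-affinity part was done roughly like this (the OSD ids are
just the ones belonging to that host in my crush map; yours will differ):

# send reads away from one host by making its OSDs non-primary
for osd in 4 5 6 7; do ceph osd primary-affinity osd.$osd 0; done

# ... run the fio read test ...

# restore the default afterwards
for osd in 4 5 6 7; do ceph osd primary-affinity osd.$osd 1; done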
I will post the difference in speeds next week, once the network is
upgraded, in case anyone would like to see the results.
On 3/7/2018 10:38 PM, Alex Gorbachev wrote:
On Wed, Mar 7, 2018 at 8:37 PM, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote:
On Wed, Mar 7, 2018 at 9:43 AM, Cassiano Pilipavicius
<cassiano@xxxxxxxxxxx> wrote:
Hi all, this issue has already been discussed in older threads and I've
already tried most of the solutions proposed there.
I have a small and old ceph cluster (started on hammer and upgraded up to
luminous 12.2.2), connected through a single shared 1GbE link (I know this
is not optimal, but for my workload it handles the load reasonably well). I
use RBD for small VMs under libvirt/qemu.
My problem is: if I need to copy a large file (cp, dd, tar), the read
speed is very low (15 MB/s). I've tested the write speed of a single job
with dd from /dev/zero (direct) to a file, and that speed is good enough
for my environment (80 MB/s).
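For reference, the single-job write test was something like this (the
target path is just an example):

# single stream of zeros, direct I/O, inside the guest
dd if=/dev/zero of=/mnt/data/testfile bs=1M count=4096 oflag=direct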
If I run parallel jobs, I can saturate the network connection; the speed
scales with the number of jobs. I've tried setting readahead in ceph.conf
and in the guest OS.
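The readahead changes I tried were along these lines (the device name and
values are examples, not a recommendation):

# inside the guest: raise readahead on the virtual disk
blockdev --setra 8192 /dev/vda

# on the client/hypervisor: librbd readahead options in ceph.conf
cat >> /etc/ceph/ceph.conf <<'EOF'
[client]
rbd readahead max bytes = 4194304
rbd readahead trigger requests = 10
EOF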
I've never heard any report of a cluster using a single 1GbE link, so
maybe this speed is what I should expect? Next week I will be upgrading
the network to 2 x 10GbE (private and public), but I would like to know
whether there is any issue I need to address before then, as the problem
could be masked by the network upgrade.
If anyone can shed some light, point me in any direction, or just tell
me "this is what you should expect", I would really appreciate it. If
anyone needs more info, please let me know.
Workarounds I have heard of or used:
1. Use fancy striping and parallelize that way (rough sketches of 1-3
follow after this list):
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-April/017744.html
2. Use LVM and set up a striped volume over multiple RBDs
3. Odd, but we have seen improvements in sequential speeds with a larger
object size (16 MB) in the past
4. Caching solutions may help smooth out the peaks and valleys of IO -
bcache, flashcache; we have successfully used EnhanceIO in
writethrough mode
5. Better SSD journals help if using filestore
6. Caching controllers, e.g. Areca
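A minimal sketch of what 1-3 could look like (image/volume names, sizes,
and stripe parameters below are examples only, not tuned recommendations):

# 1/3: an RBD image with fancy striping and a larger object size
rbd create rbd/striped-img --size 100G --object-size 16M \
    --stripe-unit 1M --stripe-count 16

# 2: a striped LVM volume over several plain RBD images mapped on the client
pvcreate /dev/rbd0 /dev/rbd1 /dev/rbd2 /dev/rbd3
vgcreate vg_rbd /dev/rbd0 /dev/rbd1 /dev/rbd2 /dev/rbd3
lvcreate -n lv_stripe -i 4 -I 4M -l 100%FREE vg_rbd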
--
Alex Gorbachev
Storcium
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com