On 21-01-2020 11:40, Yaniv Kaul wrote:
> How did you fix this? How did you spot this?
I used iperf3 between the two hosts. It showed that, although bandwidth was near the 1 Gbps limit, there were frequent retransmissions. "netstat -s | grep retran" confirmed that retransmissions happened during my gluster tests.
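For reference, I ran something like the following (the host name is a placeholder):

  # on the receiving host
  iperf3 -s

  # on the sending host; "hostA" stands for the peer's address
  iperf3 -c hostA -t 30

  # compare TCP retransmission counters before/after the gluster test
  netstat -s | grep retran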
> fsync requires each write to land on disk and get an ack on it - it's probably the slowest kind of write you can imagine, and you seem to be doing it for every small (4K?) block. This is not very realistic. But in your case you say you are writing to /dev/shm, so that is strange. Can you try with a different fio engine and see what you get?
True, but putting my bricks on /dev/shm means they actually are in-memory, with no disks/seeks slowing down the syncs. I also tried with a very simple "dd if=/dev/urandom of=test.img bs=4k count=1024" and the results were identical (about 250 IOPS).
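As for trying a different fio engine, I would run something along these lines (mount point, job names and sizes are just placeholders, not my exact jobs):

  # sync engine with an fsync after every 4k write (the worst case)
  fio --name=fsync4k --ioengine=sync --rw=randwrite --bs=4k \
      --size=256m --fsync=1 --directory=/mnt/testvol

  # same workload via libaio with a deeper queue, for comparison
  fio --name=aio4k --ioengine=libaio --direct=1 --iodepth=16 \
      --rw=randwrite --bs=4k --size=256m --directory=/mnt/testvol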
To exclude network latency, I created a local, two-brick, /dev/shm-backed volume (both bricks on the same machine). Mounting the gluster volume locally via FUSE, I only get about 500-600 IOPS.
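The local setup was roughly this ("node1" stands for the machine's own resolvable hostname; "force" is needed because both replica bricks sit on the same server):

  mkdir -p /dev/shm/brick1 /dev/shm/brick2
  gluster volume create testvol replica 2 \
      node1:/dev/shm/brick1 node1:/dev/shm/brick2 force
  gluster volume start testvol
  mount -t glusterfs node1:/testvol /mnt/testvol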
> What are you trying to test here?
Database and virtual machine performance (both of which are fsync- and 4K-heavy).
> Good question. Perhaps profiling would help here. Perhaps too many threads are contending for CPU? Some lock contention?
What kind of profiling should I do? strace on the glusterd process? perf top/stat?
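For instance, I could start with gluster's built-in profiler plus a system-wide perf sample ("testvol" is a placeholder for the volume name):

  # enable per-brick latency/fop statistics on the volume
  gluster volume profile testvol start
  # ... run the fio/dd workload, then dump the stats and stop ...
  gluster volume profile testvol info
  gluster volume profile testvol stop

  # sample CPU hotspots while the test runs
  perf top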
Thanks.

--
Danti Gionatan
Technical Support
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8