On 21-01-2020 11:40, Yaniv Kaul wrote:
> How did you fix this? How did you spot this?
I used iperf3 between the two hosts. It showed that, although bandwidth was near the 1 Gbps limit, there were frequent retransmissions. "netstat -s | grep retran" confirmed that retransmissions happened during my gluster tests.
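For reference, I ran something like the following (the host name is a placeholder):

  # on the receiving host
  iperf3 -s

  # on the sending host; "hostA" stands for the peer's address
  iperf3 -c hostA -t 30

  # compare TCP retransmission counters before/after the gluster test
  netstat -s | grep retran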
> fsync requires each write to land on disk and get an ack on it - it's probably the slowest kind of write you can imagine, and you seem to be doing it for every small (4K?) block. This is not very realistic. But in your case you say you are writing to /dev/shm, so that is strange. Can you try with a different fio engine and see what you get?
True, but putting my bricks on /dev/shm means they actually are in-memory, with no disks/seeks slowing down the syncs. I also tried with a very simple "dd if=/dev/urandom of=test.img bs=4k count=1024" and the results were identical (about 250 IOPS).
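As for trying a different fio engine, I would run something along these lines (mount point, job names and sizes are just placeholders, not my exact jobs):

  # sync engine with an fsync after every 4k write (the worst case)
  fio --name=fsync4k --ioengine=sync --rw=randwrite --bs=4k \
      --size=256m --fsync=1 --directory=/mnt/testvol

  # same workload via libaio with a deeper queue, for comparison
  fio --name=aio4k --ioengine=libaio --direct=1 --iodepth=16 \
      --rw=randwrite --bs=4k --size=256m --directory=/mnt/testvol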
To exclude network latency, I created a local, two-brick, /dev/shm-backed volume (both bricks on the same machine). Mounting the gluster volume locally via FUSE, I only get about 500-600 IOPS.
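The local setup was roughly this ("node1" stands for the machine's own resolvable hostname; "force" is needed because both replica bricks sit on the same server):

  mkdir -p /dev/shm/brick1 /dev/shm/brick2
  gluster volume create testvol replica 2 \
      node1:/dev/shm/brick1 node1:/dev/shm/brick2 force
  gluster volume start testvol
  mount -t glusterfs node1:/testvol /mnt/testvol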
> What are you trying to test here?
Database and virtual machine performance (both of which are fsync- and 4K-heavy).
> Good question. Perhaps profiling would help here. Perhaps too many threads are contending for CPU? Some lock contention?
What kind of profiling should I do? strace on the glusterd process? perf top/stat?
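For instance, I could start with gluster's built-in profiler plus a system-wide perf sample ("testvol" is a placeholder for the volume name):

  # enable per-brick latency/fop statistics on the volume
  gluster volume profile testvol start
  # ... run the fio/dd workload, then dump the stats and stop ...
  gluster volume profile testvol info
  gluster volume profile testvol stop

  # sample CPU hotspots while the test runs
  perf top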
Thanks.

--
Danti Gionatan
Technical Support
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8