Hello,
For the replicated volume, is this a new issue, or had you just not noticed it before? Is it the same baseline as before?
I also see slowness with small files and with large numbers of files.
So far, the only tuning I have been able to do is the following.
At the Gluster level:
gluster volume set myvolume performance.io-thread-count 16
gluster volume set myvolume performance.cache-size 1GB
gluster volume set myvolume nfs.disable on
gluster volume set myvolume readdir-ahead enable
gluster volume set myvolume read-ahead disable
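
If it helps, the same options can be applied from a small script. This is only a sketch: VOLUME, DRY_RUN and apply_options are names I made up for illustration, and with DRY_RUN=1 (the default here) it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: apply a list of GlusterFS volume options in one pass.
# VOLUME, DRY_RUN and apply_options are illustrative names (assumptions).
VOLUME="${VOLUME:-myvolume}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

apply_options() {
    # Reads "option value" pairs from stdin, one pair per line.
    while read -r opt val; do
        [ -z "$opt" ] && continue
        if [ "$DRY_RUN" = "1" ]; then
            echo "gluster volume set $VOLUME $opt $val"
        else
            # </dev/null keeps the CLI from consuming the option list on stdin.
            gluster volume set "$VOLUME" "$opt" "$val" </dev/null || return 1
        fi
    done
}

# The option list is the one given above in this mail.
apply_options <<'EOF'
performance.io-thread-count 16
performance.cache-size 1GB
nfs.disable on
readdir-ahead enable
read-ahead disable
EOF
```

Running it with DRY_RUN=0 on a server applies the settings; "gluster volume info myvolume" should then list them among the reconfigured options.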
At the kernel/network level (on both clients and servers; I don't have InfiniBand):
sysctl -w vm.swappiness=0
sysctl -w net.core.rmem_max=67108864
sysctl -w net.core.wmem_max=67108864
# increase Linux autotuning TCP buffer limit to 32MB
sysctl -w net.ipv4.tcp_rmem="4096 87380 33554432"
sysctl -w net.ipv4.tcp_wmem="4096 65536 33554432"
# increase the length of the processor input queue
sysctl -w net.core.netdev_max_backlog=30000
# recommended default congestion control is htcp
sysctl -w net.ipv4.tcp_congestion_control=htcp
But it's still really slow, even if it is better than before.
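
Note that values set with "sysctl -w" are lost at reboot. A minimal sketch for keeping them, assuming a distribution that reads fragments from /etc/sysctl.d (emit_sysctl_conf is just an illustrative name, and the values are the ones listed above):

```shell
#!/bin/sh
# Sketch: print the sysctl settings from this mail as a sysctl.d-style fragment.
# emit_sysctl_conf is an illustrative name, not an existing tool.
emit_sysctl_conf() {
    cat <<'EOF'
vm.swappiness = 0
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
# increase Linux autotuning TCP buffer limit to 32MB
net.ipv4.tcp_rmem = 4096 87380 33554432
net.ipv4.tcp_wmem = 4096 65536 33554432
# increase the length of the processor input queue
net.core.netdev_max_backlog = 30000
# recommended default congestion control is htcp
net.ipv4.tcp_congestion_control = htcp
EOF
}

# Print the fragment; redirect it to a file to install it (see the note below).
emit_sysctl_conf
```

As root, something like "emit_sysctl_conf > /etc/sysctl.d/90-gluster-tuning.conf" followed by "sysctl --system" should load it. The file name is hypothetical, and older systems may only read /etc/sysctl.conf (loaded with "sysctl -p").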
2015-06-20 2:34 GMT+02:00 Geoffrey Letessier <geoffrey.letessier@xxxxxxx>:
Re,

For comparison, here is the output of the same script run on a distributed-only volume (2 of the 4 servers previously described, 2 bricks each):

#######################################################
################  UNTAR time consumed  ################
#######################################################
real 1m44.698s
user 0m8.891s
sys 0m8.353s

#######################################################
#################  DU time consumed  ##################
#######################################################
554M linux-4.1-rc6
real 0m21.062s
user 0m0.100s
sys 0m1.040s

#######################################################
#################  FIND time consumed  ################
#######################################################
52663
real 0m21.325s
user 0m0.104s
sys 0m1.054s

#######################################################
#################  GREP time consumed  ################
#######################################################
7952
real 0m43.618s
user 0m0.922s
sys 0m3.626s

#######################################################
#################  TAR time consumed  #################
#######################################################
real 0m50.577s
user 0m29.745s
sys 0m4.086s

#######################################################
#################  RM time consumed  ##################
#######################################################
real 0m41.133s
user 0m0.171s
sys 0m2.522s

The performance is strikingly different!

Geoffrey
-----------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx

On 20 June 2015 at 02:12, Geoffrey Letessier <geoffrey.letessier@xxxxxxx> wrote:

<benches.txt>

Dear all,

I just noticed that I/O performance on the main volume of my HPC cluster has become remarkably poor.

Running some file operations on a compressed Linux kernel source archive, the untar operation alone can take more than half an hour (the archive is roughly 80 MB and contains about 52,000 files), as you can read below:

#######################################################
################  UNTAR time consumed  ################
#######################################################
real 32m42.967s
user 0m11.783s
sys 0m15.050s

#######################################################
#################  DU time consumed  ##################
#######################################################
557M linux-4.1-rc6
real 0m25.060s
user 0m0.068s
sys 0m0.344s

#######################################################
#################  FIND time consumed  ################
#######################################################
52663
real 0m25.687s
user 0m0.084s
sys 0m0.387s

#######################################################
#################  GREP time consumed  ################
#######################################################
7952
real 2m15.890s
user 0m0.887s
sys 0m2.777s

#######################################################
#################  TAR time consumed  #################
#######################################################
real 1m5.551s
user 0m26.536s
sys 0m2.609s

#######################################################
#################  RM time consumed  ##################
#######################################################
real 2m51.485s
user 0m0.167s
sys 0m1.663s

For information, this volume is a distributed-replicated one made up of 4 servers with 2 bricks each. Each brick is a 12-drive RAID6 vdisk with good native performance (around 1.2 GB/s).

In comparison, when I use dd to generate a 100 GB file on the same volume, my write throughput is around 1 GB/s (client side) and 500 MB/s (server side) because of replication:

Client side:
[root@node056 ~]# ifstat -i ib0
       ib0
 KB/s in  KB/s out
 3251.45  1.09e+06
 3139.80  1.05e+06
 3185.29  1.06e+06
 3293.84  1.09e+06
...

Server side:
[root@lucifer ~]# ifstat -i ib0
       ib0
 KB/s in  KB/s out
561818.1   1746.42
560020.3   1737.92
526337.1   1648.20
513972.7   1613.69
...

DD command:
[root@node056 ~]# dd if=/dev/zero of=/home/root/test.dd bs=1M count=100000
100000+0 records in
100000+0 records out
104857600000 bytes (105 GB) copied, 202.99 s, 517 MB/s

So this issue doesn’t seem to come from the network (which is InfiniBand in this case).

You can find attached a set of files:
- mybench.sh: the bench script
- benches.txt: output of my "bench"
- profile.txt: gluster volume profile during the "bench"
- vol_status.txt: gluster volume status
- vol_info.txt: gluster volume info

Can someone help me fix it? It’s very critical because this volume is on an HPC cluster in production.

Thanks in advance,
Geoffrey
-----------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
<mybench.sh><profile.txt><vol_info.txt><vol_status.txt>
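
To put a number on the difference between the two runs quoted in this thread, the "real" times can be compared directly. This is a rough sketch only: to_seconds and slowdown are helper names invented here, and the two runs were on different volumes with different server counts, so the ratios are indicative at best.

```shell
#!/bin/sh
# Sketch: compare the "real" times of the two benchmark runs quoted in this thread.
# to_seconds and slowdown are illustrative helper names (assumptions).

# Convert a `time`-style duration such as "32m42.967s" to seconds.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times longer the first duration is than the second.
slowdown() {
    LC_ALL=C awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# Columns: operation, replicated volume, distributed-only volume.
while read -r op replicated distributed; do
    echo "$op: $(slowdown "$replicated" "$distributed")x"
done <<'EOF'
untar 32m42.967s 1m44.698s
du 0m25.060s 0m21.062s
find 0m25.687s 0m21.325s
grep 2m15.890s 0m43.618s
tar 1m5.551s 0m50.577s
rm 2m51.485s 0m41.133s
EOF
```

With these figures the untar step comes out roughly 19 times slower on the replicated volume, rm about 4 times and grep about 3 times, while du, find and tar stay within a factor of about 1.3. That points at file creation and deletion rather than reads or raw throughput.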
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users