Re: GlusterFS 3.5.3 - untar: very poor performance

Hello,

For the replicated volume, is this a new issue, or did you just not notice it before? Is it the same baseline as before?

I also see slowness with small files / many files.

For now, I have only been able to improve things with the following tuning:

At the Gluster level:
gluster volume set myvolume performance.io-thread-count 16
gluster volume set myvolume performance.cache-size 1GB
gluster volume set myvolume nfs.disable on
gluster volume set myvolume readdir-ahead enable
gluster volume set myvolume read-ahead disable
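
To confirm the options actually took effect, you can check the "Options Reconfigured" section of the volume info (a quick sketch; "myvolume" is the example volume name used above):

# list the options that have been reconfigured on the volume
gluster volume info myvolume | grep -A 10 'Options Reconfigured'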

At the network level (client and server; I don't have InfiniBand):
sysctl -w vm.swappiness=0
sysctl -w net.core.rmem_max=67108864
sysctl -w net.core.wmem_max=67108864
# increase Linux autotuning TCP buffer limit to 32MB
sysctl -w net.ipv4.tcp_rmem="4096 87380 33554432"
sysctl -w net.ipv4.tcp_wmem="4096 65536 33554432"
# increase the length of the processor input queue
sysctl -w net.core.netdev_max_backlog=30000
# recommended default congestion control is htcp
sysctl -w net.ipv4.tcp_congestion_control=htcp
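
These sysctl settings are lost on reboot; a sketch of how to persist them (appending to /etc/sysctl.conf here, adjust for your distribution):

# persist the same tuning across reboots
cat >> /etc/sysctl.conf <<'EOF'
vm.swappiness = 0
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 87380 33554432
net.ipv4.tcp_wmem = 4096 65536 33554432
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_congestion_control = htcp
EOF
# reload the settings from the file
sysctl -p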

But it's still really slow, even if somewhat better.

Regards,
Mathieu CHATEAU
http://www.lotp.fr

2015-06-20 2:34 GMT+02:00 Geoffrey Letessier <geoffrey.letessier@xxxxxxx>:
Re,

For comparison, here is the output of the same script run on a distributed-only volume (2 of the 4 servers described previously, 2 bricks each):
#######################################################
################  UNTAR time consumed  ################
#######################################################


real 1m44.698s
user 0m8.891s
sys 0m8.353s

#######################################################
#################  DU time consumed  ##################
#######################################################

554M linux-4.1-rc6

real 0m21.062s
user 0m0.100s
sys 0m1.040s

#######################################################
#################  FIND time consumed  ################
#######################################################

52663

real 0m21.325s
user 0m0.104s
sys 0m1.054s

#######################################################
#################  GREP time consumed  ################
#######################################################

7952

real 0m43.618s
user 0m0.922s
sys 0m3.626s

#######################################################
#################  TAR time consumed  #################
#######################################################


real 0m50.577s
user 0m29.745s
sys 0m4.086s

#######################################################
#################  RM time consumed  ##################
#######################################################


real 0m41.133s
user 0m0.171s
sys 0m2.522s

The difference in performance is amazing!

Geoffrey
-----------------------------------------------
Geoffrey Letessier

IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx

On 20 June 2015 at 02:12, Geoffrey Letessier <geoffrey.letessier@xxxxxxx> wrote:

Dear all,

I just noticed that IO operations on the main volume of my HPC cluster have become impressively poor.

Running some file operations on a compressed Linux kernel source archive (roughly 80 MB, with about 52,000 files inside), the untar operation alone can take more than half an hour, as you can read below:
#######################################################
################  UNTAR time consumed  ################
#######################################################


real 32m42.967s
user 0m11.783s
sys 0m15.050s

#######################################################
#################  DU time consumed  ##################
#######################################################

557M linux-4.1-rc6

real 0m25.060s
user 0m0.068s
sys 0m0.344s

#######################################################
#################  FIND time consumed  ################
#######################################################

52663

real 0m25.687s
user 0m0.084s
sys 0m0.387s

#######################################################
#################  GREP time consumed  ################
#######################################################

7952

real 2m15.890s
user 0m0.887s
sys 0m2.777s

#######################################################
#################  TAR time consumed  #################
#######################################################


real 1m5.551s
user 0m26.536s
sys 0m2.609s

#######################################################
#################  RM time consumed  ##################
#######################################################


real 2m51.485s
user 0m0.167s
sys 0m1.663s

For information, this volume is a distributed-replicated one, composed of 4 servers with 2 bricks each. Each brick is a 12-drive RAID6 vdisk with good native performance (around 1.2 GB/s).
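
For reference, a distributed-replicated layout like this is usually created with consecutive bricks forming the replica pairs (a sketch with hypothetical host names and brick paths, not my actual command):

gluster volume create myvol replica 2 \
    srv1:/export/brick1 srv2:/export/brick1 \
    srv3:/export/brick1 srv4:/export/brick1 \
    srv1:/export/brick2 srv2:/export/brick2 \
    srv3:/export/brick2 srv4:/export/brick2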

In comparison, when I use dd to generate a 100 GB file on the same volume, my write throughput is around 1 GB/s (client side) and 500 MB/s (server side) because of replication:
Client side:
[root@node056 ~]# ifstat -i ib0
       ib0        
 KB/s in  KB/s out
 3251.45  1.09e+06
 3139.80  1.05e+06
 3185.29  1.06e+06
 3293.84  1.09e+06
...

Server side:
[root@lucifer ~]# ifstat -i ib0
       ib0        
 KB/s in  KB/s out
561818.1   1746.42
560020.3   1737.92
526337.1   1648.20
513972.7   1613.69
...

DD command:
[root@node056 ~]# dd if=/dev/zero of=/home/root/test.dd bs=1M count=100000
100000+0 records in
100000+0 records out
104857600000 bytes (105 GB) copied, 202.99 s, 517 MB/s
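
One caveat: dd here writes through the client page cache. To double-check the raw write path, the same test could be repeated with direct IO (a sketch; the output path is just an example, and it assumes oflag=direct is honored on this FUSE mount):

# bypass the client page cache for the write
dd if=/dev/zero of=/home/root/test-direct.dd bs=1M count=10000 oflag=direct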

So this issue doesn't seem to come from the network (which is InfiniBand in this case).

You can find in attachments a set of files:
- mybench.sh: the bench script (a minimal sketch of its steps follows this list)
- benches.txt: output of my "bench"
- profile.txt: gluster volume profile during the "bench"
- vol_status.txt: gluster volume status
- vol_info.txt: gluster volume info
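
For readers without the attachments, here is a minimal sketch of what a bench like mybench.sh does (the archive name matches the outputs above; the grep pattern "inline" is only a placeholder for whatever term the real script searches):

#!/bin/bash
# minimal sketch of the benchmark steps; run from a directory on the
# volume containing linux-4.1-rc6.tar.xz
echo "UNTAR"; time tar xJf linux-4.1-rc6.tar.xz
echo "DU";    time du -sh linux-4.1-rc6
echo "FIND";  time find linux-4.1-rc6 | wc -l
echo "GREP";  time grep -r inline linux-4.1-rc6 | wc -l
echo "TAR";   time tar czf linux-4.1-rc6.tar.gz linux-4.1-rc6
echo "RM";    time rm -rf linux-4.1-rc6 linux-4.1-rc6.tar.gz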

Can someone help me fix it? This is very critical, because this volume is on an HPC cluster in production.

Thanks in advance,
Geoffrey
-----------------------------------------------
Geoffrey Letessier

IT manager & systems engineer
CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier@xxxxxxx
<benches.txt>
<mybench.sh>
<profile.txt>
<vol_info.txt>
<vol_status.txt>


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
