Hi everyone,
I am trying to run multiple jobs using the fio benchmark on a replica volume
with 3 bricks, but after some hours the warning message “W [socket.c:195:__socket_rwv]
0-tcp.ida-server: readv failed (Connection timed out)” appears in the brick logs. I think this
warning may be due to the high workload: under heavy load glusterfs cannot respond
on the socket in time. So I added code in rpc/rpc-transport/socket/src/socket.c to raise the
socket timeout thresholds; SO_RCVTIMEO is now 180s and KEEP_ALIVE is 300s. Then I ran
the workload again, but the warning still appears.
My test environment is as follows:
Three nodes form the gluster cluster. Each node has 16GB of memory, an 8-core
3.3GHz CPU, two 10000baseT/full and one 1000baseT/full network cards, and uses
16 * 2T disks in raid5 as its brick. The glusterfs version is 3.3.1.
I created a 1*3 replica volume from these three nodes. Every node mounts the
volume with fuse through one 10000baseT/full network card. At the same time, every node
mounts fuse_mount_point with cifs through the other 10000baseT/full card.
Each node runs two fio scripts, one for read jobs and one for write jobs. Both scripts operate in
cifs_mount_point. The scripts are as follows:
write_jobs:
while true
do
    mkdir -p ${DIR}_write_${i}
    /usr/local/bin/fio --ioengine=libaio --iodepth=256 --numjobs=100 --rw=write --bs=1k --size=1000m --directory=${DIR}_write_${i} --name=job01_1k_write >> ${DIR}_write_${i}/job01_1k_write.log
    i=`expr $i + 1`
done
read_jobs:
while true
do
    mkdir -p ${DIR}_read_${i}
    /usr/local/bin/fio --ioengine=libaio --iodepth=256 --numjobs=100 --rw=read --bs=1k --size=1000m --directory=${DIR}_read_${i} --name=job01_1k_read >> ${DIR}_read_${i}/job01_1k_read.log
    i=`expr $i + 1`
done
I changed iodepth from 256 to 16 and numjobs from 100 to 25, but the warning
still appears. Has anybody looked into this problem?
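To put a rough number on the load, here is a small sketch of the upper bound on in-flight requests for both settings, and of the data laid out per fio run. This is back-of-the-envelope arithmetic only, not measured values. The function names max_inflight and data_per_run_mib are made up for this illustration. It assumes every job really keeps iodepth requests queued, which fio may not achieve with libaio on buffered I/O (the scripts do not set --direct=1), so treat the result as a ceiling.

```shell
#!/bin/sh
# Upper bound on in-flight I/O requests across the cluster.
# Arguments: nodes, fio scripts per node, numjobs, iodepth.
# Assumption: each job keeps its full iodepth queued at all times.
max_inflight() {
    echo $(( $1 * $2 * $3 * $4 ))
}

# Data covered by one fio invocation, in MiB.
# Arguments: numjobs, size per job in MiB (fio --size is per job).
data_per_run_mib() {
    echo $(( $1 * $2 ))
}

echo "original settings: $(max_inflight 3 2 100 256) requests"
echo "reduced settings:  $(max_inflight 3 2 25 16) requests"
echo "data per fio run:  $(data_per_run_mib 100 1000) MiB"
```

With 3 nodes and 2 scripts per node, the original settings allow up to 153600 outstanding 1k requests and the reduced settings up to 2400. Each fio run with numjobs=100 and size=1000m covers 100000 MiB.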
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://supercolony.gluster.org/mailman/listinfo/gluster-users