Hi,
Can anyone try to reproduce this problem? I've downgraded to
v3.5.2, which is what I run in prod, and I get the same behavior.
Steps to reproduce:
1. Probe server2, then create and start the volume.
2. Do not mount the volume.
3. Reboot or power off server2, or block server1 in server2's
iptables (with -j DROP, not -j REJECT).
4. On server1, while server2 is rebooting or dropping traffic from
server1, run:
time mount -t glusterfs server1:/volume /some/path
PS: With -j REJECT it mounts instantly. With -j DROP it always waits
2 min 7 s.
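One possible reading of that figure, offered as an assumption and not
a confirmed diagnosis: 2 min 7 s is 127 seconds, which is how long a
Linux TCP connect attempt takes to give up when its SYN packets are
silently dropped, assuming an initial retransmission timeout of 1
second that doubles on each retry and net.ipv4.tcp_syn_retries=6 (a
common default, not checked on these hosts). With -j REJECT the sender
gets an error back immediately, so there is no such wait. The sketch
below only does that arithmetic; the function name
syn_connect_timeout is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: total seconds a TCP connect() waits when every SYN
# is silently dropped, ASSUMING the wait doubles after each attempt.
#   $1 = number of SYN retries (assumed 6, cf. net.ipv4.tcp_syn_retries)
#   $2 = initial retransmission timeout in seconds (assumed 1)
syn_connect_timeout() {
    retries=$1
    rto=${2:-1}
    total=0
    attempt=0
    # One wait after the initial SYN, plus one wait after each retry.
    while [ "$attempt" -le "$retries" ]; do
        total=$((total + rto))
        rto=$((rto * 2))
        attempt=$((attempt + 1))
    done
    echo "$total"
}

# 1 + 2 + 4 + 8 + 16 + 32 + 64 seconds
syn_connect_timeout 6 1
```

If this is what is happening, running sysctl net.ipv4.tcp_syn_retries on
server1 should show a value consistent with the observed wait.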
Thanks!
On 11/11/2014 01:19, Pranith Kumar Karampuri wrote:
On 11/10/2014 11:47 PM, A F wrote:
Hello,
I have two servers, 192.168.0.10 and 192.168.2.10. I'm using
gluster 3.6.1 (installed from gluster repo) on AWS Linux. Both
servers are completely reachable in LAN.
# rpm -qa|grep gluster
glusterfs-3.6.1-1.el6.x86_64
glusterfs-server-3.6.1-1.el6.x86_64
glusterfs-libs-3.6.1-1.el6.x86_64
glusterfs-api-3.6.1-1.el6.x86_64
glusterfs-cli-3.6.1-1.el6.x86_64
glusterfs-fuse-3.6.1-1.el6.x86_64
These are the commands I ran:
# gluster peer probe 192.168.2.10
# gluster volume create aloha replica 2 transport tcp 192.168.0.10:/var/aloha 192.168.2.10:/var/aloha force
# gluster volume start aloha
# gluster volume set aloha network.ping-timeout 5
# gluster volume set aloha nfs.disable on
Problem number 1:
tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log shows
the log being cluttered with:
[2014-11-10 17:41:26.328796] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/38c520c774793c9cdae8ace327512027.socket failed (Invalid argument)
This happens every 3 seconds on both servers. It is related to
NFS and probably rpcbind, but I absolutely want them disabled.
As you can see, I've set gluster to disable NFS, so why doesn't
it keep quiet about it?
Problem number 2:
In fstab on server 192.168.0.10:
192.168.0.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0
In fstab on server 192.168.2.10:
192.168.2.10:/aloha /var/www/hawaii glusterfs defaults,_netdev 0 0
If I shut down one of the servers (192.168.2.10) and reboot the
remaining one (192.168.0.10), it doesn't come up as fast as it
should. It lags a few minutes waiting for gluster. After it
eventually starts, the mount point is not mounted and the volume
is stopped:
# gluster volume status
Status of volume: aloha
Gluster process                          Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.0.10:/var/aloha            N/A     N       N/A
Self-heal Daemon on localhost            N/A     N       N/A

Task Status of Volume aloha
------------------------------------------------------------------------------
There are no active volume tasks
This didn't happen before. Fine, so I first have to stop the
volume and then start it again. It now shows as online:
Brick 192.168.0.10:/var/aloha            49155   Y       3473
Self-heal Daemon on localhost            N/A     Y       3507
# time mount -a
real 2m7.307s
# time mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
real 2m7.365s
# strace mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
(attached)
# tail /var/log/glusterfs/* -f|grep -v readv
(attached)
I've done this setup before, so I'm amazed it doesn't work. I
even have it in production at the moment, with the same options
and setup, and there, for example, I'm not getting the readv
errors. I can't test the mount behavior in production, but I
believe I covered it back when I was first testing that
environment.
Any help is kindly appreciated.
CC glusterd folks
Pranith
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users