Hello all,
I've been struggling to get GlusterFS geo-replication
working for the last couple of days. I keep getting the
following errors:
[2015-08-10 17:27:07.855817] E
[resource(/gluster/volume1):222:errlog] Popen: command "ssh
-oPasswordAuthentication=no -oStrictHostKeyChecking=no -i
/var/lib/glusterd/geo-replication/secret.pem
-oControlMaster=auto -S
/tmp/gsyncd-aux-ssh-Cnh7xL/ee1e6b6c8823302e93454e632bd81fbe.sock
root@xxxxxxxxxxxxxxxxxxxxx
/nonexistent/gsyncd --session-owner
50600483-7aa3-4fab-a66c-63350af607b0 -N --listen --timeout
120 gluster://localhost:volume1-replicate" returned with
127, saying:
[2015-08-10 17:27:07.856066] E
[resource(/gluster/volume1):226:logerr] Popen: ssh> bash:
/nonexistent/gsyncd: No such file or directory
[2015-08-10 17:27:07.856441] I
[syncdutils(/gluster/volume1):220:finalize] <top>:
exiting.
[2015-08-10 17:27:07.858120] I
[repce(agent):92:service_loop] RepceServer: terminating on
reaching EOF.
[2015-08-10 17:27:07.858361] I
[syncdutils(agent):220:finalize] <top>: exiting.
[2015-08-10 17:27:07.858211] I
[monitor(monitor):274:monitor] Monitor:
worker(/gluster/volume1) died before establishing connection
[2015-08-10 17:27:18.181344] I
[monitor(monitor):221:monitor] Monitor:
------------------------------------------------------------
[2015-08-10 17:27:18.181842] I
[monitor(monitor):222:monitor] Monitor: starting gsyncd
worker
[2015-08-10 17:27:18.389427] D [gsyncd(agent):643:main_i]
<top>: rpc_fd: '7,11,10,9'
[2015-08-10 17:27:18.390553] I
[changelogagent(agent):75:__init__] ChangelogAgent: Agent
listining...
[2015-08-10 17:27:18.418788] D
[repce(/gluster/volume1):191:push] RepceClient: call
8460:140341431777088:1439242038.42 __repce_version__() ...
[2015-08-10 17:27:18.629983] E
[syncdutils(/gluster/volume1):252:log_raise_exception]
<top>: connection to peer is broken
[2015-08-10 17:27:18.630651] W
[syncdutils(/gluster/volume1):256:log_raise_exception]
<top>: !!!!!!!!!!!!!
[2015-08-10 17:27:18.630929] W
[syncdutils(/gluster/volume1):265:log_raise_exception]
<top>: !!!!!!!!!!!!!
[2015-08-10 17:27:18.631129] E
[resource(/gluster/volume1):222:errlog] Popen: command "ssh
-oPasswordAuthentication=no -oStrictHostKeyChecking=no -i
/var/lib/glusterd/geo-replication/secret.pem
-oControlMaster=auto -S
/tmp/gsyncd-aux-ssh-RPuEyN/ee1e6b6c8823302e93454e632bd81fbe.sock
root@xxxxxxxxxxxxxxxxxxxxx
/nonexistent/gsyncd --session-owner
50600483-7aa3-4fab-a66c-63350af607b0 -N --listen --timeout
120 gluster://localhost:volume1-replicate" returned with
127, saying:
[2015-08-10 17:27:18.631280] E
[resource(/gluster/volume1):226:logerr] Popen: ssh> bash:
/nonexistent/gsyncd: No such file or directory
[2015-08-10 17:27:18.631567] I
[syncdutils(/gluster/volume1):220:finalize] <top>:
exiting.
[2015-08-10 17:27:18.633125] I
[repce(agent):92:service_loop] RepceServer: terminating on
reaching EOF.
[2015-08-10 17:27:18.633183] I
[monitor(monitor):274:monitor] Monitor:
worker(/gluster/volume1) died before establishing connection
[2015-08-10 17:27:18.633392] I
[syncdutils(agent):220:finalize] <top>: exiting.
and the status is continuously faulty:
[root@neptune volume1]# gluster volume geo-replication volume01 gluster02::volume01-replicate status
MASTER NODE    MASTER VOL    MASTER BRICK         SLAVE USER    SLAVE                            SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------------
neptune        volume01      /gluster/volume01    root          gluster02::volume01-replicate    N/A           Faulty    N/A             N/A
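As far as I can tell, 127 is the exit status a shell returns
when it cannot find the command it was asked to run, which
fits the "No such file or directory" line in the log. So the
ssh connection itself seems to succeed, and it is the remote
side that fails to find /nonexistent/gsyncd. A minimal sketch
that reproduces the same status locally (the path is taken
from the log; nothing here touches gluster):

```shell
# Run a command at a path that does not exist and capture its exit status.
# /nonexistent/gsyncd is the path shown in the gsyncd log.
status=0
sh -c '/nonexistent/gsyncd' 2>/dev/null || status=$?
echo "exit status: $status"
```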
What I'm trying to accomplish is to mirror a volume from
gluster01 (master) to gluster02 (slave).
Here is a breakdown of the steps I took:
yum -y install glusterfs-server glusterfs-geo-replication
service glusterd start
#gluster01
gluster volume create volume1 gluster01.example.com:/gluster/volume1
gluster volume start volume1
#gluster02
gluster volume create volume1-replicate gluster02.example.com:/gluster/volume1-replicate
gluster volume start volume1-replicate
#geo replicate
gluster system:: execute gsec_create
#gluster01
gluster volume geo-replication volume1 gluster02::volume1-replicate create push-pem
gluster volume geo-replication volume1 gluster02::volume1-replicate start
gluster volume geo-replication volume1 gluster02::volume1-replicate status
#mounting and testing
mkdir /mnt/gluster
mount -t glusterfs gluster01.example.com:/volume1 /mnt/gluster
mount -t glusterfs gluster02.example.com:/volume1-replicate /mnt/gluster
#troubleshooting
gluster volume geo-replication volume1 gluster02::volume1-replicate config log-level DEBUG
service glusterd restart
gluster volume geo-replication volume1 gluster02::volume1-replicate config
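Since the log complains about /nonexistent/gsyncd, one thing
I can check on the slave is where gsyncd actually lives. This
is only a sketch: find_gsyncd is a hypothetical helper, and
the candidate directories are my guesses rather than anything
confirmed by the docs.

```shell
# find_gsyncd: print the first executable named gsyncd found in the
# directories given as arguments. Hypothetical helper for illustration.
find_gsyncd() {
    for dir in "$@"; do
        if [ -x "$dir/gsyncd" ]; then
            printf '%s\n' "$dir/gsyncd"
            return 0
        fi
    done
    return 1
}

# Candidate locations are assumptions; adjust them for your distribution.
find_gsyncd /usr/libexec/glusterfs /usr/lib/glusterfs /usr/local/libexec/glusterfs \
    || echo "gsyncd not found in the candidate directories"
```

On an RPM-based system, listing the package contents with
rpm -ql glusterfs-geo-replication and looking for gsyncd
should answer the same question.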
There was one step I took before running
gluster volume geo-replication volume1 gluster02::volume1-replicate create push-pem
that is not listed above: I copied secret.pub to gluster02
(the slave) and added it to .ssh/authorized_keys. I can ssh
as root from gluster01 to gluster02 fine.
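To test ssh the same way gsyncd does, the command can be
rebuilt from the options shown in the log. A small sketch
that only prints the command (it does not run it);
georep_ssh_cmd is a hypothetical helper and SLAVE_HOST is a
placeholder for the real slave hostname.

```shell
# Key path and ssh options are copied from the gsyncd log.
# SLAVE_HOST is a placeholder; override it with the real slave hostname.
SLAVE_HOST=${SLAVE_HOST:-gluster02.example.com}
KEY=/var/lib/glusterd/geo-replication/secret.pem

# georep_ssh_cmd: print (do not run) the ssh command line for the given
# remote command, using the geo-replication key.
georep_ssh_cmd() {
    printf 'ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i %s root@%s %s\n' \
        "$KEY" "$SLAVE_HOST" "$*"
}

# Example: show the command that would list the path the log complains about.
georep_ssh_cmd ls -l /nonexistent/gsyncd
```

Running the printed command by hand from gluster01 should
exercise the same key and options that gsyncd uses, rather
than whatever default key an interactive ssh might pick up.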
I'm currently running:
glusterfs-3.7.3-1.el7.x86_64
glusterfs-cli-3.7.3-1.el7.x86_64
glusterfs-libs-3.7.3-1.el7.x86_64
glusterfs-client-xlators-3.7.3-1.el7.x86_64
glusterfs-fuse-3.7.3-1.el7.x86_64
glusterfs-server-3.7.3-1.el7.x86_64
glusterfs-api-3.7.3-1.el7.x86_64
glusterfs-geo-replication-3.7.3-1.el7.x86_64
on both slave and master servers. Both servers have ntp
installed, are in sync, and are patched.
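To double-check that the two nodes really carry identical
packages, the lists can be compared mechanically. A rough
sketch: diff_pkgs is a hypothetical helper, and the two input
files are assumed to hold one package name per line (for
example the output of rpm -qa 'glusterfs*' saved on each node).

```shell
# diff_pkgs: print package names that appear in only one of two lists.
# Empty output means the two lists contain the same packages.
# Hypothetical helper; inputs are files with one package per line.
diff_pkgs() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort "$1" > "$a"
    sort "$2" > "$b"
    # comm -3 drops lines common to both files and keeps the differences.
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage (file names are placeholders):
#   diff_pkgs master-packages.txt slave-packages.txt
```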
I can mount volume1 or volume1-replicate on each host, and I
have confirmed that the iptables rules have been flushed.
I'm not sure what else to check at this point. There
appeared to be another user with similar errors, but the
mailing list says he resolved it on his own.
Any ideas? I'm completely lost on what the issue could be.
Some of the Red Hat docs mentioned it could be fuse, but it
looks like fuse is installed as part of gluster.
Thanks