More information here:
I updated the peer's state in its UUID file under /var/lib/glusterd/peers (/v/l/g/peers) from state 10 to state 3 (as it is on the other nodes), and now the node is in the cluster.
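Before editing a peer file by hand, it can help to list the recorded state of every peer in one pass. The following is only a minimal sketch: it assumes each file under /var/lib/glusterd/peers is a plain key=value file with `state=` and `hostname1=` lines (an assumption, not something confirmed in this thread), and `list_peer_states` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<file> <hostname> state=<n>" for each peer file.
# Assumes key=value files containing "state=" and "hostname1=" lines.
list_peer_states() {
    local dir=${1:-/var/lib/glusterd/peers}
    local f state host
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Extract the value after "state=" and "hostname1=".
        state=$(sed -n 's/^state=//p' "$f")
        host=$(sed -n 's/^hostname1=//p' "$f")
        printf '%s %s state=%s\n' "$(basename "$f")" "$host" "$state"
    done
}
```

Running `list_peer_states` on each node and comparing the output makes a peer stuck in a different state stand out quickly.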
`gluster system:: execute gsec_create` now creates a proper file on the master node with every node's key in it.
From there, I tried to create my geo-replication session between master nodeB and slaveA:
gluster vol geo myvol slave::myvol create push-pem force
On slaveA I got these error messages in the logs:
[2015-02-02 17:19:04.754809] E [glusterd-geo-rep.c:1686:glusterd_op_stage_copy_file] 0-: Op Version not supported.
[2015-02-02 17:19:04.754890] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Copy File' failed on localhost : One or more nodes do not support the required op version.
[2015-02-02 17:19:07.513547] E [glusterd-geo-rep.c:1620:glusterd_op_stage_sys_exec] 0-: Op Version not supported.
[2015-02-02 17:19:07.513632] E [glusterd-geo-rep.c:1658:glusterd_op_stage_sys_exec] 0-: One or more nodes do not support the required op version.
[2015-02-02 17:19:07.513660] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Execute system commands' failed on localhost : One or more nodes do not support the required op version.
On slaveA the common pem file, with the keys of my 3 nodes from the source site, has been transferred to /var/lib/glusterd/geo-replication/ (/v/l/g/geo/).
But /root/.ssh/authorized_keys is not populated from this file.
From the log I saw that there is a call to a script:
/var/lib/glusterd/hooks/1/gsync-create/post/S56glusterd-geo-rep-create-post.sh --volname=myvol is_push_pem=1 pub_file=/var/lib/glusterd/geo-replication/common_secret.pem.pub slave_ip=slaveA
In this script the following is done:
```
scp $pub_file $slave_ip:$pub_file_tmp
ssh $slave_ip "mv $pub_file_tmp $pub_file"
ssh $slave_ip "gluster system:: copy file /geo-replication/common_secret.pem.pub > /dev/null"
ssh $slave_ip "gluster system:: execute add_secret_pub > /dev/null"
```
The first two commands succeed; the third fails, so the fourth is never executed.
Running the third command on slaveA:
# gluster system:: copy file /geo-replication/common_secret.pem.pub
One or more nodes do not support the required op version.
# gluster peer status
Number of Peers: 0
From the logs:
==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2015-02-02 17:43:29.242524] E [glusterd-geo-rep.c:1686:glusterd_op_stage_copy_file] 0-: Op Version not supported.
[2015-02-02 17:43:29.242610] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Copy File' failed on localhost : One or more nodes do not support the required op version.
One or more nodes do not support the required op version.
For now I have only one node on my remote site.
Anyway, since this step only copies the file across all the cluster members, I can do without it.
The fourth command does not work either:
[root@slaveA geo-replication]# gluster system:: execute add_secret_pub
[2015-02-02 17:44:49.123326] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-02-02 17:44:49.123381] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-02-02 17:44:49.123568] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-02-02 17:44:49.123588] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-02-02 17:44:49.306482] I [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
==> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log <==
[2015-02-02 17:44:49.307921] E [glusterd-geo-rep.c:1620:glusterd_op_stage_sys_exec] 0-: Op Version not supported.
[2015-02-02 17:44:49.308009] E [glusterd-geo-rep.c:1658:glusterd_op_stage_sys_exec] 0-: One or more nodes do not support the required op version.
[2015-02-02 17:44:49.308038] E [glusterd-syncop.c:912:gd_stage_op_phase] 0-management: Staging of operation 'Volume Execute system commands' failed on localhost : One or more nodes do not support the required op version.
One or more nodes do not support the required op version.
==> /var/log/glusterfs/cli.log <==
[2015-02-02 17:44:49.308493] I [input.c:36:cli_batch] 0-: Exiting with: -1
I have only one node… I don't understand the meaning of the error: One or more nodes do not support the required op version.
--
Cyril Peponnet
Every node is connected:
[root@nodeA geo-replication]# gluster peer status
Number of Peers: 2
Hostname: nodeB
Uuid: 6a9da7fc-70ec-4302-8152-0e61929a7c8b
State: Peer in Cluster (Connected)
Hostname: nodeC
Uuid: c12353b5-f41a-4911-9329-fee6a8d529de
State: Peer in Cluster (Connected)
[root@nodeB ~]# gluster peer status
Number of Peers: 2
Hostname: nodeC
Uuid: c12353b5-f41a-4911-9329-fee6a8d529de
State: Peer in Cluster (Connected)
Hostname: nodeA
Uuid: 2ac172bb-a2d0-44f1-9e09-6b054dbf8980
State: Peer is connected and Accepted (Connected)
[root@nodeC geo-replication]# gluster peer status
Number of Peers: 2
Hostname: nodeA
Uuid: 2ac172bb-a2d0-44f1-9e09-6b054dbf8980
State: Peer in Cluster (Connected)
Hostname: nodeB
Uuid: 6a9da7fc-70ec-4302-8152-0e61929a7c8b
State: Peer in Cluster (Connected)
The only difference is the state nodeB reports for nodeA: Peer is connected and Accepted (Connected).
When I run `gluster system:: execute gsec_create` from nodeA or nodeC, the common pem file contains the keys of all 3 nodes. But from nodeB, I only get the keys for nodeB and nodeC. This is unfortunate, as I am trying to launch the geo-replication job from nodeB (master).
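To spot this kind of mismatch without reading each listing by eye, the `gluster peer status` output can be filtered for peers whose state is anything other than "Peer in Cluster". This is a minimal sketch that only parses the text format shown above; `flag_odd_peers` is a hypothetical helper name, not a gluster command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `gluster peer status` text on standard input and
# print "<hostname>: <state>" for every peer not in "Peer in Cluster" state.
flag_odd_peers() {
    awk '
        /^Hostname:/ { host = $2 }
        /^State:/ {
            state = $0
            sub(/^State:[ \t]*/, "", state)
            if (state !~ /^Peer in Cluster/) print host ": " state
        }
    '
}
```

Usage on each node would be `gluster peer status | flag_odd_peers`; empty output means every peer is reported as "Peer in Cluster".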
--
Cyril Peponnet
Looks like node C is in a disconnected state. Please let us know the output of `gluster peer status` from all the master nodes and slave nodes.
--
regards
Aravinda
On 01/22/2015 12:27 AM, PEPONNET, Cyril N (Cyril) wrote:
So,
On master node of my 3 node setup:
1) gluster system:: execute gsec_create
In /var/lib/glusterd/geo-replication/common_secret.pub I have the PEM public keys from master node A and node B (not node C).
On node C I don't have anything in /var/lib/glusterd/geo-replication/ (/v/l/g/geo/) except the gsyncd template config.
So here I have an issue.
The only error I saw on node C is:
[2015-01-21 18:36:41.179601] E [rpc-clnt.c:208:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x23 sent = 2015-01-21 18:26:33.031937. timeout = 600 for xx.xx.xx.xx:24007
On node A, the cli.log looks like:
[2015-01-21 18:49:49.878905] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-01-21 18:49:49.878947] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-01-21 18:49:49.879085] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-01-21 18:49:49.879095] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2015-01-21 18:49:49.951835] I [socket.c:2238:socket_event_handler] 0-transport: disconnecting now
[2015-01-21 18:49:49.972143] I [input.c:36:cli_batch] 0-: Exiting with: 0
If I run `gluster system:: execute gsec_create` on node C or node B, the common pem key file contains the PEM public keys of all 3 nodes. So somehow node A is unable to get the key from node C.
So let’s try to fix this one before going further.
--
Cyril Peponnet
On Jan 20, 2015, at 9:38 PM, Aravinda <avishwan@xxxxxxxxxx> wrote:
On 01/20/2015 11:01 PM, PEPONNET, Cyril N (Cyril) wrote:
Hi,
I'm ready for new testing. I deleted the geo-rep session between master and slave, and removed the lines in the authorized keys file on the slave.
I also removed the common secret pem from the slave and from the master. Only gsyncd_template.conf remains in /var/lib/gluster now.
Here is our setup:
Site A: gluster 3 nodes
Site B: gluster 1 node (for now, a second will come).
I can issue
gluster system:: execute gsec_create
What to check?
common_secret.pem.pub is created at /var/lib/glusterd/geo-replication/common_secret.pem.pub, and it should contain the public keys from all master nodes (Site A). Its entries should match the contents of /var/lib/glusterd/geo-replication/secret.pem.pub and /var/lib/glusterd/geo-replication/tar_ssh.pem.pub.
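One way to run that check on each master node is to test whether a node's own key appears in the common file. This is a minimal sketch, assuming secret.pem.pub is in the usual one-line OpenSSH form (`type key-body comment`); that layout is an assumption here, and `key_in_bundle` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the key body of a one-line OpenSSH public
# key file ($1) appears anywhere in the bundle file ($2).
key_in_bundle() {
    local key
    # Field 2 of "type key-body comment" is the base64 key body.
    key=$(awk '{ print $2; exit }' "$1")
    # Fixed-string match, so any prefix in front of the key is tolerated.
    [ -n "$key" ] && grep -qF -- "$key" "$2"
}
```

For example, `key_in_bundle /var/lib/glusterd/geo-replication/secret.pem.pub /var/lib/glusterd/geo-replication/common_secret.pem.pub || echo "this node's key is missing"`, repeated on every master node, shows which node's key did not make it into the common file.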
then
gluster vol geo geo_test slave::geo_test create push-pem force (force is needed because the slave vol is smaller than the master vol).
What to check?
Check for any errors in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log for an rpm installation, or in /var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log for a source installation. In case of any errors related to hook execution, run the hook command copied from the log directly. From your previous mail I understand there is some issue while executing the hook script. I will look into the issue in the hook script.
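Since the glusterd log mixes informational and error lines, a small filter can narrow it to the errors. This is a minimal sketch that relies only on the line shape visible in this thread (`[timestamp] E [source] ...`); `gluster_log_errors` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only error-level lines from a glusterd log read
# on standard input, i.e. lines shaped like "[timestamp] E [source] ...".
gluster_log_errors() {
    grep -E '^\[[^]]+\] E \['
}
```

For an rpm installation this could be used as `gluster_log_errors < /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail -n 20`.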
I want to use change_detector changelog and not rsync btw.
change_detector is the crawling mechanism. Available options are changelog and xsync; xsync is a filesystem crawl.
The available sync mechanisms are rsync and tarssh.
Can you guide me in setting this up, and also in debugging why it's not working out of the box?
If needed I can get in touch with you through IRC.
Sure. IRC nickname is aravindavk.
Thanks for your help.
--
regards
Aravinda
http://aravindavk.in