Hi Kotresh,
I have been testing for a while, and as you can see from the logs I sent earlier, permission is denied for geouser on the slave node for the file:
/var/log/glusterfs/cli.log
I have turned SELinux off and, just for testing, changed the permissions on /var/log/glusterfs/cli.log so that geouser can access it.
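Something along these lines on the slave nodes (the exact command is from memory and the mode was chosen only for this test, not as a proper fix):

chmod 666 /var/log/glusterfs/cli.log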
After that, starting geo-replication reports success, but all nodes end up with status Faulty.
If I run: gluster-mountbroker status
I get:
+-----------------------------+-------------+---------------------------+--------------+--------------------------+
| NODE | NODE STATUS | MOUNT ROOT | GROUP | USERS |
+-----------------------------+-------------+---------------------------+--------------+--------------------------+
| urd-gds-geo-001.hgen.slu.se | UP | /var/mountbroker-root(OK) | geogroup(OK) | geouser(urd-gds-volume) |
| urd-gds-geo-002 | UP | /var/mountbroker-root(OK) | geogroup(OK) | geouser(urd-gds-volume) |
| localhost | UP | /var/mountbroker-root(OK) | geogroup(OK) | geouser(urd-gds-volume) |
+-----------------------------+-------------+---------------------------+--------------+--------------------------+
and those are all the nodes in the slave cluster, so the mountbroker setup seems OK.
gsyncd.log logs an error that /usr/local/sbin/gluster is missing.
That is correct, because gluster is installed as /sbin/gluster and /usr/sbin/gluster.
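This is what I am thinking of trying next, assuming the geo-replication config options gluster-command-dir and slave-gluster-command-dir exist in 4.1 and take a directory path (I have not verified these option names):

gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume config gluster-command-dir /usr/sbin/
gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume config slave-gluster-command-dir /usr/sbin/

so that gsyncd stops looking for the binary under /usr/local/sbin/.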
Another error is that the SSH connection between master and slave is broken,
but now that I have changed the permissions on /var/log/glusterfs/cli.log I can run:
ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 geouser@urd-gds-geo-001 gluster --xml --remote-host=localhost volume info urd-gds-volume
as geouser and that works, so the SSH connection itself is fine.
Are the permissions on /var/log/glusterfs/cli.log supposed to be changed when geo-replication is set up?
Is gluster supposed to be in /usr/local/sbin/gluster?
Do I have any other options, or should I remove the current geo-replication and create a new one?
How much do I need to clean up before creating a new geo-replication?
In that case, can I pause geo-replication, mount the slave volume on the master cluster, and run rsync, just to speed up the transfer of files? Roughly as sketched below.
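What I have in mind, assuming the slave volume can be fuse-mounted from a master node and that pause/resume behave as documented (the mount points are just placeholders):

gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume pause
mount -t glusterfs localhost:/urd-gds-volume /mnt/master
mount -t glusterfs urd-gds-geo-001:/urd-gds-volume /mnt/slave
rsync -aHAX --progress /mnt/master/ /mnt/slave/
umount /mnt/master /mnt/slave
gluster volume geo-replication urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume resume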
Many thanks in advance!
Marcus Pedersén
Excerpt from gsyncd.log:
[2018-07-16 19:34:56.26287] E [syncdutils(worker /urd-gds/gluster):749:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-WrbZ22/bf60c68f1a195dad59573a8dbaa309f2.sock geouser@urd-gds-geo-001 /nonexistent/gsyncd slave urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume --master-node urd-gds-001 --master-node-id 912bebfd-1a7f-44dc-b0b7-f001a20d58cd --master-brick /urd-gds/gluster --local-node urd-gds-geo-000 --local-node-id 03075698-2bbf-43e4-a99a-65fe82f61794 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/local/sbin/ error=1
[2018-07-16 19:34:56.26583] E [syncdutils(worker /urd-gds/gluster):753:logerr] Popen: ssh> failure: execution of "/usr/local/sbin/gluster" failed with ENOENT (No such file or directory)
[2018-07-16 19:34:56.33901] I [repce(agent /urd-gds/gluster):89:service_loop] RepceServer: terminating on reaching EOF.
[2018-07-16 19:34:56.34307] I [monitor(monitor):262:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
[2018-07-16 19:35:06.59412] I [monitor(monitor):158:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=urd-gds-geo-000
[2018-07-16 19:35:06.99509] I [gsyncd(worker /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-07-16 19:35:06.99561] I [gsyncd(agent /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-07-16 19:35:06.100481] I [changelogagent(agent /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
[2018-07-16 19:35:06.108834] I [resource(worker /urd-gds/gluster):1348:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-07-16 19:35:06.762320] E [syncdutils(worker /urd-gds/gluster):303:log_raise_exception] <top>: connection to peer is broken
[2018-07-16 19:35:06.763103] E [syncdutils(worker /urd-gds/gluster):749:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-K9mB6Q/bf60c68f1a195dad59573a8dbaa309f2.sock geouser@urd-gds-geo-001 /nonexistent/gsyncd slave urd-gds-volume geouser@urd-gds-geo-001::urd-gds-volume --master-node urd-gds-001 --master-node-id 912bebfd-1a7f-44dc-b0b7-f001a20d58cd --master-brick /urd-gds/gluster --local-node urd-gds-geo-000 --local-node-id 03075698-2bbf-43e4-a99a-65fe82f61794 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/local/sbin/ error=1
[2018-07-16 19:35:06.763398] E [syncdutils(worker /urd-gds/gluster):753:logerr] Popen: ssh> failure: execution of "/usr/local/sbin/gluster" failed with ENOENT (No such file or directory)
[2018-07-16 19:35:06.771905] I [repce(agent /urd-gds/gluster):89:service_loop] RepceServer: terminating on reaching EOF.
[2018-07-16 19:35:06.772272] I [monitor(monitor):262:monitor] Monitor: worker died before establishing connection brick=/urd-gds/gluster
[2018-07-16 19:35:16.786387] I [monitor(monitor):158:monitor] Monitor: starting gsyncd worker brick=/urd-gds/gluster slave_node=urd-gds-geo-000
[2018-07-16 19:35:16.828056] I [gsyncd(worker /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-07-16 19:35:16.828066] I [gsyncd(agent /urd-gds/gluster):297:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-07-16 19:35:16.828912] I [changelogagent(agent /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
[2018-07-16 19:35:16.837100] I [resource(worker /urd-gds/gluster):1348:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-07-16 19:35:17.260257] E [syncdutils(worker /urd-gds/gluster):303:log_raise_exception] <top>: connection to peer is broken
From: gluster-users-bounces@xxxxxxxxxxx <gluster-users-bounces@xxxxxxxxxxx> on behalf of Marcus Pedersén <marcus.pedersen@xxxxxx>
Sent: 13 July 2018 14:50
To: Kotresh Hiremath Ravishankar
Cc: gluster-users@xxxxxxxxxxx
Subject: Re: Upgrade to 4.1.1 geo-replication does not work
Hi Kotresh,
Yes, all nodes have the same version 4.1.1 both master and slave.
All glusterd are crashing on the master side.
Will send logs tonight.
Thanks,
Marcus
################
Marcus Pedersén
System Administrator
Interbull Centre
################
Sent from my phone
################
---
E-mailing SLU will result in SLU processing your personal data.
---