I am testing a GlusterFS geo-replication setup (glusterfs 3.12.14 on CentOS Linux release 7.5.1804) and the session keeps going faulty because rsync fails with exit code 3. After I start the session, its status goes from Initializing to Active and finally to Faulty.
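For reference, the rsync man page documents exit code 3 as "errors selecting input/output files, dirs". The status cycle can also be watched from the gluster CLI; the volume and session names below are inferred from the log (master volume mastervol, slave session geoaccount@servere::slavevol):

# per-brick worker and crawl status
gluster volume geo-replication mastervol geoaccount@servere::slavevol status detail

# dump the session configuration, including any extra rsync options
gluster volume geo-replication mastervol geoaccount@servere::slavevol config

status detail shows the same Initializing -> Active -> Faulty cycle described above.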
Here is what I see in the logs on the master node (servera):
cat /var/log/glusterfs/geo-replication/mastervol/ssh%3A%2F%2Fgeoaccount%4010.0.2.13%3Agluster%3A%2F%2F127.0.0.1%3Aslavevol.log
[2018-10-06 08:55:02.246958] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/bricks/brick-a1/brick slave_node=ssh://geoaccount@servere:gluster://localhost:slavevol
[2018-10-06 08:55:02.503489] I [resource(/bricks/brick-a1/brick):1780:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-10-06 08:55:02.515492] I [changelogagent(/bricks/brick-a1/brick):73:__init__] ChangelogAgent: Agent listining...
[2018-10-06 08:55:04.571449] I [resource(/bricks/brick-a1/brick):1787:connect_remote] SSH: SSH connection between master and slave established. duration=2.0676
[2018-10-06 08:55:04.571890] I [resource(/bricks/brick-a1/brick):1502:connect] GLUSTER: Mounting gluster volume locally...
[2018-10-06 08:55:05.693440] I [resource(/bricks/brick-a1/brick):1515:connect] GLUSTER: Mounted gluster volume duration=1.1212
[2018-10-06 08:55:05.693741] I [gsyncd(/bricks/brick-a1/brick):799:main_i] <top>: Closing feedback fd, waking up the monitor
[2018-10-06 08:55:07.711970] I [master(/bricks/brick-a1/brick):1518:register] _GMaster: Working dir path=/var/lib/misc/glusterfsd/mastervol/ssh%3A%2F%2Fgeoaccount%4010.0.2.13%3Agluster%3A%2F%2F127.0.0.1%3Aslavevol/9517ac67e25c7491f03ba5e2506505bd
[2018-10-06 08:55:07.712357] I [resource(/bricks/brick-a1/brick):1662:service_loop] GLUSTER: Register time time=1538816107
[2018-10-06 08:55:07.764151] I [master(/bricks/brick-a1/brick):490:mgmt_lock] _GMaster: Got lock Becoming ACTIVE brick=/bricks/brick-a1/brick
[2018-10-06 08:55:07.768949] I [gsyncdstatus(/bricks/brick-a1/brick):276:set_active] GeorepStatus: Worker Status Change status=Active
[2018-10-06 08:55:07.770529] I [gsyncdstatus(/bricks/brick-a1/brick):248:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl
[2018-10-06 08:55:07.770975] I [master(/bricks/brick-a1/brick):1432:crawl] _GMaster: starting history crawl turns=1 stime=(1538745843, 0) entry_stime=None etime=1538816107
[2018-10-06 08:55:08.773402] I [master(/bricks/brick-a1/brick):1461:crawl] _GMaster: slave's time stime=(1538745843, 0)
[2018-10-06 08:55:09.262964] I [master(/bricks/brick-a1/brick):1863:syncjob] Syncer: Sync Time Taken duration=0.0606 num_files=1 job=2 return_code=3
[2018-10-06 08:55:09.263253] E [resource(/bricks/brick-a1/brick):210:errlog] Popen: command returned error cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls --ignore-missing-args . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-wVbxGU/05b8d7b5dab75575689c0e1a2ec33b3f.sock --compress geoaccount@servere:/proc/12335/cwd error=3
[2018-10-06 08:55:09.275593] I [syncdutils(/bricks/brick-a1/brick):271:finalize] <top>: exiting.
[2018-10-06 08:55:09.279442] I [repce(/bricks/brick-a1/brick):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-10-06 08:55:09.279936] I [syncdutils(/bricks/brick-a1/brick):271:finalize] <top>: exiting.
[2018-10-06 08:55:09.698153] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase brick=/bricks/brick-a1/brick
[2018-10-06 08:55:09.707330] I [gsyncdstatus(monitor):243:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
[2018-10-06 08:55:19.888017] I [monitor(monitor):280:monitor] Monitor: starting gsyncd worker brick=/bricks/brick-a1/brick slave_node=ssh://geoaccount@servere:gluster://localhost:slavevol
[2018-10-06 08:55:20.140819] I [resource(/bricks/brick-a1/brick):1780:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-10-06 08:55:20.141815] I [changelogagent(/bricks/brick-a1/brick):73:__init__] ChangelogAgent: Agent listining...
[2018-10-06 08:55:22.245625] I [resource(/bricks/brick-a1/brick):1787:connect_remote] SSH: SSH connection between master and slave established. duration=2.1046
[2018-10-06 08:55:22.246062] I [resource(/bricks/brick-a1/brick):1502:connect] GLUSTER: Mounting gluster volume locally...
[2018-10-06 08:55:23.370100] I [resource(/bricks/brick-a1/brick):1515:connect] GLUSTER: Mounted gluster volume duration=1.1238
[2018-10-06 08:55:23.370507] I [gsyncd(/bricks/brick-a1/brick):799:main_i] <top>: Closing feedback fd, waking up the monitor
[2018-10-06 08:55:25.388721] I [master(/bricks/brick-a1/brick):1518:register] _GMaster: Working dir path=/var/lib/misc/glusterfsd/mastervol/ssh%3A%2F%2Fgeoaccount%4010.0.2.13%3Agluster%3A%2F%2F127.0.0.1%3Aslavevol/9517ac67e25c7491f03ba5e2506505bd
[2018-10-06 08:55:25.388978] I [resource(/bricks/brick-a1/brick):1662:service_loop] GLUSTER: Register time time=1538816125
[2018-10-06 08:55:25.405546] I [master(/bricks/brick-a1/brick):490:mgmt_lock] _GMaster: Got lock Becoming ACTIVE brick=/bricks/brick-a1/brick
[2018-10-06 08:55:25.408958] I [gsyncdstatus(/bricks/brick-a1/brick):276:set_active] GeorepStatus: Worker Status Change status=Active
[2018-10-06 08:55:25.410522] I [gsyncdstatus(/bricks/brick-a1/brick):248:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl
[2018-10-06 08:55:25.411005] I [master(/bricks/brick-a1/brick):1432:crawl] _GMaster: starting history crawl turns=1 stime=(1538745843, 0) entry_stime=None etime=1538816125
[2018-10-06 08:55:26.413892] I [master(/bricks/brick-a1/brick):1461:crawl] _GMaster: slave's time stime=(1538745843, 0)
[2018-10-06 08:55:26.933149] I [master(/bricks/brick-a1/brick):1863:syncjob] Syncer: Sync Time Taken duration=0.0549 num_files=1 job=3 return_code=3
[2018-10-06 08:55:26.933419] E [resource(/bricks/brick-a1/brick):210:errlog] Popen: command returned error cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls --ignore-missing-args . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-Oq_aPL/05b8d7b5dab75575689c0e1a2ec33b3f.sock --compress geoaccount@servere:/proc/12489/cwd error=3
[2018-10-06 08:55:26.953044] I [syncdutils(/bricks/brick-a1/brick):271:finalize] <top>: exiting.
[2018-10-06 08:55:26.956691] I [repce(/bricks/brick-a1/brick):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-10-06 08:55:26.957233] I [syncdutils(/bricks/brick-a1/brick):271:finalize] <top>: exiting.
[2018-10-06 08:55:27.378103] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase brick=/bricks/brick-a1/brick
[2018-10-06 08:55:27.382554] I [gsyncdstatus(monitor):243:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
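The monitor restarts the worker roughly every ten seconds and it fails the same way each time, so replication never gets past the history crawl. Since gsyncd logs the complete rsync invocation, my next step is to replay it by hand to capture rsync's own error output. This is only a rough sketch, not a verbatim rerun: the -S control socket and the /proc/<pid>/cwd destination exist only while a worker is alive, so I substitute an ordinary slave-side directory and a placeholder file name, and the geoaccount key may be command-restricted on the slave, which could make a direct run behave differently from gsyncd's:

# compare rsync versions first; run on both master and slave, since a
# mismatched or very old rsync is a common cause of sync failures
rsync --version | head -n 1

# replay the logged transfer from a local FUSE mount of mastervol
# (/mnt/mastervol, testfile and /tmp/georep-replay are placeholders)
cd /mnt/mastervol
printf 'testfile\0' | rsync -aR0 --inplace --files-from=- --super \
    --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls \
    --ignore-missing-args . \
    -e "ssh -oPasswordAuthentication=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22" \
    geoaccount@servere:/tmp/georep-replay
echo "rsync exit code: $?"

Has anyone seen geo-replication fail with rsync error 3 like this, and what else should I check on the slave side?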