Dear all,

we are running a distributed-replicated volume on 4 nodes, with
geo-replication to another location.
Geo-replication ran fine for months, but since 18 January it has been
faulty. The geo-rep log on the master shows the following error in a
loop, while the logs on the slave contain only informational ('I')
messages.

Also suspicious are the frequent 'Shutting down connection' messages in
the brick log while geo-replication is running. They stop at the very
moment geo-replication is stopped.

Unfortunately I did not find any hint on the mailing list or elsewhere
on how to solve this issue.
Has anybody seen this error before, or can anyone give me some hints on
how to proceed?

Any help is appreciated.

Best regards,
Dietmar

Geo-replication log on the master:
[2018-01-19 14:23:20.141123] I [monitor(monitor):267:monitor] Monitor:
------------------------------------------------------------
[2018-01-19 14:23:20.141457] I [monitor(monitor):268:monitor] Monitor:
starting gsyncd worker
[2018-01-19 14:23:20.227952] I [gsyncd(/brick1/mvol1):733:main_i] <top>:
syncing: gluster://localhost:mvol1 ->
ssh://root@gl-slave-01-int:gluster://localhost:svol1
[2018-01-19 14:23:20.235563] I [changelogagent(agent):73:__init__]
ChangelogAgent: Agent listining...
[2018-01-19 14:23:23.55553] I [master(/brick1/mvol1):83:gmaster_builder]
<top>: setting up xsync change detection mode
[2018-01-19 14:23:23.56019] I [master(/brick1/mvol1):367:__init__]
_GMaster: using 'rsync' as the sync engine
[2018-01-19 14:23:23.56989] I [master(/brick1/mvol1):83:gmaster_builder]
<top>: setting up changelog change detection mode
[2018-01-19 14:23:23.57260] I [master(/brick1/mvol1):367:__init__]
_GMaster: using 'rsync' as the sync engine
[2018-01-19 14:23:23.58098] I [master(/brick1/mvol1):83:gmaster_builder]
<top>: setting up changeloghistory change detection mode
[2018-01-19 14:23:23.58454] I [master(/brick1/mvol1):367:__init__]
_GMaster: using 'rsync' as the sync engine
[2018-01-19 14:23:25.123959] I [master(/brick1/mvol1):1249:register]
_GMaster: xsync temp directory:
/var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%4082.199.131.135%3Agluster%3A%2F%2F127.0.0.1%3Asvol1/0a6056eb995956f1dc84f32256dae472/xsync
[2018-01-19 14:23:25.124351] I
[resource(/brick1/mvol1):1528:service_loop] GLUSTER: Register time:
1516371805
[2018-01-19 14:23:25.127505] I [master(/brick1/mvol1):510:crawlwrap]
_GMaster: primary master with volume id
2f5de6e4-66de-40a7-9f24-4762aad3ca96 ...
[2018-01-19 14:23:25.130393] I [master(/brick1/mvol1):519:crawlwrap]
_GMaster: crawl interval: 1 seconds
[2018-01-19 14:23:25.134413] I [master(/brick1/mvol1):466:mgmt_lock]
_GMaster: Got lock : /brick1/mvol1 : Becoming ACTIVE
[2018-01-19 14:23:25.136784] I [master(/brick1/mvol1):1163:crawl]
_GMaster: starting history crawl... turns: 1, stime: (1516248272, 0),
etime: 1516371805
[2018-01-19 14:23:25.139033] I [master(/brick1/mvol1):1192:crawl]
_GMaster: slave's time: (1516248272, 0)
[2018-01-19 14:23:27.157931] E [resource(/brick1/mvol1):234:errlog]
Popen: command "rsync -aR0 --inplace --files-from=- --super --stats
--numeric-ids --no-implied-dirs --xattrs --acls . -e ssh
-oPasswordAuthentication=no -oStrictHostKeyChecking=no -i
/var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto
-S /tmp/gsyncd-aux-ssh-o2j6UA/db73a3bfe7357366aff777392fc60a7e.sock
--compress root@gl-slave-01-int:/proc/398/cwd" returned with 3
[2018-01-19 14:23:27.158600] I [syncdutils(/brick1/mvol1):220:finalize]
<top>: exiting.
[2018-01-19 14:23:27.162561] I [repce(agent):92:service_loop]
RepceServer: terminating on reaching EOF.
[2018-01-19 14:23:27.163053] I [syncdutils(agent):220:finalize] <top>:
exiting.
[2018-01-19 14:23:28.61029] I [monitor(monitor):344:monitor] Monitor:
worker(/brick1/mvol1) died in startup phase
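
For reading the log above, here is a small sketch that decodes the two pieces I find most relevant: the stime/etime values of the history crawl, which I take to be Unix epoch seconds, and the rsync return value. The exit-code descriptions are copied from the EXIT VALUES section of the rsync man page; the helper names `epoch_to_utc` and `describe_rsync_exit` are made up for this illustration and are not part of gsyncd.

```python
from datetime import datetime, timezone

# Subset of the exit codes documented in rsync(1), section "EXIT VALUES".
RSYNC_EXIT_CODES = {
    0: "Success",
    1: "Syntax or usage error",
    2: "Protocol incompatibility",
    3: "Errors selecting input/output files, dirs",
    12: "Error in rsync protocol data stream",
    23: "Partial transfer due to error",
    24: "Partial transfer due to vanished source files",
}


def epoch_to_utc(seconds):
    """Format Unix epoch seconds as a UTC timestamp string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def describe_rsync_exit(code):
    """Return the man-page description for an rsync exit code."""
    return RSYNC_EXIT_CODES.get(code, "unknown exit code")


# Values taken from the 'starting history crawl' line above.
stime = 1516248272
etime = 1516371805

print("stime:", epoch_to_utc(stime))
print("etime:", epoch_to_utc(etime))
print("backlog in hours:", round((etime - stime) / 3600, 1))
print("rsync returned 3:", describe_rsync_exit(3))
```

If the values really are epoch seconds, stime decodes to 2018-01-18 04:04:32 UTC, which matches the day the session became faulty, and etime decodes to the same time of day as the 'Register time' log line. So the worker appears to retry the same backlog on every restart and to die on the first rsync call.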

Brick log (/var/log/glusterfs/bricks/brick1-mvol1.log):
[2018-01-19 14:23:18.264649] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 2bc51718-940f-4a9c-9106-eb8404b95622
[2018-01-19 14:23:18.264689] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-mvol1-server: accepted
client from
gl-master-04-8871-2018/01/19-14:23:18:129523-mvol1-client-0-0-0
(version: 3.7.18)
[2018-01-19 14:23:21.995012] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 2bc51718-940f-4a9c-9106-eb8404b95622
[2018-01-19 14:23:21.995049] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-mvol1-server: accepted
client from
gl-master-01-22759-2018/01/19-14:23:21:928705-mvol1-client-0-0-0
(version: 3.7.18)
[2018-01-19 14:23:23.392692] I [MSGID: 115036]
[server.c:552:server_rpc_notify] 0-mvol1-server: disconnecting
connection from
gl-master-04-8871-2018/01/19-14:23:18:129523-mvol1-client-0-0-0
[2018-01-19 14:23:23.392746] I [MSGID: 101055]
[client_t.c:420:gf_client_unref] 0-mvol1-server: Shutting down
connection gl-master-04-8871-2018/01/19-14:23:18:129523-mvol1-client-0-0-0
[2018-01-19 14:23:25.322559] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 2bc51718-940f-4a9c-9106-eb8404b95622
[2018-01-19 14:23:25.322591] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-mvol1-server: accepted
client from
gl-master-03-17451-2018/01/19-14:23:25:261540-mvol1-client-0-0-0
(version: 3.7.18)
[2018-01-19 14:23:27.164568] W [socket.c:596:__socket_rwv]
0-mvol1-changelog: readv on
/var/run/gluster/.0a6056eb995956f1dc84f32256dae47222743.sock failed (No
data available)
[2018-01-19 14:23:27.164621] I [MSGID: 101053]
[mem-pool.c:640:mem_pool_destroy] 0-mvol1-changelog: size=588 max=0 total=0
[2018-01-19 14:23:27.164641] I [MSGID: 101053]
[mem-pool.c:640:mem_pool_destroy] 0-mvol1-changelog: size=124 max=0 total=0
[2018-01-19 14:23:27.168989] I [MSGID: 115036]
[server.c:552:server_rpc_notify] 0-mvol1-server: disconnecting
connection from
gl-master-01-22759-2018/01/19-14:23:21:928705-mvol1-client-0-0-0
[2018-01-19 14:23:27.169030] I [MSGID: 101055]
[client_t.c:420:gf_client_unref] 0-mvol1-server: Shutting down
connection gl-master-01-22759-2018/01/19-14:23:21:928705-mvol1-client-0-0-0
[2018-01-19 14:23:28.636402] I [login.c:81:gf_auth] 0-auth/login:
allowed user names: 2bc51718-940f-4a9c-9106-eb8404b95622
[2018-01-19 14:23:28.636443] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-mvol1-server: accepted
client from
gl-master-02-17275-2018/01/19-14:23:28:429242-mvol1-client-0-0-0
(version: 3.7.18)
[2018-01-19 14:23:31.728022] I [MSGID: 115036]
[server.c:552:server_rpc_notify] 0-mvol1-server: disconnecting
connection from
gl-master-03-17451-2018/01/19-14:23:25:261540-mvol1-client-0-0-0
[2018-01-19 14:23:31.728086] I [MSGID: 101055]
[client_t.c:420:gf_client_unref] 0-mvol1-server: Shutting down
connection gl-master-03-17451-2018/01/19-14:23:25:261540-mvol1-client-0-0-0

Versions on all gluster nodes:
rsync version 3.1.1 protocol version 31
glusterfs 3.7.18
ubuntu 16.04.3
[ 14:22:43 ] - root@gl-master-01 ~/tmp $gluster volume geo-replication
mvol1 gl-slave-01-int::svol1 config
special_sync_mode: partial
gluster_log_file:
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%4082.199.131.135%3Agluster%3A%2F%2F127.0.0.1%3Asvol1.gluster.log
ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no
-i /var/lib/glusterd/geo-replication/secret.pem
change_detector: changelog
use_meta_volume: true
session_owner: 2f5de6e4-66de-40a7-9f24-4762aad3ca96
state_file:
/var/lib/glusterd/geo-replication/mvol1_gl-slave-01-int_svol1/monitor.status
gluster_params: aux-gfid-mount acl
remote_gsyncd: /nonexistent/gsyncd
working_dir:
/var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%4082.199.131.135%3Agluster%3A%2F%2F127.0.0.1%3Asvol1
state_detail_file:
/var/lib/glusterd/geo-replication/mvol1_gl-slave-01-int_svol1/ssh%3A%2F%2Froot%4082.199.131.135%3Agluster%3A%2F%2F127.0.0.1%3Asvol1-detail.status
gluster_command_dir: /usr/sbin/
pid_file:
/var/lib/glusterd/geo-replication/mvol1_gl-slave-01-int_svol1/monitor.pid
georep_session_working_dir:
/var/lib/glusterd/geo-replication/mvol1_gl-slave-01-int_svol1/
ssh_command_tar: ssh -oPasswordAuthentication=no
-oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
master.stime_xattr_name:
trusted.glusterfs.2f5de6e4-66de-40a7-9f24-4762aad3ca96.256628ab-57c2-44a4-9367-59e1939ade64.stime
changelog_log_file:
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%4082.199.131.135%3Agluster%3A%2F%2F127.0.0.1%3Asvol1-changes.log
socketdir: /var/run/gluster
volume_id: 2f5de6e4-66de-40a7-9f24-4762aad3ca96
ignore_deletes: false
state_socket_unencoded:
/var/lib/glusterd/geo-replication/mvol1_gl-slave-01-int_svol1/ssh%3A%2F%2Froot%4082.199.131.135%3Agluster%3A%2F%2F127.0.0.1%3Asvol1.socket
log_file:
/var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%4082.199.131.135%3Agluster%3A%2F%2F127.0.0.1%3Asvol1.log
[ 14:22:46 ] - root@gl-master-01 ~/tmp $
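
The long file names in the config output above are percent-encoded session URLs. A minimal sketch for decoding them, using only `urllib.parse.unquote` from the Python standard library (the variable names are just for illustration):

```python
from urllib.parse import unquote

# Last path component of working_dir, copied from the config output above.
encoded = (
    "ssh%3A%2F%2Froot%4082.199.131.135%3Agluster%3A%2F%2F127.0.0.1%3Asvol1"
)

# %3A is ':', %2F is '/', %40 is '@'.
decoded = unquote(encoded)
print(decoded)
```

This prints `ssh://root@82.199.131.135:gluster://127.0.0.1:svol1`, so the paths still carry the slave's IP address, while the session itself is addressed as gl-slave-01-int.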