Hello,

I have configured geo-replication with a non-privileged user, following this documentation:
https://docs.gluster.org/en/latest/Administrator%20Guide/Geo%20Replication/

My setup:
(Master) 2-node Gluster replicated volume, Debian 9, Gluster 5.1 (from gluster.org)
(Slave)  2-node Gluster replicated volume for geo-replication, Debian 9, Gluster 5.1 (from gluster.org)
rsync and ssh are the distribution packages on all hosts.
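For completeness, the non-root setup was done roughly along the lines of the guide above. I am writing the steps down from memory, so the group name ("geogroup") and the mountbroker root path are how I remember them from the docs, not a verbatim copy of my shell history:

-----------------
# on the slave nodes: non-privileged user + mountbroker
groupadd geogroup
useradd -G geogroup -m geouser
gluster-mountbroker setup /var/mountbroker-root geogroup
gluster-mountbroker add geo geouser
systemctl restart glusterd          # on all slave nodes after the mountbroker changes

# on one master node (passwordless ssh to geouser@slave-01 was set up beforehand):
gluster system:: execute gsec_create
gluster volume geo-replication gv0 geouser@slave-01::geo create push-pem

# on the slave node, as root (the path of the helper script may differ on Debian)
/usr/libexec/glusterfs/set_geo_rep_pem_keys.sh geouser gv0 geo
-----------------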
As soon as I start geo-replication, I get the following message:

--------------
root@master-01:~# gluster v geo-replication gv0 geouser@slave-01::geo start
Starting geo-replication session between gv0 & geouser@slave-01::geo has been successful
--------------

But then, if I check the status, I see:

--------------
root@master-01:~# gluster v geo-replication gv0 geouser@slave-01::geo status

MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                    SLAVE NODE    STATUS     CRAWL STATUS       LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------------
master-01      gv0           /gluster/brick    geouser       geouser@slave-01::geo    slave-02      Active     Changelog Crawl    2018-12-12 14:08:16
master-02      gv0           /gluster/brick    geouser       geouser@slave-01::geo    slave-02      Passive    N/A                N/A

root@master-01:~# gluster v geo-replication gv0 geouser@slave-01::geo status

MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                    SLAVE NODE    STATUS     CRAWL STATUS    LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------
master-01      gv0           /gluster/brick    geouser       geouser@slave-01::geo    N/A           Faulty     N/A             N/A
master-02      gv0           /gluster/brick    geouser       geouser@slave-01::geo    slave-02      Passive    N/A             N/A
--------------

And so on: the status keeps cycling through Active --> Faulty --> Initializing... --> Active ...

I took a look at the log file on the master and found this:

-----------------
[2018-12-12 13:28:52.187111] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Initializing...
[2018-12-12 13:28:52.187377] I [monitor(monitor):157:monitor] Monitor: starting gsyncd worker slave_node=slave-02 brick=/gluster/brick
[2018-12-12 13:28:52.271593] I [gsyncd(agent /gluster/brick):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/gv0_slave-01_geo/gsyncd.conf
[2018-12-12 13:28:52.275005] I [gsyncd(worker /gluster/brick):308:main] <top>: Using session config file path=/var/lib/glusterd/geo-replication/gv0_slave-01_geo/gsyncd.conf
[2018-12-12 13:28:52.281762] I [changelogagent(agent /gluster/brick):72:__init__] ChangelogAgent: Agent listining...
[2018-12-12 13:28:52.282340] I [resource(worker /gluster/brick):1401:connect_remote] SSH: Initializing SSH connection between master and slave...
[2018-12-12 13:28:53.621534] I [resource(worker /gluster/brick):1448:connect_remote] SSH: SSH connection between master and slave established. duration=1.3390
[2018-12-12 13:28:53.621803] I [resource(worker /gluster/brick):1120:connect] GLUSTER: Mounting gluster volume locally...
[2018-12-12 13:28:54.637206] I [resource(worker /gluster/brick):1143:connect] GLUSTER: Mounted gluster volume duration=1.0152
[2018-12-12 13:28:54.637604] I [subcmds(worker /gluster/brick):80:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2018-12-12 13:28:56.646581] I [master(worker /gluster/brick):1603:register] _GMaster: Working dir path=/var/lib/misc/gluster/gsyncd/gv0_slave-01_geo/gluster-brick
[2018-12-12 13:28:56.647150] I [resource(worker /gluster/brick):1306:service_loop] GLUSTER: Register time time=1544621336
[2018-12-12 13:28:56.663329] I [gsyncdstatus(worker /gluster/brick):281:set_active] GeorepStatus: Worker Status Change status=Active
[2018-12-12 13:28:56.714960] I [gsyncdstatus(worker /gluster/brick):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change status=History Crawl
[2018-12-12 13:28:56.715329] I [master(worker /gluster/brick):1517:crawl] _GMaster: starting history crawl turns=1 entry_stime=None stime=(1544620096, 0) etime=1544621336
[2018-12-12 13:28:57.717641] I [master(worker /gluster/brick):1546:crawl] _GMaster: slave's time stime=(1544620096, 0)
[2018-12-12 13:28:58.193830] I [master(worker /gluster/brick):1954:syncjob] Syncer: Sync Time Taken job=2 duration=0.0457 num_files=1 return_code=12
[2018-12-12 13:28:58.194150] E [syncdutils(worker /gluster/brick):809:errlog] Popen: command returned error cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls --ignore-missing-args . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-bte8zsl6/ecd4cc061ecc1734ea1c9d1a5332f983.sock geouser@slave-01:/proc/6186/cwd error=12
[2018-12-12 13:28:58.197854] I [repce(agent /gluster/brick):97:service_loop] RepceServer: terminating on reaching EOF.
[2018-12-12 13:28:58.644061] I [monitor(monitor):278:monitor] Monitor: worker died in startup phase brick=/gluster/brick
[2018-12-12 13:28:58.713629] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
-----------------

This message keeps repeating.

Here is the error message that I think is the problem:

-----------------
[2018-12-12 13:28:58.194150] E [syncdutils(worker /gluster/brick):809:errlog] Popen: command returned error cmd=rsync -aR0 --inplace --files-from=- --super --stats --numeric-ids --no-implied-dirs --existing --xattrs --acls --ignore-missing-args . -e ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-bte8zsl6/ecd4cc061ecc1734ea1c9d1a5332f983.sock geouser@slave-01:/proc/6186/cwd error=12
-----------------
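If I read the rsync man page correctly, exit code 12 means "error in rsync protocol data stream", which I take as a hint that rsync dies on the slave side rather than locally. The only checks I could think of so far would be to compare the rsync versions on all four nodes and to try a stripped-down rsync by hand against a scratch directory on the slave, roughly like the following. This is only an approximation of what gsyncd actually runs: the real target /proc/<PID>/cwd is the slave worker's aux mount, and as far as I understand the geo-rep key has a forced command in geouser's authorized_keys, so a manual rsync may not even go through the same path.

-----------------
# compare rsync versions on all master and slave nodes
rsync --version | head -n1

# rough manual re-run of the gsyncd rsync against a scratch directory on the slave
printf 'testfile\0' | rsync -aR0 --inplace --files-from=- --stats --numeric-ids \
    -e "ssh -i /var/lib/glusterd/geo-replication/secret.pem -p 22" \
    . geouser@slave-01:/tmp/georep-test/
-----------------

Is something like that a sensible way to narrow this down, or is there a better way to see why rsync exits with 12?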
On the slave, in /var/log/auth.log I see:

-----------------
root@slave-01:~# tail -f /var/log/auth.log
Dec 12 14:42:08 slave-01 sshd[8869]: Did not receive identification string from 192.168.80.81 port 60496
Dec 12 14:42:08 slave-01 sshd[8870]: Accepted publickey for geouser from 192.168.80.81 port 60498 ssh2: RSA SHA256:tnCjr2rrPp+FaYLnxKSaCSxbQ0YJtEDtSWNi5m2jDQ8
Dec 12 14:42:08 slave-01 sshd[8870]: pam_unix(sshd:session): session opened for user geouser by (uid=0)
Dec 12 14:42:09 slave-01 systemd-logind[437]: New session 174 of user geouser.
Dec 12 14:42:14 slave-01 sshd[8876]: Received disconnect from 192.168.80.81 port 60498:11: disconnected by user
Dec 12 14:42:14 slave-01 sshd[8876]: Disconnected from 192.168.80.81 port 60498
Dec 12 14:42:14 slave-01 sshd[8870]: pam_unix(sshd:session): session closed for user geouser
Dec 12 14:42:14 slave-01 systemd-logind[437]: Removed session 174.
-----------------

and in the geo-replication slave log:

-----------------
root@slave-01:~# tail -f /var/log/glusterfs/geo-replication-slaves/gv0_slave-01_geo/gsyncd.log
[2018-12-12 13:43:15.576101] W [gsyncd(slave master-01/gluster/brick):304:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/gv0_slave-01_geo/gsyncd.conf
[2018-12-12 13:43:15.596193] I [resource(slave master-01/gluster/brick):1120:connect] GLUSTER: Mounting gluster volume locally...
[2018-12-12 13:43:16.706221] I [resource(slave master-01/gluster/brick):1143:connect] GLUSTER: Mounted gluster volume duration=1.1099
[2018-12-12 13:43:16.706489] I [resource(slave master-01/gluster/brick):1170:service_loop] GLUSTER: slave listening
[2018-12-12 13:43:21.278398] I [repce(slave master-01/gluster/brick):97:service_loop] RepceServer: terminating on reaching EOF.
-----------------

This message keeps repeating as well.

I took a look, and the file /var/lib/glusterd/geo-replication/gv0_slave-01_geo/gsyncd.conf does not exist. How do I get this file? I have already restarted all 4 nodes (master and slave), but the behaviour is still the same.

Any help would be appreciated.

Stefan
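P.S.: Just a guess on my side: is that session config supposed to be (re)created by the create step, i.e. would re-running something like

-----------------
gluster volume geo-replication gv0 geouser@slave-01::geo create push-pem force
-----------------

on the master regenerate /var/lib/glusterd/geo-replication/gv0_slave-01_geo/gsyncd.conf, or is the warning about the missing config file on the slave harmless and unrelated to the Faulty state?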