Answers inline.

Thanks and Regards,
Kotresh H R

----- Original Message -----
> From: "lejeczek" <peljasz@xxxxxxxxxxx>
> To: gluster-users@xxxxxxxxxxx
> Sent: Wednesday, February 1, 2017 5:48:55 PM
> Subject: geo repl status: faulty & errors
>
> hi everyone,
>
> Trying geo-replication for the first time, I followed the official howto and the
> process claimed "success" up until I checked the status: "Faulty".
> Errors I see:
> ...
> [2017-02-01 12:11:38.103259] I [monitor(monitor):268:monitor] Monitor:
> starting gsyncd worker
> [2017-02-01 12:11:38.342930] I [changelogagent(agent):73:__init__]
> ChangelogAgent: Agent listining...
> [2017-02-01 12:11:38.354500] I
> [gsyncd(/__.aLocalStorages/1/0-GLUSTERs/1GLUSTER-QEMU-VMs):736:main_i]
> <top>: syncing: gluster://localhost:QEMU-VMs ->
> ssh://root@10.5.6.32:gluster://localhost:QEMU-VMs-Replica
> [2017-02-01 12:11:38.581310] E
> [syncdutils(/__.aLocalStorages/1/0-GLUSTERs/1GLUSTER-QEMU-VMs):252:log_raise_exception]
> <top>: connection to peer is broken
> [2017-02-01 12:11:38.581964] E
> [resource(/__.aLocalStorages/1/0-GLUSTERs/1GLUSTER-QEMU-VMs):234:errlog]
> Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no
> -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto
> -S /tmp/gsyncd-aux-ssh-VLX7ff/2bad8986ecbd9ad471c368528e0770f6.sock
> root@10.5.6.32 /nonexistent/gsyncd --session-owner
> 8709782a-daa5-4434-a816-c4e0aef8fef2 -N --listen --timeout 120
> gluster://localhost:QEMU-VMs-Replica" returned with 255, saying:
> [2017-02-01 12:11:38.582236] E
> [resource(/__.aLocalStorages/1/0-GLUSTERs/1GLUSTER-QEMU-VMs):238:logerr]
> Popen: ssh> Permission denied
> (publickey,gssapi-keyex,gssapi-with-mic,password).
> [2017-02-01 12:11:38.582945] I
> [syncdutils(/__.aLocalStorages/1/0-GLUSTERs/1GLUSTER-QEMU-VMs):220:finalize]
> <top>: exiting.
> [2017-02-01 12:11:38.586689] I [repce(agent):92:service_loop] RepceServer:
> terminating on reaching EOF.
> [2017-02-01 12:11:38.587055] I [syncdutils(agent):220:finalize] <top>:
> exiting.
> [2017-02-01 12:11:38.586905] I [monitor(monitor):334:monitor] Monitor:
> worker(/__.aLocalStorages/1/0-GLUSTERs/1GLUSTER-QEMU-VMs) died before
> establishing connection
>
> It's a bit puzzling, as password-less ssh works; I had it in place before
> gluster, so I also tried "create no-verify" just in case.
> This - (/__.aLocalStorages/1/0-GLUSTERs/1GLUSTER-QEMU-VMs) - is the master
> volume, so I understand it is the slave failing here, right?

There is no problem with your passwordless SSH. Geo-replication uses that passwordless SSH connection only during setup, to distribute geo-rep-specific SSH keys to all the slave nodes via a hook script. Check the glusterd logs to see whether the hook script failed for some reason.

> This is just one peer of a two-peer volume. I'd guess the process does not
> even reach the second peer because the first one fails, which is why nothing
> shows in the logs, correct?

No, this log file covers only this node; the workers do not run one after the other. Please check the other master peer node for errors related to that node.

> How to fix?

1. Run "gluster vol geo-rep <mastervol> <slavehost>::<slavevol> create push-pem force" - this should fix the issue.

If you still fail to root-cause the issue, use the tool below to set up the session; it distributes the keys synchronously, so errors are easy to catch:

http://aravindavk.in/blog/introducing-georepsetup/

> many thanks for all the help,
> L.
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://lists.gluster.org/mailman/listinfo/gluster-users
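For reference, the troubleshooting steps above can be sketched as a shell sequence run on a master node. The volume and host names (QEMU-VMs, QEMU-VMs-Replica, 10.5.6.32) are taken from this thread; the glusterd log path is an assumption based on a default install and varies by version and distro:

```shell
# 1. Look for hook-script failures in the glusterd log
#    (log path is an assumption; adjust for your install)
grep -i "hook" /var/log/glusterfs/glusterd.log | tail -n 20

# 2. Recreate the session, force-pushing the geo-rep SSH keys to the slave
gluster volume geo-replication QEMU-VMs 10.5.6.32::QEMU-VMs-Replica \
    create push-pem force

# 3. Verify the workers leave the Faulty state
gluster volume geo-replication QEMU-VMs 10.5.6.32::QEMU-VMs-Replica status
```

This is a CLI fragment, not a runnable script in isolation; it assumes gluster is installed and the volumes already exist on both ends.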