Re: geo-replication unprivileged user error

Thanks Aravinda!

The problem was duplicate keys in the authorized_keys file: one entry had the "command=" prefix, and the other was exactly the same key but starting with "ssh-rsa".
I’ve removed the one starting with "ssh-rsa", and the session is now working fine :D
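For anyone hitting the same thing: the entry I kept is the one in roughly this form (key material truncated here, and the trailing comment will differ per setup), while the duplicate I deleted began directly with ssh-rsa:

command="/usr/libexec/glusterfs/gsyncd" ssh-rsa AAAA...truncated... root@master-host01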

I’ll do some failure tests, then I’ll update you with the results.

—Bishoy
 
On Mar 31, 2016, at 1:22 AM, Aravinda <avishwan@xxxxxxxxxx> wrote:

Hi,

From the error, I understand that the SSH connection is failing. On slave-host02, extra entries are present in /home/guser/.ssh/authorized_keys.

In /home/guser/.ssh/authorized_keys, please delete the extra lines which do not start with "command=", then stop and start the geo-replication session.
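For example, with the session names from your status output below, that would be something like:

gluster volume geo-replication geotest guser@slave-host01::geotestdr stop
gluster volume geo-replication geotest guser@slave-host01::geotestdr start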
Regards,
Aravinda
On 03/31/2016 04:00 AM, Gmail wrote:
I’ve rebuilt the cluster from a fresh installation, and now the error is different.





MASTER NODE                MASTER VOL    MASTER BRICK              SLAVE USER    SLAVE                                SLAVE NODE          STATUS     CRAWL STATUS    LAST_SYNCED          
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
master-host01.me.com    geotest       /gpool/brick03/geotest    guser         guser@slave-host01::geotestdr    N/A                 Faulty     N/A             N/A                  
master-host02.me.com    geotest       /gpool/brick03/geotest    guser         guser@slave-host01::geotestdr    slave-host01    Passive    N/A             N/A                  
master-host03.me.com    geotest       /gpool/brick03/geotest    guser         guser@slave-host01::geotestdr    slave-host03    Passive    N/A             N/A    






[2016-03-30 22:09:31.326898] I [monitor(monitor):221:monitor] Monitor: ------------------------------------------------------------
[2016-03-30 22:09:31.327461] I [monitor(monitor):222:monitor] Monitor: starting gsyncd worker
[2016-03-30 22:09:31.544631] I [gsyncd(/gpool/brick03/geotest):649:main_i] <top>: syncing: gluster://localhost:geotest -> ssh://guser@slave-host02:gluster://localhost:geotestdr
[2016-03-30 22:09:31.547542] I [changelogagent(agent):75:__init__] ChangelogAgent: Agent listining...
[2016-03-30 22:09:31.830554] E [syncdutils(/gpool/brick03/geotest):252:log_raise_exception] <top>: connection to peer is broken
[2016-03-30 22:09:31.831017] W [syncdutils(/gpool/brick03/geotest):256:log_raise_exception] <top>: !!!!!!!!!!!!!
[2016-03-30 22:09:31.831258] W [syncdutils(/gpool/brick03/geotest):257:log_raise_exception] <top>: !!! getting "No such file or directory" errors is most likely due to MISCONFIGURATION, please consult https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html
[2016-03-30 22:09:31.831502] W [syncdutils(/gpool/brick03/geotest):265:log_raise_exception] <top>: !!!!!!!!!!!!!
[2016-03-30 22:09:31.836395] E [resource(/gpool/brick03/geotest):222:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-SfXvbB/de372ce5774b5d259c58c5c9522ffc8f.sock guser@slave-host02 /nonexistent/gsyncd --session-owner ec473e17-b933-4bf7-9eed-4c393f7aaf5d -N --listen --timeout 120 gluster://localhost:geotestdr" returned with 127, saying:
[2016-03-30 22:09:31.836694] E [resource(/gpool/brick03/geotest):226:logerr] Popen: ssh> bash: /nonexistent/gsyncd: No such file or directory
[2016-03-30 22:09:31.837193] I [syncdutils(/gpool/brick03/geotest):220:finalize] <top>: exiting.
[2016-03-30 22:09:31.840569] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2016-03-30 22:09:31.840993] I [syncdutils(agent):220:finalize] <top>: exiting.
[2016-03-30 22:09:31.840742] I [monitor(monitor):274:monitor] Monitor: worker(/gpool/brick03/geotest) died before establishing connection
[2016-03-30 22:09:42.130866] I [monitor(monitor):221:monitor] Monitor: ------------------------------------------------------------
[2016-03-30 22:09:42.131448] I [monitor(monitor):222:monitor] Monitor: starting gsyncd worker
[2016-03-30 22:09:42.348165] I [gsyncd(/gpool/brick03/geotest):649:main_i] <top>: syncing: gluster://localhost:geotest -> ssh://guser@slave-host02:gluster://localhost:geotestdr
[2016-03-30 22:09:42.349118] I [changelogagent(agent):75:__init__] ChangelogAgent: Agent listining...
[2016-03-30 22:09:42.653141] E [syncdutils(/gpool/brick03/geotest):252:log_raise_exception] <top>: connection to peer is broken
[2016-03-30 22:09:42.653656] W [syncdutils(/gpool/brick03/geotest):256:log_raise_exception] <top>: !!!!!!!!!!!!!
[2016-03-30 22:09:42.653898] W [syncdutils(/gpool/brick03/geotest):257:log_raise_exception] <top>: !!! getting "No such file or directory" errors is most likely due to MISCONFIGURATION, please consult https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.1/html/Administration_Guide/chap-User_Guide-Geo_Rep-Preparation-Settingup_Environment.html
[2016-03-30 22:09:42.654129] W [syncdutils(/gpool/brick03/geotest):265:log_raise_exception] <top>: !!!!!!!!!!!!!
[2016-03-30 22:09:42.659329] E [resource(/gpool/brick03/geotest):222:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-6r8rxx/de372ce5774b5d259c58c5c9522ffc8f.sock guser@slave-host02 /nonexistent/gsyncd --session-owner ec473e17-b933-4bf7-9eed-4c393f7aaf5d -N --listen --timeout 120 gluster://localhost:geotestdr" returned with 127, saying:
[2016-03-30 22:09:42.659626] E [resource(/gpool/brick03/geotest):226:logerr] Popen: ssh> bash: /nonexistent/gsyncd: No such file or directory
[2016-03-30 22:09:42.660140] I [syncdutils(/gpool/brick03/geotest):220:finalize] <top>: exiting.
[2016-03-30 22:09:42.662802] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2016-03-30 22:09:42.663197] I [syncdutils(agent):220:finalize] <top>: exiting.
[2016-03-30 22:09:42.663024] I [monitor(monitor):274:monitor] Monitor: worker(/gpool/brick03/geotest) died before establishing connection
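If I read the Popen line right, the master deliberately passes /nonexistent/gsyncd and relies on the "command=" entry in the slave’s authorized_keys to substitute the real gsyncd path; exit code 127 means the literal path got executed instead. A quick manual check from the master, reusing the key and host from the log above:

ssh -oPasswordAuthentication=no -i /var/lib/glusterd/geo-replication/secret.pem guser@slave-host02 echo test

If that prints "test", the key that matched on the slave has no forced command, which would explain the 127.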


—Bishoy

On Mar 30, 2016, at 10:50 AM, Gmail <b.s.mikhael@xxxxxxxxx> wrote:

I’ve tried changing the permissions to 777 on /var/log/glusterfs on all the slave nodes, but still no luck :(
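(One thing I’m not sure about: chmod on /var/log/glusterfs itself isn’t recursive, and the file that fails to open in the log below lives under geo-replication-slaves/mbr/. If it is a permissions problem, something along these lines on each slave node might be needed, where <geogroup> stands for whatever group guser belongs to:

# as root, on each slave node
chgrp -R <geogroup> /var/log/glusterfs/geo-replication-slaves
chmod -R 770 /var/log/glusterfs/geo-replication-slaves

That is only a guess on my part.)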

Here is the log from the master node where I created and started the geo-replication session:

[2016-03-30 17:14:53.463150] I [monitor(monitor):221:monitor] Monitor: ------------------------------------------------------------
[2016-03-30 17:14:53.463669] I [monitor(monitor):222:monitor] Monitor: starting gsyncd worker
[2016-03-30 17:14:53.603774] I [changelogagent(agent):75:__init__] ChangelogAgent: Agent listining...
[2016-03-30 17:14:53.604080] I [gsyncd(/mnt/brick10/xfsvol2):649:main_i] <top>: syncing: gluster://localhost:xfsvol2 -> ssh://guser@slave-host01:gluster://localhost:xfsvol2dr
[2016-03-30 17:14:54.210602] E [syncdutils(/mnt/brick10/xfsvol2):252:log_raise_exception] <top>: connection to peer is broken
[2016-03-30 17:14:54.211117] E [resource(/mnt/brick10/xfsvol2):222:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-evONxc/3bda60dc6e900c0833fed4e4fdfbd480.sock guser@slave-host01 /nonexistent/gsyncd --session-owner ef9ccae5-0def-4a47-9a96-881a1896755c -N --listen --timeout 120 gluster://localhost:xfsvol2dr" returned with 1, saying:
[2016-03-30 17:14:54.211376] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> [2016-03-30 17:14:53.933174] I [cli.c:720:main] 0-cli: Started running /usr/sbin/gluster with version 3.7.3
[2016-03-30 17:14:54.211631] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> [2016-03-30 17:14:53.933225] I [cli.c:608:cli_rpc_init] 0-cli: Connecting to remote glusterd at localhost
[2016-03-30 17:14:54.211828] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> [2016-03-30 17:14:54.074207] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-03-30 17:14:54.212017] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> [2016-03-30 17:14:54.074302] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2016-03-30 17:14:54.212199] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> [2016-03-30 17:14:54.077207] I [cli-rpc-ops.c:6230:gf_cli_getwd_cbk] 0-cli: Received resp to getwd
[2016-03-30 17:14:54.212380] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> [2016-03-30 17:14:54.077269] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2016-03-30 17:14:54.212584] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> ERROR:root:FAIL: 
[2016-03-30 17:14:54.212774] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> Traceback (most recent call last):
[2016-03-30 17:14:54.212954] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 165, in main
[2016-03-30 17:14:54.213131] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>     main_i()
[2016-03-30 17:14:54.213308] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 638, in main_i
[2016-03-30 17:14:54.213500] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>     startup(go_daemon=go_daemon, log_file=log_file, label=label)
[2016-03-30 17:14:54.213690] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 144, in startup
[2016-03-30 17:14:54.213890] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>     GLogger._gsyncd_loginit(**kw)
[2016-03-30 17:14:54.214068] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 107, in _gsyncd_loginit
[2016-03-30 17:14:54.214246] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>     cls.setup(label=kw.get('label'), **lkw)
[2016-03-30 17:14:54.214422] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 79, in setup
[2016-03-30 17:14:54.214622] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>     logging_handler = handlers.WatchedFileHandler(lprm['filename'])
[2016-03-30 17:14:54.214802] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>   File "/usr/lib64/python2.6/logging/handlers.py", line 377, in __init__
[2016-03-30 17:14:54.214977] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>     logging.FileHandler.__init__(self, filename, mode, encoding, delay)
[2016-03-30 17:14:54.215152] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>   File "/usr/lib64/python2.6/logging/__init__.py", line 835, in __init__
[2016-03-30 17:14:54.215327] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>     StreamHandler.__init__(self, self._open())
[2016-03-30 17:14:54.215523] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>   File "/usr/lib64/python2.6/logging/__init__.py", line 854, in _open
[2016-03-30 17:14:54.215703] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh>     stream = open(self.baseFilename, self.mode)
[2016-03-30 17:14:54.215883] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> IOError: [Errno 13] Permission denied: '/var/log/glusterfs/geo-replication-slaves/mbr/ef9ccae5-0def-4a47-9a96-881a1896755c:gluster%3A%2F%2F127.0.0.1%3Axfsvol2dr.log'
[2016-03-30 17:14:54.216063] E [resource(/mnt/brick10/xfsvol2):226:logerr] Popen: ssh> failed with IOError.
[2016-03-30 17:14:54.216500] I [syncdutils(/mnt/brick10/xfsvol2):220:finalize] <top>: exiting.
[2016-03-30 17:14:54.218672] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2016-03-30 17:14:54.219063] I [syncdutils(agent):220:finalize] <top>: exiting.
[2016-03-30 17:14:54.218930] I [monitor(monitor):274:monitor] Monitor: worker(/mnt/brick10/xfsvol2) died before establishing connection

—Bishoy

On Mar 29, 2016, at 1:05 AM, Aravinda <avishwan@xxxxxxxxxx> wrote:

The geo-replication command should be run as the privileged user (root) itself:

gluster volume geo-replication <MASTERVOL> <SLAVEUSER>@<SLAVEHOST>::<SLAVEVOL> start

Then check the status; if it shows Faulty, please share the log files present in /var/log/glusterfs/geo-replication/<MASTERVOL>/*.log.
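The status can be checked with:

gluster volume geo-replication <MASTERVOL> <SLAVEUSER>@<SLAVEHOST>::<SLAVEVOL> status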

Regards,
Aravinda
On 03/29/2016 12:51 PM, Gmail wrote:
I’ve been trying to set up geo-replication using Gluster 3.7.3 on OEL 6.5, but it keeps giving me a faulty session. When I use the root user instead, it works fine!

I’ve followed the documentation to the letter, but had no luck getting the unprivileged user working.

I’ve tried running /usr/libexec/glusterfs/gsyncd on the slave node as the unprivileged user, and this is what I get:

/usr/libexec/glusterfs/gsyncd  --session-owner ef9ccae5-0def-4a47-9a96-881a1896755c -N --listen --timeout 120 gluster://localhost:vol01dr
[2016-03-29 00:52:49.058244] I [cli.c:720:main] 0-cli: Started running /usr/sbin/gluster with version 3.7.3
[2016-03-29 00:52:49.058297] I [cli.c:608:cli_rpc_init] 0-cli: Connecting to remote glusterd at localhost
[2016-03-29 00:52:49.174686] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-03-29 00:52:49.174768] I [socket.c:2409:socket_event_handler] 0-transport: disconnecting now
[2016-03-29 00:52:49.177482] I [cli-rpc-ops.c:6230:gf_cli_getwd_cbk] 0-cli: Received resp to getwd
[2016-03-29 00:52:49.177545] I [input.c:36:cli_batch] 0-: Exiting with: 0
ERROR:root:FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 165, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 638, in main_i
    startup(go_daemon=go_daemon, log_file=log_file, label=label)
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 144, in startup
    GLogger._gsyncd_loginit(**kw)
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 107, in _gsyncd_loginit
    cls.setup(label=kw.get('label'), **lkw)
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 79, in setup
    logging_handler = handlers.WatchedFileHandler(lprm['filename'])
  File "/usr/lib64/python2.6/logging/handlers.py", line 377, in __init__
    logging.FileHandler.__init__(self, filename, mode, encoding, delay)
  File "/usr/lib64/python2.6/logging/__init__.py", line 835, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib64/python2.6/logging/__init__.py", line 854, in _open
    stream = open(self.baseFilename, self.mode)
IOError: [Errno 13] Permission denied: '/var/log/glusterfs/geo-replication-slaves/mbr/ef9ccae5-0def-4a47-9a96-881a1896755c:gluster%3A%2F%2F127.0.0.1%3Avol01dr.log'
failed with IOError.
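The traceback dies opening a log file under /var/log/glusterfs/geo-replication-slaves/mbr/, so I suppose the next thing to check is the ownership and mode of that directory on the slave, e.g.:

# on the slave node
ls -ld /var/log/glusterfs/geo-replication-slaves /var/log/glusterfs/geo-replication-slaves/mbr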


— Bishoy



_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
