Re: Problem creating Geo-replication

Aravinda <avishwan@xxxxxxxxxx> · Mon, 15 Dec 2014 11:40:08 +0530

    On 12/14/2014 02:57 AM, wodel youchi wrote:

      Hi again,

        I had a hard time configuring geo-replication. After creating
        the session, starting it gave me a Faulty state, and I had to
        deal with that: /nonexistent/gsyncd in
        /var/lib/glusterd/geo-replication/data1_node3.example_data2/gsyncd.conf
        had to be changed to /usr/libexec/gluster/gsyncd.

    When you get /nonexistent/gsyncd, it means there is an issue with
    the pem key setup. For security reasons, when pushing the pem keys
    to the slave, we add command=<COMMAND TO RUN> before the key in
    /root/.ssh/authorized_keys.

    For example,

    command="/usr/local/libexec/glusterfs/gsyncd"  ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAqxqMiZ8dyXUQq0pLVOYpRSsC+aYFn6pbPQZ3LtRPKGYfA63SNoYnifhnM2UR9fnZz3hisBUxIzcVrVux2y3ojI/vPFFi08tVtK8/r9yoqqh3HqlBnotY50H1/1qeco+71U9hy276fUONP64KoOZtme3MwYuoNz4z1NvCQFcEbXtPfHO5A9P3C+NuMhgNK8N63RSCzZ6dtO+wZygbVJlbPNQxp8Y5E8rbIuzRy6bD/0nmEKc/nqvEYTYgkckES0Xy92JVxbcwCOnZFNi4rT6+HarDIuFRB835I5ss+QBrT9SM09qmFuQ== root@fedoravm1

    If the slave machine's /root/.ssh/authorized_keys file has any other
    entry with the same key, you will get the /nonexistent/gsyncd error.

    The advantage of having "command" in the authorized key is this:
    without it, if any master node is compromised, the attacker can log
    in to any slave node. With the command option, a logged-in user is
    limited to running gsyncd, or whichever command is specified in
    authorized_keys.
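
The duplicate-entry check described above can be sketched in Python. This is a minimal illustration, not part of GlusterFS: the function name find_conflicting_entries and the sample keys are hypothetical, and the regular expression assumes the usual "options keytype base64-key comment" layout of an authorized_keys line.

```python
import re
from collections import defaultdict

# Key type followed by the base64 key body of an authorized_keys entry.
KEY_RE = re.compile(
    r"\b(ssh-rsa|ssh-dss|ssh-ed25519|ecdsa-sha2-\S+)\s+([A-Za-z0-9+/=]+)"
)


def find_conflicting_entries(text):
    """Return {key_body: [(line_no, has_command), ...]} for keys listed more than once."""
    seen = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = KEY_RE.search(stripped)
        if match is None:
            continue  # not a recognisable key entry
        options = stripped[:match.start()]  # text before the key type
        seen[match.group(2)].append((line_no, "command=" in options))
    return {key: entries for key, entries in seen.items() if len(entries) > 1}


# Hypothetical file contents: the first key appears with and without command=.
sample = """\
ssh-rsa AAAAB3NzaC1yc2EXAMPLEKEY root@master1
command="/usr/local/libexec/glusterfs/gsyncd"  ssh-rsa AAAAB3NzaC1yc2EXAMPLEKEY root@master1
command="/usr/local/libexec/glusterfs/gsyncd"  ssh-rsa AAAAB3NzaC1yc2OTHERKEY root@master2
"""

conflicts = find_conflicting_entries(sample)
for key, entries in conflicts.items():
    print(key[:24], entries)
```

To use it on a real slave, pass in the contents of /root/.ssh/authorized_keys. Any key reported with both a command= entry and a plain entry is a candidate for the /nonexistent/gsyncd error.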

        After that, geo-replication started, but nothing happened: no
        file or directory was synced.

        The command

        gluster volume geo-replication data1
        geoaccount@xxxxxxxxxxxxxxxxx::data2 status

        keeps saying: Changelog Crawl

        [2014-12-13 21:57:17.129836] W
        [master(/mnt/srv1/brick1):1005:process] _GMaster: incomplete
        sync, retrying changelogs: CHANGELOG.1418504186

        [2014-12-13 21:57:22.648163] W
        [master(/mnt/srv1/brick1):294:regjob] _GMaster: Rsync:
        .gfid/f066ca4a-2d31-4342-bc7e-a37da25b2253 [errcode: 23]

        [2014-12-13 21:57:22.648426] W
        [master(/mnt/srv1/brick1):986:process] _GMaster: changelogs
        CHANGELOG.1418504186 could not be processed - moving on...

        But new files/directories are synced, so I deleted all the data
        on the master cluster and recreated it, and everything was synced.

        Still, the status command (above) keeps saying Changelog Crawl.

        On the slave node I have these logs:

        [2014-12-13 21:11:12.149041] W
        [client-rpc-fops.c:1210:client3_3_removexattr_cbk]
        0-data2-client-0: remote operation failed: No data available

        [2014-12-13 21:11:12.149067] W [fuse-bridge.c:1261:fuse_err_cbk]
        0-glusterfs-fuse: 3243: REMOVEXATTR()
        /.gfid/cd26bcc2-9b9f-455a-a2b2-9b6358f24203 => -1 (No data
        available)

        [2014-12-13 21:11:12.516674] W
        [client-rpc-fops.c:1210:client3_3_removexattr_cbk]
        0-data2-client-0: remote operation failed: No data available

        [2014-12-13 21:11:12.516705] W [fuse-bridge.c:1261:fuse_err_cbk]
        0-glusterfs-fuse: 3325: REMOVEXATTR()
        /.gfid/e6142b37-2362-4c95-a291-f396a122b014 => -1 (No data
        available)

        [2014-12-13 21:11:12.517577] W
        [client-rpc-fops.c:1210:client3_3_removexattr_cbk]
        0-data2-client-0: remote operation failed: No data available

        [2014-12-13 21:11:12.517600] W [fuse-bridge.c:1261:fuse_err_cbk]
        0-glusterfs-fuse: 3331: REMOVEXATTR()
        /.gfid/e6142b37-2362-4c95-a291-f396a122b014 => -1 (No data
        available)

        and

        [2014-12-13 21:57:16.741321] W
        [syncdutils(slave):480:errno_wrap] <top>: reached maximum
        retries (['.gfid/ba9c75ef-d4f7-4a6b-923f-82a8c7be4443',
        'glusterfs.gfid.newfile',
'\x00\x00\x00\x1b\x00\x00\x00\x1bcab3ae81-7b52-4c55-ac33-37814ff374c4\x00\x00\x00\x81\xb0glustercli1.lower-test\x00\x00\x00\x01\xb0\x00\x00\x00\x00\x00\x00\x00\x00'])...

        [2014-12-13 21:57:22.400269] W
        [syncdutils(slave):480:errno_wrap] <top>: reached maximum
        retries (['.gfid/ba9c75ef-d4f7-4a6b-923f-82a8c7be4443',
        'glusterfs.gfid.newfile',
'\x00\x00\x00\x1b\x00\x00\x00\x1bcab3ae81-7b52-4c55-ac33-37814ff374c4\x00\x00\x00\x81\xb0glustercli1.lower-test\x00\x00\x00\x01\xb0\x00\x00\x00\x00\x00\x00\x00\x00'])...

        I could not find out what this means.

    Did your slave have data before the geo-replication session was
    created? From the log, what I can see is that geo-rep is failing to
    create a file on the slave (maybe a file with the same name exists
    with a different GFID; the GFID is the GlusterFS unique identifier
    for a file).

    Rsync is failing because the file is not created on the slave, so
    it is unable to sync.
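
To see which files are affected, the GFIDs that rsync failed on can be pulled out of the master log. The Python sketch below is only an illustration: the function name failed_gfids is hypothetical, and the pattern assumes the log line format quoted earlier in this message. Rsync exit code 23 means a partial transfer due to an error.

```python
import re

# Matches e.g. "Rsync: .gfid/<uuid> [errcode: 23]", even if the line is wrapped.
RSYNC_FAIL_RE = re.compile(
    r"Rsync:\s+\.gfid/([0-9a-f-]{36})\s+\[errcode:\s*(\d+)\]"
)


def failed_gfids(log_text):
    """Map each GFID that rsync failed on to the set of error codes seen for it."""
    failures = {}
    for gfid, code in RSYNC_FAIL_RE.findall(log_text):
        failures.setdefault(gfid, set()).add(int(code))
    return failures


# Log excerpt copied from the master log quoted in this thread.
sample_log = """\
[2014-12-13 21:57:17.129836] W [master(/mnt/srv1/brick1):1005:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1418504186
[2014-12-13 21:57:22.648163] W [master(/mnt/srv1/brick1):294:regjob] _GMaster: Rsync: .gfid/f066ca4a-2d31-4342-bc7e-a37da25b2253 [errcode: 23]
[2014-12-13 21:57:22.648426] W [master(/mnt/srv1/brick1):986:process] _GMaster: changelogs CHANGELOG.1418504186 could not be processed - moving on...
"""

failures = failed_gfids(sample_log)
for gfid, codes in failures.items():
    print(gfid, sorted(codes))
```

For each file behind a reported GFID, the trusted.gfid extended attribute on the master brick and on the slave brick can then be compared, for example with getfattr -n trusted.gfid -e hex <path-on-brick>, to confirm or rule out a mismatch.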

    --

    regards

    Aravinda

      Any idea?

        Regards

                On Friday, 12 December 2014 at 19:35, wodel youchi
                  <wodel_doom@xxxxxxxx> wrote:

                      Thanks for your reply,

                      When executing the gverify.sh script, I got these
                      errors in slave.log:

                        [2014-12-12 18:12:45.423669] I
                        [options.c:1163:xlator_option_init_double]
                        0-fuse: option attribute-timeout convertion
                        failed value 1.0

                        [2014-12-12 18:12:45.423689] E
                        [xlator.c:425:xlator_init] 0-fuse:
                        Initialization of volume 'fuse' failed, review
                        your volfile again

                      I think the problem is linked to the locale
                      variables. Mine were:

                      LANG=fr_FR.UTF-8
                      LC_CTYPE="fr_FR.UTF-8"
                      LC_NUMERIC="fr_FR.UTF-8"
                      LC_TIME="fr_FR.UTF-8"
                      LC_COLLATE="fr_FR.UTF-8"
                      LC_MONETARY="fr_FR.UTF-8"
                      LC_MESSAGES="fr_FR.UTF-8"
                      LC_PAPER="fr_FR.UTF-8"
                      LC_NAME="fr_FR.UTF-8"
                      LC_ADDRESS="fr_FR.UTF-8"
                      LC_TELEPHONE="fr_FR.UTF-8"
                      LC_MEASUREMENT="fr_FR.UTF-8"
                      LC_IDENTIFICATION="fr_FR.UTF-8"
                      LC_ALL=

                      I changed LC_CTYPE and LC_NUMERIC to C and then
                      executed the gverify.sh script again, and it
                      worked, but the gluster vol geo-rep ... command
                      still failed.

                      I then edited the /etc/locale.conf file, changed
                      LANG from fr_FR.UTF-8 to C, rebooted the VM, and
                      voila, the geo-replication session was created
                      successfully.

                      But I am not sure whether my changes will affect
                      other things.
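
One plausible reading of the "attribute-timeout convertion failed value 1.0" message, not confirmed anywhere in this thread: C number parsing follows LC_NUMERIC, and the fr_FR locale expects a comma as the decimal separator, so a strict parser would stop at the "." in "1.0" and reject the value. The Python sketch below is a toy model of that behaviour, not GlusterFS code; the function name parse_double_strict is hypothetical.

```python
import re


def parse_double_strict(text, decimal_point):
    """Toy model of a strtod-style parse that rejects trailing characters.

    decimal_point is the separator the active LC_NUMERIC locale expects.
    Returns a float, or None when the text is not fully consumed.
    """
    pattern = r"[+-]?\d+(?:" + re.escape(decimal_point) + r"\d+)?"
    match = re.match(pattern, text)
    if match is None or match.end() != len(text):
        return None  # leftover characters: treated as a conversion failure
    return float(match.group(0).replace(decimal_point, "."))


print(parse_double_strict("1.0", "."))  # separator of the C locale
print(parse_double_strict("1.0", ","))  # separator of an fr_FR-style locale
print(parse_double_strict("1,0", ","))  # same locale, comma-formatted value
```

Under this model only the numeric category matters, which fits the observation that gverify.sh worked once LC_NUMERIC was set to C.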

                      Regards

                                On Friday, 12 December 2014 at 8:30,
                                  Kotresh Hiremath Ravishankar
                                  <khiremat@xxxxxxxxxx> wrote:

                              Hi,

                                The setup is failing during the
                                compatibility test between the master
                                and slave clusters.

                                The gverify.sh script is failing to get
                                the master volume details for that test.

                                Could you run the following and paste
                                the output here?

                                bash -x /usr/local/libexec/glusterfs/gverify.sh \
                                    <master-vol-name> root <slave-host-name> \
                                    <slave-vol-name> <temp-log-file>

                                For a source install, gverify.sh is
                                found at the path above; for an RPM
                                install, it is found at
                                /usr/libexec/glusterfs/gverify.sh

                                If you are sure the master and slave
                                gluster versions and sizes are fine, the
                                easy workaround is to use force.

                                gluster vol geo-replication <master-vol> \
                                    <slave-host>::<slave-vol> create push-pem force

                                Thanks and Regards,

                                Kotresh H R

                                  ----- Original Message -----

                                  From: "wodel youchi" <wodel_doom@xxxxxxxx>

                                  To: gluster-users@xxxxxxxxxxx

                                  Sent: Friday, December 12, 2014
                                  3:13:48 AM

                                  Subject: Problem creating Geo-replication

                                  Hi, 

                                  I am using CentOS 7 x64 with updates
                                  and GlusterFS 3.6 from the
                                  http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/epel-7Server/
                                  repository.

                                  No firewall and no SELinux.

                                  I have two nodes with a distributed
                                  replicated volume, data1, and a third
                                  node with a distributed volume, data2.
                                  The two volumes have the same size.

                                  I am having trouble configuring
                                  geo-replication to the third node.
                                  I have been following the Red Hat
                                  Storage 3 Admin Guide, but it does not
                                  work.

                                  I created passwordless SSH connections
                                  between the nodes and ran these
                                  commands:

                                  On the master: 

                                  # gluster system:: execute gsec_create

                                  Common secret pub file present at
                                  /var/lib/glusterd/geo-replication/common_secret.pem.pub

                                  # gluster volume geo-replication data1 \
                                      node3.example.com::data2 create push-pem

                                  Unable to fetch master volume details.
                                  Please check the master cluster and
                                  master volume. 

                                  geo-replication command failed 

                                  In
                                  /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
                                  I get these error messages:

                                  [2014-12-11 21:34:47.152644] E
                                  [glusterd-geo-rep.c:2012:glusterd_verify_slave]
                                  0-: Not a valid slave 

                                  [2014-12-11 21:34:47.152750] E
                                  [glusterd-geo-rep.c:2240:glusterd_op_stage_gsync_create]
                                  0-: node3.example.com::data2 is not a
                                  valid slave volume. Error: Unable to
                                  fetch master volume details. Please
                                  check the master cluster and master
                                  volume. 

                                  [2014-12-11 21:34:47.152764] E
                                  [glusterd-syncop.c:1151:gd_stage_op_phase]
                                  0-management: Staging of operation
                                  'Volume Geo-replication Create' failed
                                  on localhost : Unable to fetch master
                                  volume details. Please check the
                                  master cluster and master volume. 

                                  [2014-12-11 21:35:25.559144] E
                                  [glusterd-handshake.c:914:gd_validate_mgmt_hndsk_req]
                                  0-management: Rejecting management
                                  handshake request from unknown peer
                                  192.168.1.9:1005 

                                  192.168.1.9 is the IP address of the
                                  third node.

                                  Any idea?

                                  Thanks

_______________________________________________

                                  Gluster-users mailing list

                                  Gluster-users@xxxxxxxxxxx

                                  http://supercolony.gluster.org/mailman/listinfo/gluster-users
