Re: [ovirt-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain alway complain about "unsynced" elements

Sahina Bose <sabose@xxxxxxxxxx> · Tue, 25 Jul 2017 11:57:21 +0530

On Tue, Jul 25, 2017 at 11:12 AM, Kasturi Narra <knarra@xxxxxxxxxx> wrote:
These warnings appear because the gluster network (glusternw) is not assigned to the correct interface. Once you attach it, the warnings should go away. They have nothing to do with the problem you are seeing. Sahina, any idea why the engine is not showing the correct volume info?

Please provide the vdsm.log (containing the gluster volume info) and engine.log.

On Mon, Jul 24, 2017 at 7:30 PM, yayo (j) <jaganz@xxxxxxxxx> wrote:
Hi,
I refreshed the UI but the problem remains ...

There is no specific error. I only see the messages below, and I've read that this kind of warning is not a problem:

2017-07-24 15:53:59,823+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] START, GlusterServersListVDSCommand(HostName = node01.localdomain.local, VdsIdVDSCommandParametersBase:{runAsync='true', hostId='4c89baa5-e8f7-4132-a4b3-af332247570c'}), log id: 29a62417
2017-07-24 15:54:01,066+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] FINISH, GlusterServersListVDSCommand, return: [10.10.20.80/24:CONNECTED, node02.localdomain.local:CONNECTED, gdnode04:CONNECTED], log id: 29a62417
2017-07-24 15:54:01,076+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] START, GlusterVolumesListVDSCommand(HostName = node01.localdomain.local, GlusterVolumesListVDSParameters:{runAsync='true', hostId='4c89baa5-e8f7-4132-a4b3-af332247570c'}), log id: 7fce25d3
2017-07-24 15:54:02,209+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode01:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
2017-07-24 15:54:02,212+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode02:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
2017-07-24 15:54:02,215+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode04:/gluster/engine/brick' of volume 'd19c19e3-910d-437b-8ba7-4f2a23d17515' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
2017-07-24 15:54:02,218+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode01:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
2017-07-24 15:54:02,221+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode02:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
2017-07-24 15:54:02,224+02 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturn] (DefaultQuartzScheduler2) [b7590c4] Could not associate brick 'gdnode04:/gluster/data/brick' of volume 'c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d' with correct network as no gluster network found in cluster '00000002-0002-0002-0002-00000000017a'
2017-07-24 15:54:02,224+02 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler2) [b7590c4] FINISH, GlusterVolumesListVDSCommand, return: {d19c19e3-910d-437b-8ba7-4f2a23d17515=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@fdc91062, c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@999a6f23}, log id: 7fce25d3
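
To see at a glance which bricks the engine could not map to a gluster network, the WARN lines above can be grouped per volume. The following is only a minimal Python sketch: the regular expression is written against the message text quoted in this mail, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Matches the WARN message quoted above. Only the brick, the volume ID
# and the cluster ID are captured; the rest of the line is ignored.
BRICK_WARNING = re.compile(
    r"Could not associate brick '(?P<brick>[^']+)' of volume '(?P<volume>[^']+)' "
    r"with correct network as no gluster network found in cluster '(?P<cluster>[^']+)'"
)


def bricks_without_gluster_network(lines):
    """Return {volume ID: sorted list of bricks} for all matching warnings.

    Hypothetical helper name; `lines` is any iterable of engine.log lines.
    """
    by_volume = defaultdict(set)
    for line in lines:
        match = BRICK_WARNING.search(line)
        if match:
            by_volume[match.group("volume")].add(match.group("brick"))
    return {volume: sorted(bricks) for volume, bricks in by_volume.items()}
```

It could be fed an open engine.log file object directly, since a file object iterates line by line.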

Thank you

2017-07-24 8:12 GMT+02:00 Kasturi Narra <knarra@xxxxxxxxxx>:
Hi,
   Regarding the UI showing incorrect information about the engine and data volumes, can you please refresh the UI and see if the issue persists, and check for any errors in the engine.log files?

Thanks
kasturi

On Sat, Jul 22, 2017 at 11:43 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:

    On 07/21/2017 11:41 PM, yayo (j) wrote:

      Hi,

        Sorry to follow up again, but checking the oVirt
          interface I found that oVirt reports the "engine" volume as
          an "arbiter" configuration and the "data" volume as a fully
          replicated volume. Check these screenshots:

    This is probably some refresh bug in the UI, Sahina might be able to
    tell you.

        https://drive.google.com/drive/folders/0ByUV7xQtP1gCTE8tUTFfVmR5aDQ?usp=sharing

        But the "gluster volume info" command reports that both
          volumes are fully replicated:

            Volume Name: data
            Type: Replicate
            Volume ID: c7a5dfc9-3e72-4ea1-843e-c8275d4a7c2d
            Status: Started
            Snapshot Count: 0
            Number of Bricks: 1 x 3 = 3
            Transport-type: tcp
            Bricks:
            Brick1: gdnode01:/gluster/data/brick
            Brick2: gdnode02:/gluster/data/brick
            Brick3: gdnode04:/gluster/data/brick
            Options Reconfigured:
            nfs.disable: on
            performance.readdir-ahead: on
            transport.address-family: inet
            storage.owner-uid: 36
            performance.quick-read: off
            performance.read-ahead: off
            performance.io-cache: off
            performance.stat-prefetch: off
            performance.low-prio-threads: 32
            network.remote-dio: enable
            cluster.eager-lock: enable
            cluster.quorum-type: auto
            cluster.server-quorum-type: server
            cluster.data-self-heal-algorithm: full
            cluster.locking-scheme: granular
            cluster.shd-max-threads: 8
            cluster.shd-wait-qlength: 10000
            features.shard: on
            user.cifs: off
            storage.owner-gid: 36
            features.shard-block-size: 512MB
            network.ping-timeout: 30
            performance.strict-o-direct: on
            cluster.granular-entry-heal: on
            auth.allow: *
            server.allow-insecure: on

            Volume Name: engine
            Type: Replicate
            Volume ID: d19c19e3-910d-437b-8ba7-4f2a23d17515
            Status: Started
            Snapshot Count: 0
            Number of Bricks: 1 x 3 = 3
            Transport-type: tcp
            Bricks:
            Brick1: gdnode01:/gluster/engine/brick
            Brick2: gdnode02:/gluster/engine/brick
            Brick3: gdnode04:/gluster/engine/brick
            Options Reconfigured:
            nfs.disable: on
            performance.readdir-ahead: on
            transport.address-family: inet
            storage.owner-uid: 36
            performance.quick-read: off
            performance.read-ahead: off
            performance.io-cache: off
            performance.stat-prefetch: off
            performance.low-prio-threads: 32
            network.remote-dio: off
            cluster.eager-lock: enable
            cluster.quorum-type: auto
            cluster.server-quorum-type: server
            cluster.data-self-heal-algorithm: full
            cluster.locking-scheme: granular
            cluster.shd-max-threads: 8
            cluster.shd-wait-qlength: 10000
            features.shard: on
            user.cifs: off
            storage.owner-gid: 36
            features.shard-block-size: 512MB
            network.ping-timeout: 30
            performance.strict-o-direct: on
            cluster.granular-entry-heal: on
            auth.allow: *
            server.allow-insecure: on
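
The comparison between what the UI shows and what "gluster volume info" reports can also be scripted. Below is a rough Python sketch, not an official tool. It assumes the raw command output with one "Key: value" pair per line (not the mail-wrapped copy), and it assumes that an arbiter volume shows a brick count of the form "1 x (2 + 1) = 3" or marks a brick with "(arbiter)". That matches my understanding of gluster's output but should be checked against your version. Function names are made up for illustration.

```python
import re


def parse_volume_info(text):
    """Parse `gluster volume info` text into a list of dicts, one per volume."""
    volumes = []
    for raw in text.splitlines():
        # Split on the first colon only, so brick paths such as
        # "gdnode01:/gluster/data/brick" stay intact in the value.
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Volume Name":
            volumes.append({"name": value, "bricks": []})
        elif not volumes:
            continue  # ignore anything before the first volume
        elif key == "Type":
            volumes[-1]["type"] = value
        elif key == "Number of Bricks":
            volumes[-1]["brick_count"] = value
        elif re.fullmatch(r"Brick\d+", key):
            volumes[-1]["bricks"].append(value)
    return volumes


def looks_like_arbiter(volume):
    """Heuristic (assumption, see above): '(N + M)' brick count or '(arbiter)' brick."""
    count = volume.get("brick_count", "")
    has_split_count = re.search(r"\(\s*\d+\s*\+\s*\d+\s*\)", count) is not None
    has_marked_brick = any(b.endswith("(arbiter)") for b in volume["bricks"])
    return has_split_count or has_marked_brick
```

With the output above, both "data" and "engine" report "1 x 3 = 3", so the heuristic would classify neither of them as an arbiter volume.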

        2017-07-21 19:13 GMT+02:00 yayo (j) <jaganz@xxxxxxxxx>:

                2017-07-20 14:48
                    GMT+02:00 Ravishankar N <ravishankar@xxxxxxxxxx>:

                         But it does say something. All the gfids
                        of completed heals in the log below are the
                        ones that you have given the getfattr output
                        of. So what is likely happening is that there
                        is an intermittent connection problem between
                        your mount and the brick process, leading to
                        pending heals again after the heal gets
                        completed, which is why the numbers vary each
                        time. You would need to check why that is the
                        case.

                        Hope this helps,

                        Ravi

                                        [2017-07-20 09:58:46.573079] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed data selfheal on e6dfd556-340b-4b76-b47b-7b6f5bd74327. sources=[0] 1  sinks=2

                                        [2017-07-20 09:59:22.995003] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-engine-replicate-0: performing metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81

                                        [2017-07-20 09:59:22.999372] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-engine-replicate-0: Completed metadata selfheal on f05b9742-2771-484a-85fc-5b6974bcef81. sources=[0] 1  sinks=2
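
The pattern described above (the same gfids being healed again after a completed heal) could be checked by counting "Completed ... selfheal" entries per gfid in the self-heal log. This is only a sketch in Python: the regular expression follows the message format quoted in this mail, it expects each log entry on a single line as in the real log file, and the function names and the threshold are illustrative assumptions.

```python
import re
from collections import Counter

# Matches e.g. "Completed data selfheal on <gfid>" as quoted above.
COMPLETED_HEAL = re.compile(
    r"Completed (?P<kind>\w+) selfheal on "
    r"(?P<gfid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def count_completed_heals(lines):
    """Count completed heals per (gfid, kind) pair."""
    counts = Counter()
    for line in lines:
        match = COMPLETED_HEAL.search(line)
        if match:
            counts[(match.group("gfid"), match.group("kind"))] += 1
    return counts


def repeatedly_healed(lines, threshold=2):
    """Return gfids healed at least `threshold` times (any heal kind)."""
    per_gfid = Counter()
    for (gfid, _kind), n in count_completed_heals(lines).items():
        per_gfid[gfid] += n
    return sorted(gfid for gfid, n in per_gfid.items() if n >= threshold)
```

A gfid that keeps showing up in `repeatedly_healed` over a long log would be consistent with heals becoming pending again, though it does not say why.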

                  Hi,

                  following your suggestion, I've checked the
                    "peer" status and found that there are too many
                    names for the hosts. I don't know if this can be
                    the problem or part of it:

                      gluster peer status on NODE01:

                      Number of Peers: 2

                      Hostname: dnode02.localdomain.local
                      Uuid: 7c0ebfa3-5676-4d3f-9bfa-7fff6afea0dd
                      State: Peer in Cluster (Connected)
                      Other names:
                      192.168.10.52
                      dnode02.localdomain.local
                      10.10.20.90
                      10.10.10.20

                      gluster peer status on NODE02:

                      Number of Peers: 2

                      Hostname: dnode01.localdomain.local
                      Uuid: a568bd60-b3e4-4432-a9bc-996c52eaaa12
                      State: Peer in Cluster (Connected)
                      Other names:
                      gdnode01
                      10.10.10.10

                      Hostname: gdnode04
                      Uuid: ce6e0f6b-12cf-4e40-8f01-d1609dfc5828
                      State: Peer in Cluster (Connected)
                      Other names:
                      192.168.10.54
                      10.10.10.40

                      gluster peer status on NODE04:

                      Number of Peers: 2

                      Hostname: dnode02.neridom.dom
                      Uuid: 7c0ebfa3-5676-4d3f-9bfa-7fff6afea0dd
                      State: Peer in Cluster (Connected)
                      Other names:
                      10.10.20.90
                      gdnode02
                      192.168.10.52
                      10.10.10.20

                      Hostname: dnode01.localdomain.local
                      Uuid: a568bd60-b3e4-4432-a9bc-996c52eaaa12
                      State: Peer in Cluster (Connected)
                      Other names:
                      gdnode01
                      10.10.10.10
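
Comparing the three outputs by eye is error-prone, so here is a small Python sketch that collects every name known for each peer UUID and reports the UUIDs that two nodes see under different sets of names. It assumes the plain "gluster peer status" layout shown above; the function names are made up, and names containing a colon (for example IPv6 addresses) would not be picked up by this simple parser.

```python
def parse_peer_status(text):
    """Map each peer UUID to the set of names listed for it."""
    peers = {}
    hostname = None
    uuid = None
    in_other_names = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Hostname:"):
            hostname = line.split(":", 1)[1].strip()
            uuid = None
            in_other_names = False
        elif line.startswith("Uuid:"):
            uuid = line.split(":", 1)[1].strip()
            peers[uuid] = {hostname} if hostname else set()
            in_other_names = False
        elif line.startswith("Other names:"):
            in_other_names = True
        elif in_other_names and uuid is not None and ":" not in line:
            peers[uuid].add(line)  # one alias (name or IPv4 address) per line
        else:
            in_other_names = False  # e.g. "State: ..." or "Number of Peers: ..."
    return peers


def differing_views(view_a, view_b):
    """Return UUIDs that both nodes know but under different name sets."""
    shared = view_a.keys() & view_b.keys()
    return sorted(u for u in shared if view_a[u] != view_b[u])
```

Each node's output would be parsed separately and the resulting dictionaries compared pairwise.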

                    All these IPs are pingable and the hosts are
                      resolvable across all 3 nodes, but only the
                      10.10.10.0 network is the dedicated network for
                      gluster (resolved using the gdnode* host names) ...
                      Do you think that removing the other entries can
                      fix the problem? If so, sorry, but how can I
                      remove them?

    I don't think having extra entries could be a problem. Did you check
    the fuse mount logs for disconnect messages that I referred to in
    the other email?

                  And what about SELinux?

    Not sure about this. See if there are disconnect messages in the
    mount logs first.
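
    A quick way to scan a mount log for such messages is a keyword
    search. The Python sketch below is deliberately generic: it assumes
    that the relevant lines contain the word "disconnect" and start with
    a "[YYYY-MM-DD HH:MM:SS.ffffff]" timestamp like the self-heal log
    lines quoted in this thread. Both the keyword and the function name
    are assumptions, not an exact description of gluster's messages.

```python
import re

# Leading "[date time.micros]" timestamp, as in the log lines in this thread.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.\d+)?\]")


def find_disconnects(lines, keyword="disconnect"):
    """Return (timestamp or None, line) for every line containing `keyword`.

    The match is case-insensitive; the default keyword is an assumption.
    """
    hits = []
    needle = keyword.lower()
    for raw in lines:
        line = raw.strip()
        if needle in line.lower():
            match = TIMESTAMP.match(line)
            hits.append((match.group(1) if match else None, line))
    return hits
```

    Lining up the returned timestamps with the times of the completed
    heals would show whether the two are related.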

    -Ravi

                  Thank you

        -- 
        Linux User: 369739 http://counter.li.org

_______________________________________________
Users mailing list
Users@xxxxxxxxx
http://lists.ovirt.org/mailman/listinfo/users

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
