Re: [ovirt-users] 4.0 - 2nd node fails on deploy

"Jason Jeffrey" <jason@xxxxxxxxxx> · Wed, 5 Oct 2016 09:26:12 +0100

Hi,

Logs attached

Thanks 

From: Sahina Bose [mailto:sabose@xxxxxxxxxx] 
Sent: 05 October 2016 08:11
To: Jason Jeffrey <jason@xxxxxxxxxx>; gluster-users@xxxxxxxxxxx; Ravishankar Narayanankutty <ravishankar@xxxxxxxxxx>
Cc: Simone Tiraboschi <stirabos@xxxxxxxxxx>; users <users@xxxxxxxxx>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy

[Adding gluster-users ML]
The brick logs are filled with errors:
[2016-10-05 19:30:28.659061] E [MSGID: 113077] [posix-handle.c:309:posix_handle_pump] 0-engine-posix: malformed internal link /var/run/vdsm/storage/0a021563-91b5-4f49-9c6b-fff45e85a025/d84f0551-0f2b-457c-808c-6369c6708d43/1b5a5e34-818c-4914-8192-2f05733b5583 for /xpool/engine/brick/.glusterfs/b9/8e/b98ed8d2-3bf9-4b11-92fd-ca5324e131a8 
[2016-10-05 19:30:28.659069] E [MSGID: 113091] [posix.c:180:posix_lookup] 0-engine-posix: Failed to create inode handle for path <gfid:b98ed8d2-3bf9-4b11-92fd-ca5324e131a8> 
The message "E [MSGID: 113018] [posix.c:198:posix_lookup] 0-engine-posix: lstat on null failed" repeated 3 times between [2016-10-05 19:30:28.656529] and [2016-10-05 19:30:28.659076] 
[2016-10-05 19:30:28.659087] W [MSGID: 115005] [server-resolve.c:126:resolve_gfid_cbk] 0-engine-server: b98ed8d2-3bf9-4b11-92fd-ca5324e131a8: failed to resolve (Success) 
- Ravi, the above are from the data brick of the arbiter volume. Can you take a look?

Jason,
Could you also provide the mount logs from the first host (/var/log/glusterfs/rhev-data-center-mnt-glusterSD*engine.log) and the glusterd log (/var/log/glusterfs/etc-glusterfs-glusterd.vol.log) from around the same time frame?

On Wed, Oct 5, 2016 at 3:28 AM, Jason Jeffrey <jason@xxxxxxxxxx> wrote:
Hi,

The servers are powered off when I’m not looking at the problem.

There may have been instances where not all three were powered on during the same period.

Glusterhd log attached. The xpool-engine-brick log is over 1 GB in size, so I’ve taken a sample of the last couple of days; it looks to be highly repetitive.

Cheers

Jason

From: Simone Tiraboschi [mailto:stirabos@xxxxxxxxxx] 
Sent: 04 October 2016 16:50

To: Jason Jeffrey <jason@xxxxxxxxxx>
Cc: users <users@xxxxxxxxx>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy

On Tue, Oct 4, 2016 at 5:22 PM, Jason Jeffrey <jason@xxxxxxxxxx> wrote:
Hi,

DCASTORXX is a hosts entry for the dedicated direct 10GB links (each a private /28) between the three servers (i.e. 1 => 2 & 3, 2 => 1 & 3, etc.), planned to be used solely for storage.

I.e.

10.100.50.81    dcasrv01
10.100.101.1    dcastor01
10.100.50.82    dcasrv02
10.100.101.2    dcastor02
10.100.50.83    dcasrv03
10.100.103.3    dcastor03  

These were set up with the gluster commands:

· gluster volume create iso replica 3 arbiter 1 dcastor01:/xpool/iso/brick dcastor02:/xpool/iso/brick dcastor03:/xpool/iso/brick
· gluster volume create export replica 3 arbiter 1 dcastor02:/xpool/export/brick dcastor03:/xpool/export/brick dcastor01:/xpool/export/brick
· gluster volume create engine replica 3 arbiter 1 dcastor01:/xpool/engine/brick dcastor02:/xpool/engine/brick dcastor03:/xpool/engine/brick
· gluster volume create data replica 3 arbiter 1 dcastor01:/xpool/data/brick dcastor03:/xpool/data/brick dcastor02:/xpool/data/bricky

So yes, DCASRV01 is the (primary) server and has local bricks accessed through the DCASTOR01 interface.

Is the issue here not the incorrect soft link?

No, this should be fine.

The issue is that your gluster volume periodically loses its server quorum and becomes unavailable.
According to your logs, it happened more than once.

Can you please also attach the gluster logs for that volume?
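
For anyone wanting to tally how often server quorum was lost and regained, a minimal Python sketch follows. It assumes the glusterd messages have the form quoted further down this thread ("Server quorum lost for volume data. Stopping local bricks."); the names QUORUM_RE and quorum_events are made up for illustration and are not part of gluster or oVirt.

```python
import re

# Matches glusterd messages such as:
#   [2016-10-04 17:24:39.522620] C [MSGID: 106002] ... Server quorum lost for volume data. Stopping local bricks.
# The pattern is an assumption based only on the lines quoted in this thread.
QUORUM_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\]"
    r".*Server quorum (?P<event>lost|regained) for volume (?P<volume>[\w-]+)\."
)


def quorum_events(lines):
    """Return a list of (timestamp, volume, event) tuples, in input order."""
    events = []
    for line in lines:
        match = QUORUM_RE.search(line)
        if match:
            events.append(
                (match.group("ts"), match.group("volume"), match.group("event"))
            )
    return events
```

It could be fed the lines of /var/log/messages or of the glusterd log, for example quorum_events(open(path)), to see whether the "lost" events line up with the times the hosts were powered off.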

lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.metadata -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93    
[root@dcasrv01 /]# ls -al /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/
ls: cannot access /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/: No such file or directory   
But the data does exist 
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# ls -al
drwxr-xr-x. 2 vdsm kvm    4096 Oct  3 17:17 .
drwxr-xr-x. 6 vdsm kvm    4096 Oct  3 17:17 ..
-rw-rw----. 2 vdsm kvm 1028096 Oct  3 20:48 cee9440c-4eb8-453b-bc04-c47e6f9cbc93
-rw-rw----. 2 vdsm kvm 1048576 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.lease
-rw-r--r--. 2 vdsm kvm     283 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.meta   

Thanks 

Jason 

From: Simone Tiraboschi [mailto:stirabos@xxxxxxxxxx] 
Sent: 04 October 2016 14:40

To: Jason Jeffrey <jason@xxxxxxxxxx>
Cc: users <users@xxxxxxxxx>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy

On Tue, Oct 4, 2016 at 10:51 AM, Simone Tiraboschi <stirabos@xxxxxxxxxx> wrote:

On Mon, Oct 3, 2016 at 11:56 PM, Jason Jeffrey <jason@xxxxxxxxxx> wrote:
Hi,

Another problem has appeared: after rebooting the primary, the VM will not start.

It appears the symlink between the gluster mount reference and vdsm is broken.

The first host was correctly deployed, but it seems that you are facing some issue connecting to the storage.
Can you please attach vdsm logs and /var/log/messages from the first host?

Thanks Jason,
I suspect that your issue is related to this:
Oct  4 18:24:39 dcasrv01 etc-glusterfs-glusterd.vol[2252]: [2016-10-04 17:24:39.522620] C [MSGID: 106002] [glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action] 0-management: Server quorum lost for volume data. Stopping local bricks.
Oct  4 18:24:39 dcasrv01 etc-glusterfs-glusterd.vol[2252]: [2016-10-04 17:24:39.523272] C [MSGID: 106002] [glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action] 0-management: Server quorum lost for volume engine. Stopping local bricks.

and for some time your gluster volume has been working.

But then:
Oct  4 19:02:09 dcasrv01 systemd: Started /usr/bin/mount -t glusterfs -o backup-volfile-servers=dcastor02:dcastor03 dcastor01:engine /rhev/data-center/mnt/glusterSD/dcastor01:engine.
Oct  4 19:02:09 dcasrv01 systemd: Starting /usr/bin/mount -t glusterfs -o backup-volfile-servers=dcastor02:dcastor03 dcastor01:engine /rhev/data-center/mnt/glusterSD/dcastor01:engine.
Oct  4 19:02:11 dcasrv01 ovirt-ha-agent: /usr/lib/python2.7/site-packages/yajsonrpc/stomp.py:352: DeprecationWarning: Dispatcher.pending is deprecated. Use Dispatcher.socket.pending instead.
Oct  4 19:02:11 dcasrv01 ovirt-ha-agent: pending = getattr(dispatcher, 'pending', lambda: 0)
Oct  4 19:02:11 dcasrv01 ovirt-ha-agent: /usr/lib/python2.7/site-packages/yajsonrpc/stomp.py:352: DeprecationWarning: Dispatcher.pending is deprecated. Use Dispatcher.socket.pending instead.
Oct  4 19:02:11 dcasrv01 ovirt-ha-agent: pending = getattr(dispatcher, 'pending', lambda: 0)
Oct  4 19:02:11 dcasrv01 journal: vdsm vds.dispatcher ERROR SSL error during reading data: unexpected eof
Oct  4 19:02:11 dcasrv01 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Error: 'Connection to storage server failed' - trying to restart agent
Oct  4 19:02:11 dcasrv01 ovirt-ha-agent: ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Error: 'Connection to storage server failed' - trying to restart agent
Oct  4 19:02:12 dcasrv01 etc-glusterfs-glusterd.vol[2252]: [2016-10-04 18:02:12.384611] C [MSGID: 106003] [glusterd-server-quorum.c:346:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume data. Starting local bricks.
Oct  4 19:02:12 dcasrv01 etc-glusterfs-glusterd.vol[2252]: [2016-10-04 18:02:12.388981] C [MSGID: 106003] [glusterd-server-quorum.c:346:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume engine. Starting local bricks.

And at that point VDSM started complaining that the hosted-engine storage domain doesn't exist anymore:
Oct  4 19:02:30 dcasrv01 journal: ovirt-ha-agent ovirt_hosted_engine_ha.lib.image.Image ERROR Error fetching volumes list: Storage domain does not exist: (u'bbb70623-194a-46d2-a164-76a4876ecaaf',)
Oct  4 19:02:30 dcasrv01 ovirt-ha-agent: ERROR:ovirt_hosted_engine_ha.lib.image.Image:Error fetching volumes list: Storage domain does not exist: (u'bbb70623-194a-46d2-a164-76a4876ecaaf',)

I see from the logs that the ovirt-ha-agent is trying to mount the hosted-engine storage domain as:
/usr/bin/mount -t glusterfs -o backup-volfile-servers=dcastor02:dcastor03 dcastor01:engine /rhev/data-center/mnt/glusterSD/dcastor01:engine.

This points to dcastor01, dcastor02 and dcastor03, while your server is dcasrv01.
But at the same time it seems that dcasrv01 also has local bricks for the same engine volume.

So, is dcasrv01 just an alias for dcastor01? If not, you probably have some issue with the configuration of your gluster volume.

From broker.log

Thread-169::ERROR::2016-10-04 22:44:16,189::storage_broker::138::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(get_raw_stats_for_service_type) Failed to read metadata from /rhev/data-center/mnt/glusterSD/dcastor01:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/hosted-engine.metadata

[root@dcasrv01 ovirt-hosted-engine-ha]# ls -al /rhev/data-center/mnt/glusterSD/dcastor01\:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/
total 9
drwxrwx---. 2 vdsm kvm 4096 Oct  3 17:27 .
drwxr-xr-x. 5 vdsm kvm 4096 Oct  3 17:17 ..
lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.lockspace -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/23d81b73-bcb7-4742-abde-128522f43d78/11d6a3e1-1817-429d-b2e0-9051a3cf41a4
lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.metadata -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93    

[root@dcasrv01 /]# ls -al /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/
ls: cannot access /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/: No such file or directory   

Though the file appears to be there.

Gluster is set up as xpool/engine.

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# pwd
/xpool/engine/brick/bbb70623-194a-46d2-a164-76a4876ecaaf/images/fd44dbf9-473a-496a-9996-c8abe3278390
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# ls -al
total 2060
drwxr-xr-x. 2 vdsm kvm    4096 Oct  3 17:17 .
drwxr-xr-x. 6 vdsm kvm    4096 Oct  3 17:17 ..
-rw-rw----. 2 vdsm kvm 1028096 Oct  3 20:48 cee9440c-4eb8-453b-bc04-c47e6f9cbc93
-rw-rw----. 2 vdsm kvm 1048576 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.lease
-rw-r--r--. 2 vdsm kvm     283 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.meta   
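
The same check (the link exists, but what it points to does not) can be scripted. This is only a sketch using the Python standard library; the function name link_status is made up for illustration and nothing here is oVirt or VDSM code.

```python
import os


def link_status(path):
    """Return (state, target): state is 'not-a-link', 'ok' or 'dangling'."""
    if not os.path.islink(path):
        return ("not-a-link", None)
    target = os.readlink(path)
    # os.path.exists() follows symlinks, so it is False when the target is missing.
    state = "ok" if os.path.exists(path) else "dangling"
    return (state, target)
```

Run against the two entries under ha_agent/ (hosted-engine.lockspace and hosted-engine.metadata), it would be expected to report "dangling" for as long as the /var/run/vdsm/storage/<domain UUID>/ directory is missing, as in the listing above.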

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume info

Volume Name: data
Type: Replicate
Volume ID: 54fbcafc-fed9-4bce-92ec-fa36cdcacbd4
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/data/brick
Brick2: dcastor03:/xpool/data/brick
Brick3: dcastor02:/xpool/data/bricky (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36

Volume Name: engine
Type: Replicate
Volume ID: dd4c692d-03aa-4fc6-9011-a8dad48dad96
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/engine/brick
Brick2: dcastor02:/xpool/engine/brick
Brick3: dcastor03:/xpool/engine/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36

Volume Name: export
Type: Replicate
Volume ID: 23f14730-d264-4cc2-af60-196b943ecaf3
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor02:/xpool/export/brick
Brick2: dcastor03:/xpool/export/brick
Brick3: dcastor01:/xpool/export/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-uid: 36
storage.owner-gid: 36

Volume Name: iso
Type: Replicate
Volume ID: b2d3d7e2-9919-400b-8368-a0443d48e82a
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/iso/brick
Brick2: dcastor02:/xpool/iso/brick
Brick3: dcastor03:/xpool/iso/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-uid: 36
storage.owner-gid: 36                                   

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume status
Status of volume: data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/data/brick           49153     0          Y       3076
Brick dcastor03:/xpool/data/brick           49153     0          Y       3019
Brick dcastor02:/xpool/data/bricky          49153     0          Y       3857
NFS Server on localhost                     2049      0          Y       3097
Self-heal Daemon on localhost               N/A       N/A        Y       3088
NFS Server on dcastor03                     2049      0          Y       3039
Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
NFS Server on dcasrv02                      2049      0          Y       3871
Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/engine/brick         49152     0          Y       3131
Brick dcastor02:/xpool/engine/brick         49152     0          Y       3852
Brick dcastor03:/xpool/engine/brick         49152     0          Y       2992
NFS Server on localhost                     2049      0          Y       3097
Self-heal Daemon on localhost               N/A       N/A        Y       3088
NFS Server on dcastor03                     2049      0          Y       3039
Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
NFS Server on dcasrv02                      2049      0          Y       3871
Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: export
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor02:/xpool/export/brick         49155     0          Y       3872
Brick dcastor03:/xpool/export/brick         49155     0          Y       3147
Brick dcastor01:/xpool/export/brick         49155     0          Y       3150
NFS Server on localhost                     2049      0          Y       3097
Self-heal Daemon on localhost               N/A       N/A        Y       3088
NFS Server on dcastor03                     2049      0          Y       3039
Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
NFS Server on dcasrv02                      2049      0          Y       3871
Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

Task Status of Volume export
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: iso
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/iso/brick            49154     0          Y       3152
Brick dcastor02:/xpool/iso/brick            49154     0          Y       3881
Brick dcastor03:/xpool/iso/brick            49154     0          Y       3146
NFS Server on localhost                     2049      0          Y       3097
Self-heal Daemon on localhost               N/A       N/A        Y       3088
NFS Server on dcastor03                     2049      0          Y       3039
Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
NFS Server on dcasrv02                      2049      0          Y       3871
Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

Task Status of Volume iso
------------------------------------------------------------------------------
There are no active volume tasks

Thanks

Jason

From: users-bounces@xxxxxxxxx [mailto:users-bounces@xxxxxxxxx] On Behalf Of Jason Jeffrey
Sent: 03 October 2016 18:40
To: users@xxxxxxxxx

Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy

Hi,

Setup log attached for primary

Regards

Jason 

From: Simone Tiraboschi [mailto:stirabos@xxxxxxxxxx] 
Sent: 03 October 2016 09:27
To: Jason Jeffrey <jason@xxxxxxxxxx>
Cc: users <users@xxxxxxxxx>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy

On Mon, Oct 3, 2016 at 12:45 AM, Jason Jeffrey <jason@xxxxxxxxxx> wrote:
Hi,

I am trying to build a three-node HC cluster with a self-hosted engine using gluster.

I have successfully built the 1st node; however, when I attempt to run hosted-engine --deploy on node 2, I get the following error:

[WARNING] A configuration file must be supplied to deploy Hosted Engine on an additional host.
[ ERROR ] 'version' is not stored in the HE configuration image
[ ERROR ] Unable to get the answer file from the shared storage
[ ERROR ] Failed to execute stage 'Environment customization': Unable to get the answer file from the shared storage
[ INFO  ] Stage: Clean up
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20161002232505.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed    

Looking at the failure in the log file..

Can you please attach hosted-engine-setup logs from the first host?

2016-10-02 23:25:05 WARNING otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._customization:151 A configuration file must be supplied to deploy Hosted Engine on an additional host.
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:61 _fetch_answer_file
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:69 fetching from: /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:69 executing: 'sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37 bs=4k'
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:70 executing: 'tar -tvf -'
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:88 stdout:
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:89 stderr:
2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile heconflib.validateConfImage:111 'version' is not stored in the HE configuration image
2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:73 Unable to get the answer file from the shared storage
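
The log shows the setup piping the volume through 'tar -tvf -', getting an empty listing, and then reporting that 'version' is missing. A rough way to repeat that check by hand is sketched below. It assumes the configuration volume is a plain uncompressed tar archive, which the tar command in the log suggests but which is not confirmed here. The helper names conf_image_members and has_version are hypothetical, and this is not the hosted-engine setup code.

```python
import tarfile


def conf_image_members(path):
    """Return the member names of a tar-formatted image, or [] if unreadable."""
    try:
        # mode "r:" means: read as an uncompressed tar only (assumption, see above).
        with tarfile.open(path, mode="r:") as archive:
            return archive.getnames()
    except (tarfile.TarError, OSError):
        # Not a tar archive, an empty file, or no permission to read it.
        return []


def has_version(path):
    """True if the archive lists a member named 'version'."""
    return "version" in conf_image_members(path)
```

The volume is owned by vdsm:kvm with mode 0660, so the script would need to run as a user that can read it (the setup itself uses 'sudo -u vdsm'). An empty member list from the path in the log would match the empty stdout seen above.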

Looking at the detected gluster path - /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/

[root@dcasrv02 ~]# ls -al /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/
total 1049609
drwxr-xr-x. 2 vdsm kvm       4096 Oct  2 04:46 .
drwxr-xr-x. 6 vdsm kvm       4096 Oct  2 04:46 ..
-rw-rw----. 1 vdsm kvm 1073741824 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37
-rw-rw----. 1 vdsm kvm    1048576 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.lease
-rw-r--r--. 1 vdsm kvm        294 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.meta  

78cb2527-a2e2-489a-9fad-465a72221b37 is a 1 GB file. Is this the engine VM?

Copying the answers file from the primary (/etc/ovirt-hosted-engine/answers.conf) to node 2 and rerunning produces the same error :(
(hosted-engine --deploy --config-append=/root/answers.conf)

Also tried on node 3, same issue.

Happy to provide logs and other debug output.

Thanks 

Jason 

_______________________________________________
Users mailing list
Users@xxxxxxxxx
http://lists.ovirt.org/mailman/listinfo/users

Attachment:
etc-glusterfs-glusterd.vol.log.gz

Description: Binary data
Attachment:
rhev-data-center-mnt-glusterSD-dcastor01%3Aengine.log.gz

Description: Binary data
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users