Re: [ovirt-users] Can we debug some truths/myths/facts about hosted-engine and gluster?


On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:
[Adding gluster-devel]


On 07/18/2014 05:20 PM, Andrew Lau wrote:
Hi all,

As most of you have gathered from previous messages, hosted engine
won't work on gluster. A quote from BZ1097639:

"Using hosted engine with Gluster backed storage is currently something
we really warn against.


I think this bug should be closed or re-targeted at documentation, because there is nothing we can do here. Hosted engine assumes that all writes are atomic and (immediately) available for all hosts in the cluster. Gluster violates those assumptions."
I tried going through BZ1097639 but could not find much detail with respect to gluster there.

A few questions around the problem:

1. Can somebody please explain in detail the scenario that causes the problem?

2. Is hosted engine performing synchronous writes to ensure that writes are durable?

Also, any documentation that details the hosted engine architecture would help us better understand its interactions with gluster.
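To make question 2 concrete, here is a minimal sketch of what a synchronous, durable write looks like at the system call level. It is not taken from the hosted engine code; `durable_write` is a hypothetical helper name, and whether data flushed this way is immediately visible to other hosts on a Gluster mount is exactly what is being asked here.

```python
import os


def durable_write(path, data):
    """Write bytes to path and return only after fsync() has succeeded.

    Hypothetical helper for illustration; not part of ovirt_hosted_engine_ha.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        # os.write() may write fewer bytes than requested, so loop until done.
        while offset < len(data):
            offset += os.write(fd, data[offset:])
        # Ask the kernel to flush the file's data to the backing storage.
        os.fsync(fd)
    finally:
        os.close(fd)
```

A writer that skips the `os.fsync()` call (or an equivalent such as opening with `os.O_SYNC`) has no guarantee that the data has left the local page cache when the write returns.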




Now my question: does this theory rule out a scenario where a gluster
replicated volume is mounted as a glusterfs filesystem and then
re-exported as a native kernel NFS share for the hosted-engine to
consume? It could then be possible to add ctdb to provide a last-resort
failover solution. I have tried this myself and suggested it to two
people who are running a similar setup. They are now using the native
kernel NFS server for hosted-engine and haven't reported as many
issues. Could anyone validate my theory on this?


If we obtain more details on the use case and obtain gluster logs from the failed scenarios, we should be able to understand the problem better. That could be the first step in validating your theory or evolving further recommendations :).


I'm not sure how useful this is, but Jiri Moskovcak tracked this down in an off-list message.

Message quote:

==

We were able to track it down to this (thanks Andrew for providing the testing setup):

-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 165, in handle
    response = "success " + self._dispatch(data)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 261, in _dispatch
    .get_all_stats_for_service_type(**options)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 41, in get_all_stats_for_service_type
    d = self.get_raw_stats_for_service_type(storage_dir, service_type)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 74, in get_raw_stats_for_service_type
    f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle: '/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'

It's definitely connected to the storage, which leads us to gluster. I'm not very familiar with gluster, so I need to check this with our gluster gurus.

==
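The failing call in the traceback is a plain `os.open()` that raises `OSError` with errno 116 (ESTALE). As a rough illustration only, a reader of the metadata file could retry the open when it sees ESTALE. This is a sketch under assumptions, not the broker's actual code: `open_with_estale_retry`, its parameters, and the retry policy are all hypothetical, and retrying treats the symptom rather than explaining why the handle went stale.

```python
import errno
import os
import time


def open_with_estale_retry(path, flags, retries=3, delay=0.1, opener=os.open):
    """Open path, retrying a few times if the call fails with ESTALE.

    Hypothetical helper for illustration; `opener` is injectable so the
    retry logic can be exercised without a real stale mount.
    """
    for attempt in range(retries + 1):
        try:
            return opener(path, flags)
        except OSError as e:
            # Re-raise anything that is not ESTALE, or ESTALE on the last try.
            if e.errno != errno.ESTALE or attempt == retries:
                raise
            time.sleep(delay)
```

A caller would pass the same flags as the original call site, for example `open_with_estale_retry(path, os.O_RDONLY)`. If the error persists across retries, the exception still propagates as in the traceback.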

 
Thanks,
Vijay

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
