Based on the logs, my guess is that glusterd is being started before the network has come up, and that the addresses given to the bricks do not exactly match the addresses used during peer probe.

The gluster_after_reboot log has the line "[2014-11-25 06:46:09.972113] E [glusterd-store.c:2632:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore".

Brick resolution fails when glusterd cannot match the address of a brick with one of the peers. Brick resolution happens in two phases:

1. We first try to identify the peer by performing string comparisons between the brick address and the peer addresses. (The peer names will be the names/addresses that were given when the peer was probed.)
2. If we don't find a match in step 1, we then resolve the brick address and all the peer addresses into addrinfo structs, and compare these structs to find a match. This process should generally find a match if one is available. It will fail only if the network is not up yet, as we cannot resolve addresses.

The above steps are applicable only to glusterfs versions >= 3.6. They were introduced to reduce problems with peer identification, like the one you encountered.

Since both steps failed to find a match in one run, but succeeded later, we can conclude that
a) the bricks don't have the exact same string used in peer probe for their addresses, as step 1 failed, and
b) the network was not up in the initial run, as step 2 failed during the initial run but passed in the second run.

Please let me know if my conclusion is correct.

If it is, you can solve your problem in two ways:

1. Use the same string for the peer probe and for the brick address during volume create/add-brick. Ideally, we suggest you use properly resolvable FQDNs everywhere. If that is not possible, then use only IP addresses. Try to avoid short names.
2. During boot up, make sure to launch glusterd only after the network is up.
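The two-phase brick resolution described above can be sketched roughly as follows. This is only a Python illustration of the idea, not glusterd's actual code (glusterd is written in C); the names find_peer, resolve and the resolver parameter are hypothetical, and peers are simplified to a plain list of the names used at probe time.

```python
import socket


def resolve(host):
    """Return the set of addresses a name resolves to (empty if it cannot be resolved)."""
    try:
        infos = socket.getaddrinfo(host, None)
    except OSError:
        # Covers socket.gaierror, e.g. when the network or DNS is not up yet.
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return {info[4][0] for info in infos}


def find_peer(brick_host, peer_names, resolver=resolve):
    """Return the peer name matching brick_host, or None if no peer matches."""
    # Phase 1: plain string comparison against the names given at peer probe.
    for peer in peer_names:
        if brick_host == peer:
            return peer

    # Phase 2: resolve both sides and compare the resulting addresses.
    brick_addrs = resolver(brick_host)
    if not brick_addrs:
        # Nothing to compare against; this is what happens before the network is up.
        return None
    for peer in peer_names:
        if brick_addrs & resolver(peer):
            return peer
    return None
```

With this sketch, a brick whose address is the exact string used at probe time is matched in phase 1 even when nothing can be resolved, while a brick given under a different name is matched only if resolution works. That mirrors the failure seen at boot and the success on the later manual start.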
Either fix will allow the new peer identification mechanism to do its job correctly.

If you have already followed these steps and still hit the problem, then please provide more information (setup, logs, etc.). You could be facing a quite different problem.

~kaushal

On Wed, Nov 26, 2014 at 4:01 PM, Punit Dambiwal <hypunit@xxxxxxxxx> wrote: > Is there any one can help on this ?? > > Thanks, > punit > > On Wed, Nov 26, 2014 at 9:42 AM, Punit Dambiwal <hypunit@xxxxxxxxx> wrote: >> >> Hi, >> >> My Glusterfs version is :- glusterfs-3.6.1-1.el7 >> >> On Wed, Nov 26, 2014 at 1:59 AM, Kanagaraj Mayilsamy <kmayilsa@xxxxxxxxxx> >> wrote: >>> >>> [+Gluster-users@xxxxxxxxxxx] >>> >>> "Initialization of volume 'management' failed, review your volfile >>> again", glusterd throws this error when the service is started automatically >>> after the reboot. But the service is successfully started later manually by >>> the user. >>> >>> can somebody from gluster-users please help on this? >>> >>> glusterfs version: 3.5.1 >>> >>> Thanks, >>> Kanagaraj >>> >>> ----- Original Message ----- >>> > From: "Punit Dambiwal" <hypunit@xxxxxxxxx> >>> > To: "Kanagaraj" <kmayilsa@xxxxxxxxxx> >>> > Cc: users@xxxxxxxxx >>> > Sent: Tuesday, November 25, 2014 7:24:45 PM >>> > Subject: Re: [ovirt-users] Gluster command [<UNKNOWN>] failed on >>> > server... >>> > >>> > Hi Kanagraj, >>> > >>> > Please check the attached log files....i didn't find any thing >>> > special.... >>> > >>> > On Tue, Nov 25, 2014 at 12:12 PM, Kanagaraj <kmayilsa@xxxxxxxxxx> >>> > wrote: >>> > >>> > > Do you see any errors in >>> > > /var/log/glusterfs/etc-glusterfs-glusterd.vol.log or vdsm.log when >>> > > the >>> > > service is trying to start automatically after the reboot?
>>> > > >>> > > Thanks, >>> > > Kanagaraj >>> > > >>> > > >>> > > On 11/24/2014 08:13 PM, Punit Dambiwal wrote: >>> > > >>> > > Hi Kanagaraj, >>> > > >>> > > Yes...once i will start the gluster service and then vdsmd ...the >>> > > host >>> > > can connect to cluster...but the question is why it's not started >>> > > even it >>> > > has chkconfig enabled... >>> > > >>> > > I have tested it in two host cluster environment...(Centos 6.6 and >>> > > centos 7.0) on both hypervisior cluster..it's failed to reconnect in >>> > > to >>> > > cluster after reboot.... >>> > > >>> > > In both the environment glusterd enabled for next boot....but it's >>> > > failed with the same error....seems it's bug in either gluster or >>> > > Ovirt ?? >>> > > >>> > > Please help me to find the workaround here if can not resolve >>> > > it...as >>> > > without this the Host machine can not connect after reboot....that >>> > > means >>> > > engine will consider it as down and every time need to manually start >>> > > the >>> > > gluster service and vdsmd... ?? >>> > > >>> > > Thanks, >>> > > Punit >>> > > >>> > > On Mon, Nov 24, 2014 at 10:20 PM, Kanagaraj <kmayilsa@xxxxxxxxxx> >>> > > wrote: >>> > > >>> > >> From vdsm.log "error: Connection failed. Please check if gluster >>> > >> daemon >>> > >> is operational." >>> > >> >>> > >> Starting glusterd service should fix this issue. 'service glusterd >>> > >> start' >>> > >> But i am wondering why the glusterd was not started automatically >>> > >> after >>> > >> the reboot. 
>>> > >> >>> > >> Thanks, >>> > >> Kanagaraj >>> > >> >>> > >> >>> > >> >>> > >> On 11/24/2014 07:18 PM, Punit Dambiwal wrote: >>> > >> >>> > >> Hi Kanagaraj, >>> > >> >>> > >> Please find the attached VDSM logs :- >>> > >> >>> > >> ---------------- >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:17,182::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) >>> > >> Owner.cancelAll requests {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:17,182::task::993::Storage.TaskManager.Task::(_decref) >>> > >> Task=`1691d409-9b27-4585-8281-5ec26154367a`::ref 0 aborting False >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:32,393::task::595::Storage.TaskManager.Task::(_updateState) >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state init >>> > >> -> >>> > >> state preparing >>> > >> Thread-13::INFO::2014-11-24 >>> > >> 21:41:32,393::logUtils::44::dispatcher::(wrapper) Run and protect: >>> > >> repoStats(options=None) >>> > >> Thread-13::INFO::2014-11-24 >>> > >> 21:41:32,393::logUtils::47::dispatcher::(wrapper) Run and protect: >>> > >> repoStats, Return response: {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:32,393::task::1191::Storage.TaskManager.Task::(prepare) >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::finished: {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:32,394::task::595::Storage.TaskManager.Task::(_updateState) >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state >>> > >> preparing >>> > >> -> >>> > >> state finished >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:32,394::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) >>> > >> Owner.releaseAll requests {} resources {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:32,394::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) >>> > >> Owner.cancelAll requests {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 
21:41:32,394::task::993::Storage.TaskManager.Task::(_decref) >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::ref 0 aborting False >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,550::BindingXMLRPC::1132::vds::(wrapper) client >>> > >> [10.10.10.2]::call >>> > >> getCapabilities with () {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,553::utils::738::root::(execCmd) >>> > >> /sbin/ip route show to 0.0.0.0/0 table all (cwd None) >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,560::utils::758::root::(execCmd) >>> > >> SUCCESS: <err> = ''; <rc> = 0 >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,588::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,592::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-object',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,593::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-plugin',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-account',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-proxy',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-doc',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,599::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-container',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,599::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('glusterfs-geo-replication',) not found >>> > >> Thread-13::DEBUG::2014-11-24 21:41:41,600::caps::646::root::(get) >>> > >> VirtioRNG DISABLED: libvirt version 0.10.2-29.el6_5.9 required >= >>> > >> 0.10.2-31 >>> > >> 
Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,603::BindingXMLRPC::1139::vds::(wrapper) return >>> > >> getCapabilities >>> > >> with {'status': {'message': 'Done', 'code': 0}, 'info': >>> > >> {'HBAInventory': >>> > >> {'iSCSI': [{'InitiatorName': >>> > >> 'iqn.1994-05.com.redhat:32151ce183c8'}], >>> > >> 'FC': >>> > >> []}, 'packages2': {'kernel': {'release': '431.el6.x86_64', >>> > >> 'buildtime': >>> > >> 1385061309.0, 'version': '2.6.32'}, 'glusterfs-rdma': {'release': >>> > >> '1.el6', >>> > >> 'buildtime': 1403622628L, 'version': '3.5.1'}, 'glusterfs-fuse': >>> > >> {'release': '1.el6', 'buildtime': 1403622628L, 'version': '3.5.1'}, >>> > >> 'spice-server': {'release': '6.el6_5.2', 'buildtime': 1402324637L, >>> > >> 'version': '0.12.4'}, 'vdsm': {'release': '1.gitdb83943.el6', >>> > >> 'buildtime': >>> > >> 1412784567L, 'version': '4.16.7'}, 'qemu-kvm': {'release': >>> > >> '2.415.el6_5.10', 'buildtime': 1402435700L, 'version': '0.12.1.2'}, >>> > >> 'qemu-img': {'release': '2.415.el6_5.10', 'buildtime': 1402435700L, >>> > >> 'version': '0.12.1.2'}, 'libvirt': {'release': '29.el6_5.9', >>> > >> 'buildtime': >>> > >> 1402404612L, 'version': '0.10.2'}, 'glusterfs': {'release': '1.el6', >>> > >> 'buildtime': 1403622628L, 'version': '3.5.1'}, 'mom': {'release': >>> > >> '2.el6', >>> > >> 'buildtime': 1403794344L, 'version': '0.4.1'}, 'glusterfs-server': >>> > >> {'release': '1.el6', 'buildtime': 1403622628L, 'version': '3.5.1'}}, >>> > >> 'numaNodeDistance': {'1': [20, 10], '0': [10, 20]}, 'cpuModel': >>> > >> 'Intel(R) >>> > >> Xeon(R) CPU X5650 @ 2.67GHz', 'liveMerge': 'false', >>> > >> 'hooks': >>> > >> {}, >>> > >> 'cpuSockets': '2', 'vmTypes': ['kvm'], 'selinux': {'mode': '1'}, >>> > >> 'kdumpStatus': 0, 'supportedProtocols': ['2.2', '2.3'], 'networks': >>> > >> {'ovirtmgmt': {'iface': u'bond0.10', 'addr': '43.252.176.16', >>> > >> 'bridged': >>> > >> False, 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'mtu': '1500', >>> > >> 'bootproto4': 'none', 
'netmask': '255.255.255.0', 'ipv4addrs': [' >>> > >> 43.252.176.16/24' <http://43.252.176.16/24%27>], 'interface': >>> > >> u'bond0.10', 'ipv6gateway': '::', 'gateway': '43.25.17.1'}, >>> > >> 'Internal': >>> > >> {'iface': 'Internal', 'addr': '', 'cfg': {'DEFROUTE': 'no', >>> > >> 'HOTPLUG': >>> > >> 'no', 'MTU': '9000', 'DELAY': '0', 'NM_CONTROLLED': 'no', >>> > >> 'BOOTPROTO': >>> > >> 'none', 'STP': 'off', 'DEVICE': 'Internal', 'TYPE': 'Bridge', >>> > >> 'ONBOOT': >>> > >> 'no'}, 'bridged': True, 'ipv6addrs': >>> > >> ['fe80::210:18ff:fecd:daac/64'], >>> > >> 'gateway': '', 'bootproto4': 'none', 'netmask': '', 'stp': 'off', >>> > >> 'ipv4addrs': [], 'mtu': '9000', 'ipv6gateway': '::', 'ports': >>> > >> ['bond1.100']}, 'storage': {'iface': u'bond1', 'addr': '10.10.10.6', >>> > >> 'bridged': False, 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], >>> > >> 'mtu': >>> > >> '9000', 'bootproto4': 'none', 'netmask': '255.255.255.0', >>> > >> 'ipv4addrs': [' >>> > >> 10.10.10.6/24' <http://10.10.10.6/24%27>], 'interface': u'bond1', >>> > >> 'ipv6gateway': '::', 'gateway': ''}, 'VMNetwork': {'iface': >>> > >> 'VMNetwork', >>> > >> 'addr': '', 'cfg': {'DEFROUTE': 'no', 'HOTPLUG': 'no', 'MTU': >>> > >> '1500', >>> > >> 'DELAY': '0', 'NM_CONTROLLED': 'no', 'BOOTPROTO': 'none', 'STP': >>> > >> 'off', >>> > >> 'DEVICE': 'VMNetwork', 'TYPE': 'Bridge', 'ONBOOT': 'no'}, 'bridged': >>> > >> True, >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'gateway': '', >>> > >> 'bootproto4': >>> > >> 'none', 'netmask': '', 'stp': 'off', 'ipv4addrs': [], 'mtu': '1500', >>> > >> 'ipv6gateway': '::', 'ports': ['bond0.36']}}, 'bridges': >>> > >> {'Internal': >>> > >> {'addr': '', 'cfg': {'DEFROUTE': 'no', 'HOTPLUG': 'no', 'MTU': >>> > >> '9000', >>> > >> 'DELAY': '0', 'NM_CONTROLLED': 'no', 'BOOTPROTO': 'none', 'STP': >>> > >> 'off', >>> > >> 'DEVICE': 'Internal', 'TYPE': 'Bridge', 'ONBOOT': 'no'}, >>> > >> 'ipv6addrs': >>> > >> ['fe80::210:18ff:fecd:daac/64'], 'mtu': '9000', 'netmask': 
'', >>> > >> 'stp': >>> > >> 'off', 'ipv4addrs': [], 'ipv6gateway': '::', 'gateway': '', 'opts': >>> > >> {'topology_change_detected': '0', 'multicast_last_member_count': >>> > >> '2', >>> > >> 'hash_elasticity': '4', 'multicast_query_response_interval': '999', >>> > >> 'multicast_snooping': '1', 'multicast_startup_query_interval': >>> > >> '3124', >>> > >> 'hello_timer': '31', 'multicast_querier_interval': '25496', >>> > >> 'max_age': >>> > >> '1999', 'hash_max': '512', 'stp_state': '0', 'root_id': >>> > >> '8000.001018cddaac', 'priority': '32768', >>> > >> 'multicast_membership_interval': >>> > >> '25996', 'root_path_cost': '0', 'root_port': '0', >>> > >> 'multicast_querier': >>> > >> '0', >>> > >> 'multicast_startup_query_count': '2', 'hello_time': '199', >>> > >> 'topology_change': '0', 'bridge_id': '8000.001018cddaac', >>> > >> 'topology_change_timer': '0', 'ageing_time': '29995', 'gc_timer': >>> > >> '31', >>> > >> 'group_addr': '1:80:c2:0:0:0', 'tcn_timer': '0', >>> > >> 'multicast_query_interval': '12498', >>> > >> 'multicast_last_member_interval': >>> > >> '99', 'multicast_router': '1', 'forward_delay': '0'}, 'ports': >>> > >> ['bond1.100']}, 'VMNetwork': {'addr': '', 'cfg': {'DEFROUTE': 'no', >>> > >> 'HOTPLUG': 'no', 'MTU': '1500', 'DELAY': '0', 'NM_CONTROLLED': 'no', >>> > >> 'BOOTPROTO': 'none', 'STP': 'off', 'DEVICE': 'VMNetwork', 'TYPE': >>> > >> 'Bridge', >>> > >> 'ONBOOT': 'no'}, 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], >>> > >> 'mtu': >>> > >> '1500', 'netmask': '', 'stp': 'off', 'ipv4addrs': [], 'ipv6gateway': >>> > >> '::', >>> > >> 'gateway': '', 'opts': {'topology_change_detected': '0', >>> > >> 'multicast_last_member_count': '2', 'hash_elasticity': '4', >>> > >> 'multicast_query_response_interval': '999', 'multicast_snooping': >>> > >> '1', >>> > >> 'multicast_startup_query_interval': '3124', 'hello_timer': '131', >>> > >> 'multicast_querier_interval': '25496', 'max_age': '1999', >>> > >> 'hash_max': >>> > >> '512', 'stp_state': '0', 
'root_id': '8000.60eb6920b46c', 'priority': >>> > >> '32768', 'multicast_membership_interval': '25996', 'root_path_cost': >>> > >> '0', >>> > >> 'root_port': '0', 'multicast_querier': '0', >>> > >> 'multicast_startup_query_count': '2', 'hello_time': '199', >>> > >> 'topology_change': '0', 'bridge_id': '8000.60eb6920b46c', >>> > >> 'topology_change_timer': '0', 'ageing_time': '29995', 'gc_timer': >>> > >> '31', >>> > >> 'group_addr': '1:80:c2:0:0:0', 'tcn_timer': '0', >>> > >> 'multicast_query_interval': '12498', >>> > >> 'multicast_last_member_interval': >>> > >> '99', 'multicast_router': '1', 'forward_delay': '0'}, 'ports': >>> > >> ['bond0.36']}}, 'uuid': '44454C4C-4C00-1057-8053-B7C04F504E31', >>> > >> 'lastClientIface': 'bond1', 'nics': {'eth3': {'permhwaddr': >>> > >> '00:10:18:cd:da:ae', 'addr': '', 'cfg': {'SLAVE': 'yes', >>> > >> 'NM_CONTROLLED': >>> > >> 'no', 'MTU': '9000', 'HWADDR': '00:10:18:cd:da:ae', 'MASTER': >>> > >> 'bond1', >>> > >> 'DEVICE': 'eth3', 'ONBOOT': 'no'}, 'ipv6addrs': [], 'mtu': '9000', >>> > >> 'netmask': '', 'ipv4addrs': [], 'hwaddr': '00:10:18:cd:da:ac', >>> > >> 'speed': >>> > >> 1000}, 'eth2': {'permhwaddr': '00:10:18:cd:da:ac', 'addr': '', >>> > >> 'cfg': >>> > >> {'SLAVE': 'yes', 'NM_CONTROLLED': 'no', 'MTU': '9000', 'HWADDR': >>> > >> '00:10:18:cd:da:ac', 'MASTER': 'bond1', 'DEVICE': 'eth2', 'ONBOOT': >>> > >> 'no'}, >>> > >> 'ipv6addrs': [], 'mtu': '9000', 'netmask': '', 'ipv4addrs': [], >>> > >> 'hwaddr': >>> > >> '00:10:18:cd:da:ac', 'speed': 1000}, 'eth1': {'permhwaddr': >>> > >> '60:eb:69:20:b4:6d', 'addr': '', 'cfg': {'SLAVE': 'yes', >>> > >> 'NM_CONTROLLED': >>> > >> 'no', 'MTU': '1500', 'HWADDR': '60:eb:69:20:b4:6d', 'MASTER': >>> > >> 'bond0', >>> > >> 'DEVICE': 'eth1', 'ONBOOT': 'yes'}, 'ipv6addrs': [], 'mtu': '1500', >>> > >> 'netmask': '', 'ipv4addrs': [], 'hwaddr': '60:eb:69:20:b4:6c', >>> > >> 'speed': >>> > >> 1000}, 'eth0': {'permhwaddr': '60:eb:69:20:b4:6c', 'addr': '', >>> > >> 'cfg': >>> > >> {'SLAVE': 'yes', 
'NM_CONTROLLED': 'no', 'MTU': '1500', 'HWADDR': >>> > >> '60:eb:69:20:b4:6c', 'MASTER': 'bond0', 'DEVICE': 'eth0', 'ONBOOT': >>> > >> 'yes'}, >>> > >> 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'ipv4addrs': [], >>> > >> 'hwaddr': >>> > >> '60:eb:69:20:b4:6c', 'speed': 1000}}, 'software_revision': '1', >>> > >> 'clusterLevels': ['3.0', '3.1', '3.2', '3.3', '3.4', '3.5'], >>> > >> 'cpuFlags': >>> > >> >>> > >> u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,xtopology,nonstop_tsc,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,dca,sse4_1,sse4_2,popcnt,aes,lahf_lm,tpr_shadow,vnmi,flexpriority,ept,vpid,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270', >>> > >> 'ISCSIInitiatorName': 'iqn.1994-05.com.redhat:32151ce183c8', >>> > >> 'netConfigDirty': 'False', 'supportedENGINEs': ['3.0', '3.1', '3.2', >>> > >> '3.3', >>> > >> '3.4', '3.5'], 'autoNumaBalancing': 2, 'reservedMem': '321', >>> > >> 'bondings': >>> > >> {'bond4': {'addr': '', 'cfg': {}, 'mtu': '1500', 'netmask': '', >>> > >> 'slaves': >>> > >> [], 'hwaddr': '00:00:00:00:00:00'}, 'bond0': {'addr': '', 'cfg': >>> > >> {'HOTPLUG': 'no', 'MTU': '1500', 'NM_CONTROLLED': 'no', >>> > >> 'BONDING_OPTS': >>> > >> 'mode=4 miimon=100', 'DEVICE': 'bond0', 'ONBOOT': 'yes'}, >>> > >> 'ipv6addrs': >>> > >> ['fe80::62eb:69ff:fe20:b46c/64'], 'mtu': '1500', 'netmask': '', >>> > >> 'ipv4addrs': [], 'hwaddr': '60:eb:69:20:b4:6c', 'slaves': ['eth0', >>> > >> 'eth1'], >>> > >> 'opts': {'miimon': '100', 'mode': '4'}}, 'bond1': {'addr': >>> > >> '10.10.10.6', >>> > >> 'cfg': {'DEFROUTE': 'no', 'IPADDR': '10.10.10.6', 'HOTPLUG': 'no', >>> > >> 'MTU': >>> > >> '9000', 'NM_CONTROLLED': 'no', 'NETMASK': '255.255.255.0', >>> > >> 'BOOTPROTO': >>> > >> 'none', 'BONDING_OPTS': 'mode=4 miimon=100', 'DEVICE': 'bond1', >>> 
> >> 'ONBOOT': >>> > >> 'no'}, 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], 'mtu': '9000', >>> > >> 'netmask': '255.255.255.0', 'ipv4addrs': ['10.10.10.6/24' >>> > >> <http://10.10.10.6/24%27>], 'hwaddr': '00:10:18:cd:da:ac', 'slaves': >>> > >> ['eth2', 'eth3'], 'opts': {'miimon': '100', 'mode': '4'}}, 'bond2': >>> > >> {'addr': '', 'cfg': {}, 'mtu': '1500', 'netmask': '', 'slaves': [], >>> > >> 'hwaddr': '00:00:00:00:00:00'}, 'bond3': {'addr': '', 'cfg': {}, >>> > >> 'mtu': >>> > >> '1500', 'netmask': '', 'slaves': [], 'hwaddr': >>> > >> '00:00:00:00:00:00'}}, >>> > >> 'software_version': '4.16', 'memSize': '24019', 'cpuSpeed': >>> > >> '2667.000', >>> > >> 'numaNodes': {u'1': {'totalMemory': '12288', 'cpus': [6, 7, 8, 9, >>> > >> 10, 11, >>> > >> 18, 19, 20, 21, 22, 23]}, u'0': {'totalMemory': '12278', 'cpus': [0, >>> > >> 1, 2, >>> > >> 3, 4, 5, 12, 13, 14, 15, 16, 17]}}, 'version_name': 'Snow Man', >>> > >> 'vlans': >>> > >> {'bond0.10': {'iface': 'bond0', 'addr': '43.25.17.16', 'cfg': >>> > >> {'DEFROUTE': >>> > >> 'yes', 'VLAN': 'yes', 'IPADDR': '43.25.17.16', 'HOTPLUG': 'no', >>> > >> 'GATEWAY': >>> > >> '43.25.17.1', 'NM_CONTROLLED': 'no', 'NETMASK': '255.255.255.0', >>> > >> 'BOOTPROTO': 'none', 'DEVICE': 'bond0.10', 'MTU': '1500', 'ONBOOT': >>> > >> 'yes'}, >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'vlanid': 10, 'mtu': >>> > >> '1500', >>> > >> 'netmask': '255.255.255.0', 'ipv4addrs': ['43.25.17.16/24'] >>> > >> <http://43.25.17.16/24%27%5D>}, 'bond0.36': {'iface': 'bond0', >>> > >> 'addr': >>> > >> '', 'cfg': {'BRIDGE': 'VMNetwork', 'VLAN': 'yes', 'HOTPLUG': 'no', >>> > >> 'MTU': >>> > >> '1500', 'NM_CONTROLLED': 'no', 'DEVICE': 'bond0.36', 'ONBOOT': >>> > >> 'no'}, >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'vlanid': 36, 'mtu': >>> > >> '1500', >>> > >> 'netmask': '', 'ipv4addrs': []}, 'bond1.100': {'iface': 'bond1', >>> > >> 'addr': >>> > >> '', 'cfg': {'BRIDGE': 'Internal', 'VLAN': 'yes', 'HOTPLUG': 'no', >>> > >> 'MTU': 
>>> > >> '9000', 'NM_CONTROLLED': 'no', 'DEVICE': 'bond1.100', 'ONBOOT': >>> > >> 'no'}, >>> > >> 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], 'vlanid': 100, 'mtu': >>> > >> '9000', >>> > >> 'netmask': '', 'ipv4addrs': []}}, 'cpuCores': '12', 'kvmEnabled': >>> > >> 'true', >>> > >> 'guestOverhead': '65', 'cpuThreads': '24', 'emulatedMachines': >>> > >> [u'rhel6.5.0', u'pc', u'rhel6.4.0', u'rhel6.3.0', u'rhel6.2.0', >>> > >> u'rhel6.1.0', u'rhel6.0.0', u'rhel5.5.0', u'rhel5.4.4', >>> > >> u'rhel5.4.0'], >>> > >> 'operatingSystem': {'release': '5.el6.centos.11.1', 'version': '6', >>> > >> 'name': >>> > >> 'RHEL'}, 'lastClient': '10.10.10.2'}} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,620::BindingXMLRPC::1132::vds::(wrapper) client >>> > >> [10.10.10.2]::call >>> > >> getHardwareInfo with () {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,621::BindingXMLRPC::1139::vds::(wrapper) return >>> > >> getHardwareInfo >>> > >> with {'status': {'message': 'Done', 'code': 0}, 'info': >>> > >> {'systemProductName': 'CS24-TY', 'systemSerialNumber': '7LWSPN1', >>> > >> 'systemFamily': 'Server', 'systemVersion': 'A00', 'systemUUID': >>> > >> '44454c4c-4c00-1057-8053-b7c04f504e31', 'systemManufacturer': >>> > >> 'Dell'}} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,733::BindingXMLRPC::1132::vds::(wrapper) client >>> > >> [10.10.10.2]::call >>> > >> hostsList with () {} flowID [222e8036] >>> > >> Thread-13::ERROR::2014-11-24 >>> > >> 21:41:44,753::BindingXMLRPC::1148::vds::(wrapper) vdsm exception >>> > >> occured >>> > >> Traceback (most recent call last): >>> > >> File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 1135, in wrapper >>> > >> res = f(*args, **kwargs) >>> > >> File "/usr/share/vdsm/gluster/api.py", line 54, in wrapper >>> > >> rv = func(*args, **kwargs) >>> > >> File "/usr/share/vdsm/gluster/api.py", line 251, in hostsList >>> > >> return {'hosts': self.svdsmProxy.glusterPeerStatus()} >>> > >> File "/usr/share/vdsm/supervdsm.py", 
line 50, in __call__ >>> > >> return callMethod() >>> > >> File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda> >>> > >> **kwargs) >>> > >> File "<string>", line 2, in glusterPeerStatus >>> > >> File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, >>> > >> in >>> > >> _callmethod >>> > >> raise convert_to_error(kind, result) >>> > >> GlusterCmdExecFailedException: Command execution failed >>> > >> error: Connection failed. Please check if gluster daemon is >>> > >> operational. >>> > >> return code: 1 >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:50,949::task::595::Storage.TaskManager.Task::(_updateState) >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::moving from state init >>> > >> -> >>> > >> state preparing >>> > >> Thread-13::INFO::2014-11-24 >>> > >> 21:41:50,950::logUtils::44::dispatcher::(wrapper) Run and protect: >>> > >> repoStats(options=None) >>> > >> Thread-13::INFO::2014-11-24 >>> > >> 21:41:50,950::logUtils::47::dispatcher::(wrapper) Run and protect: >>> > >> repoStats, Return response: {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:50,950::task::1191::Storage.TaskManager.Task::(prepare) >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::finished: {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:50,950::task::595::Storage.TaskManager.Task::(_updateState) >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::moving from state >>> > >> preparing >>> > >> -> >>> > >> state finished >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:50,951::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) >>> > >> Owner.releaseAll requests {} resources {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:50,951::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) >>> > >> Owner.cancelAll requests {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:50,951::task::993::Storage.TaskManager.Task::(_decref) >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::ref 0 
aborting False >>> > >> ------------------------------- >>> > >> >>> > >> [root@compute4 ~]# service glusterd status >>> > >> glusterd is stopped >>> > >> [root@compute4 ~]# chkconfig --list | grep glusterd >>> > >> glusterd 0:off 1:off 2:on 3:on 4:on 5:on >>> > >> 6:off >>> > >> [root@compute4 ~]# >>> > >> >>> > >> Thanks, >>> > >> Punit >>> > >> >>> > >> On Mon, Nov 24, 2014 at 6:36 PM, Kanagaraj <kmayilsa@xxxxxxxxxx> >>> > >> wrote: >>> > >> >>> > >>> Can you send the corresponding error in vdsm.log from the host? >>> > >>> >>> > >>> Also check if glusterd service is running. >>> > >>> >>> > >>> Thanks, >>> > >>> Kanagaraj >>> > >>> >>> > >>> >>> > >>> On 11/24/2014 03:39 PM, Punit Dambiwal wrote: >>> > >>> >>> > >>> Hi, >>> > >>> >>> > >>> After reboot my Hypervisior host can not activate again in the >>> > >>> cluster >>> > >>> and failed with the following error :- >>> > >>> >>> > >>> Gluster command [<UNKNOWN>] failed on server... >>> > >>> >>> > >>> Engine logs :- >>> > >>> >>> > >>> 2014-11-24 18:05:28,397 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-64) START, >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId = >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 5f251c90 >>> > >>> 2014-11-24 18:05:30,609 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-64) FINISH, >>> > >>> GlusterVolumesListVDSCommand, >>> > >>> return: >>> > >>> >>> > >>> {26ae1672-ee09-4a38-8fd2-72dd9974cc2b=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@d95203e0}, >>> > >>> log id: 5f251c90 >>> > >>> 2014-11-24 18:05:33,768 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (ajp--127.0.0.1-8702-8) >>> > >>> [287d570d] Lock Acquired to object EngineLock [exclusiveLocks= key: >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a value: VDS >>> > 
>>> , sharedLocks= ] >>> > >>> 2014-11-24 18:05:33,795 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Running command: >>> > >>> ActivateVdsCommand internal: false. Entities affected : ID: >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDSAction group >>> > >>> MANIPULATE_HOST >>> > >>> with role type ADMIN >>> > >>> 2014-11-24 18:05:33,796 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Before acquiring >>> > >>> lock in >>> > >>> order to prevent monitoring for host Compute5 from data-center >>> > >>> SV_WTC >>> > >>> 2014-11-24 18:05:33,797 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Lock acquired, from >>> > >>> now a >>> > >>> monitoring of host will be skipped for host Compute5 from >>> > >>> data-center >>> > >>> SV_WTC >>> > >>> 2014-11-24 18:05:33,817 INFO >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] START, >>> > >>> SetVdsStatusVDSCommand(HostName = Compute5, HostId = >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, status=Unassigned, >>> > >>> nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: >>> > >>> 1cbc7311 >>> > >>> 2014-11-24 18:05:33,820 INFO >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] FINISH, >>> > >>> SetVdsStatusVDSCommand, log id: 1cbc7311 >>> > >>> 2014-11-24 18:05:34,086 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) Activate finished. Lock >>> > >>> released. 
>>> > >>> Monitoring can run now for host Compute5 from data-center SV_WTC >>> > >>> 2014-11-24 18:05:34,088 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] >>> > >>> (org.ovirt.thread.pool-8-thread-45) Correlation ID: 287d570d, Job >>> > >>> ID: >>> > >>> 5ef8e4d6-b2bc-469e-8e81-7ef74b2a001a, Call Stack: null, Custom >>> > >>> Event ID: >>> > >>> -1, Message: Host Compute5 was activated by admin. >>> > >>> 2014-11-24 18:05:34,090 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) Lock freed to object EngineLock >>> > >>> [exclusiveLocks= key: 0bf6b00f-7947-4411-b55a-cc5eea2b381a value: >>> > >>> VDS >>> > >>> , sharedLocks= ] >>> > >>> 2014-11-24 18:05:35,792 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-55) [3706e836] START, >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId = >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 48a0c832 >>> > >>> 2014-11-24 18:05:37,064 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-69) START, >>> > >>> GetHardwareInfoVDSCommand(HostName = Compute5, HostId = >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, >>> > >>> vds=Host[Compute5,0bf6b00f-7947-4411-b55a-cc5eea2b381a]), log id: >>> > >>> 6d560cc2 >>> > >>> 2014-11-24 18:05:37,074 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-69) FINISH, >>> > >>> GetHardwareInfoVDSCommand, log >>> > >>> id: 6d560cc2 >>> > >>> 2014-11-24 18:05:37,093 WARN >>> > >>> [org.ovirt.engine.core.vdsbroker.VdsManager] >>> > >>> (DefaultQuartzScheduler_Worker-69) Host Compute5 is running with >>> > >>> disabled >>> > >>> SELinux. 
>>> > >>> 2014-11-24 18:05:37,127 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] >>> > >>> (DefaultQuartzScheduler_Worker-69) [2b4a51cf] Running command: >>> > >>> HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities >>> > >>> affected >>> > >>> : ID: 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS >>> > >>> 2014-11-24 18:05:37,147 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-69) [2b4a51cf] START, >>> > >>> GlusterServersListVDSCommand(HostName = Compute5, HostId = >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a), log id: 4faed87 >>> > >>> 2014-11-24 18:05:37,164 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-69) [2b4a51cf] FINISH, >>> > >>> GlusterServersListVDSCommand, log id: 4faed87 >>> > >>> 2014-11-24 18:05:37,189 INFO >>> > >>> [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Running command: >>> > >>> SetNonOperationalVdsCommand internal: true. 
Entities affected : >>> > >>> ID: >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS >>> > >>> 2014-11-24 18:05:37,206 INFO >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] START, >>> > >>> SetVdsStatusVDSCommand(HostName = Compute5, HostId = >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, status=NonOperational, >>> > >>> nonOperationalReason=GLUSTER_COMMAND_FAILED, >>> > >>> stopSpmFailureLogged=false), >>> > >>> log id: fed5617 >>> > >>> 2014-11-24 18:05:37,209 INFO >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] FINISH, >>> > >>> SetVdsStatusVDSCommand, log id: fed5617 >>> > >>> 2014-11-24 18:05:37,223 ERROR >>> > >>> >>> > >>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Correlation ID: >>> > >>> 4a84c4e5, >>> > >>> Job >>> > >>> ID: 4bfd4a6d-c3ef-468f-a40e-a3a6ca13011b, Call Stack: null, Custom >>> > >>> Event >>> > >>> ID: -1, Message: Gluster command [<UNKNOWN>] failed on server >>> > >>> Compute5. >>> > >>> 2014-11-24 18:05:37,243 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Correlation ID: null, >>> > >>> Call >>> > >>> Stack: null, Custom Event ID: -1, Message: Status of host Compute5 >>> > >>> was >>> > >>> set >>> > >>> to NonOperational. >>> > >>> 2014-11-24 18:05:37,272 INFO >>> > >>> [org.ovirt.engine.core.bll.HandleVdsVersionCommand] >>> > >>> (DefaultQuartzScheduler_Worker-69) [a0c8a7f] Running command: >>> > >>> HandleVdsVersionCommand internal: true. 
Entities affected : ID: >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS >>> > >>> 2014-11-24 18:05:37,274 INFO >>> > >>> [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] >>> > >>> (DefaultQuartzScheduler_Worker-69) [a0c8a7f] Host >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a : Compute5 is already in >>> > >>> NonOperational status for reason GLUSTER_COMMAND_FAILED. >>> > >>> SetNonOperationalVds command is skipped. >>> > >>> 2014-11-24 18:05:38,065 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-55) [3706e836] FINISH, >>> > >>> GlusterVolumesListVDSCommand, return: >>> > >>> >>> > >>> {26ae1672-ee09-4a38-8fd2-72dd9974cc2b=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@4e72a1b1}, >>> > >>> log id: 48a0c832 >>> > >>> 2014-11-24 18:05:43,243 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-35) START, >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId = >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 3ce13ebc >>> > >>> ^C >>> > >>> [root@ccr01 ~]# >>> > >>> >>> > >>> Thanks, >>> > >>> Punit >>> > >>> >>> > >>> >>> > >>> _______________________________________________ >>> > >>> Users mailing >>> > >>> listUsers@ovirt.orghttp://lists.ovirt.org/mailman/listinfo/users >>> > >>> >>> > >>> >>> > >>> >>> > >> >>> > >> >>> > > >>> > > >>> > >> >> > > > _______________________________________________ > Gluster-users mailing list > Gluster-users@xxxxxxxxxxx > http://supercolony.gluster.org/mailman/listinfo/gluster-users _______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://supercolony.gluster.org/mailman/listinfo/gluster-users