On Mon, Apr 18, 2011 at 9:13 AM, Kaloyan Kovachev <kkovachev@xxxxxxxxx> wrote:
> On Mon, 18 Apr 2011 08:57:34 -0500, Terry <td3201@xxxxxxxxx> wrote:
>> On Mon, Apr 18, 2011 at 8:38 AM, Terry <td3201@xxxxxxxxx> wrote:
>>> On Mon, Apr 18, 2011 at 3:48 AM, Christine Caulfield
>>> <ccaulfie@xxxxxxxxxx> wrote:
>>>> On 17/04/11 21:52, Terry wrote:
>>>>>
>>>>> As a result of a strange situation where our storage licensing
>>>>> lapsed, I need to join a CentOS 5.6 node to a now single-node
>>>>> cluster. I got it joined to the cluster, but I am having issues
>>>>> with clvmd: any LVM operation on either box hangs (vgscan, for
>>>>> example). I have increased debugging and I don't see any logs,
>>>>> and the VGs aren't being populated in /dev/mapper. This WAS
>>>>> working right after I joined the node to the cluster, and now it
>>>>> isn't, for some unknown reason. Not sure where to take this at
>>>>> this point. I did find one odd startup message that I am not
>>>>> sure what it means yet:
>>>>>
>>>>> [root@omadvnfs01a ~]# dmesg | grep dlm
>>>>> dlm: no local IP address has been set
>>>>> dlm: cannot start dlm lowcomms -107
>>>>> dlm: Using TCP for communications
>>>>> dlm: connecting to 2
>>>>
>>>> That message usually means that dlm_controld has failed to start.
>>>> Try starting the cman daemons (groupd, dlm_controld) manually with
>>>> the -D switch and read the output, which might give some clues to
>>>> why it's not working.
>>>>
>>>> Chrissie
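
[Archive note: a minimal sketch of what that manual start can look
like on a stock RHEL5/CentOS5 box. It assumes cman (aisexec) is
already up and that the daemons live in /sbin, as the cman package
versions below suggest; check your own init script for the exact
daemon list and order.]

    # kill any copies the init script already started
    killall groupd dlm_controld

    # run each daemon in its own terminal; -D keeps it in the
    # foreground and sends its debug output to stderr
    /sbin/groupd -D
    /sbin/dlm_controld -D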
>>>
>>> Hi Chrissie,
>>>
>>> I thought of that, but I see dlm started on both nodes. See right below.
>>>
>>>>> [root@omadvnfs01a ~]# ps xauwwww | grep dlm
>>>>> root      5476  0.0  0.0  24736   760 ?  Ss  15:34  0:00 /sbin/dlm_controld
>>>>> root      5502  0.0  0.0      0     0 ?  S<  15:34  0:00 [dlm_astd]
>>>>> root      5503  0.0  0.0      0     0 ?  S<  15:34  0:00 [dlm_scand]
>>>>> root      5504  0.0  0.0      0     0 ?  S<  15:34  0:00 [dlm_recv]
>>>>> root      5505  0.0  0.0      0     0 ?  S<  15:34  0:00 [dlm_send]
>>>>> root      5506  0.0  0.0      0     0 ?  S<  15:34  0:00 [dlm_recoverd]
>>>>> root      5546  0.0  0.0      0     0 ?  S<  15:35  0:00 [dlm_recoverd]
>>>>>
>>>>> [root@omadvnfs01a ~]# lsmod | grep dlm
>>>>> lock_dlm    52065  0
>>>>> gfs2       529037  1  lock_dlm
>>>>> dlm        160065  17 lock_dlm
>>>>> configfs    62045  2  dlm
>>>>>
>>>>> centos server:
>>>>> [root@omadvnfs01a ~]# rpm -q cman rgmanager lvm2-cluster
>>>>> cman-2.0.115-68.el5
>>>>> rgmanager-2.0.52-9.el5.centos
>>>>> lvm2-cluster-2.02.74-3.el5_6.1
>>>>>
>>>>> [root@omadvnfs01a ~]# ls /dev/mapper/ | grep -v mpath
>>>>> control
>>>>> VolGroup00-LogVol00
>>>>> VolGroup00-LogVol01
>>>>>
>>>>> rhel server:
>>>>> [root@omadvnfs01b network-scripts]# rpm -q cman rgmanager lvm2-cluster
>>>>> cman-2.0.115-34.el5
>>>>> rgmanager-2.0.52-6.el5
>>>>> lvm2-cluster-2.02.56-7.el5_5.4
>>>>>
>>>>> [root@omadvnfs01b network-scripts]# ls /dev/mapper/ | grep -v mpath
>>>>> control
>>>>> vg_data01a-lv_data01a
>>>>> vg_data01b-lv_data01b
>>>>> vg_data01c-lv_data01c
>>>>> vg_data01d-lv_data01d
>>>>> vg_data01e-lv_data01e
>>>>> vg_data01h-lv_data01h
>>>>> vg_data01i-lv_data01i
>>>>> VolGroup00-LogVol00
>>>>> VolGroup00-LogVol01
>>>>> VolGroup02-lv_data00
>>>>>
>>>>> [root@omadvnfs01b network-scripts]# clustat
>>>>> Cluster Status for omadvnfs01 @ Sun Apr 17 15:44:52 2011
>>>>> Member Status: Quorate
>>>>>
>>>>>  Member Name                 ID   Status
>>>>>  ------ ----                 ---- ------
>>>>>  omadvnfs01a.sec.jel.lc         1 Online, rgmanager
>>>>>  omadvnfs01b.sec.jel.lc         2 Online, Local, rgmanager
>>>>>
>>>>>  Service Name                Owner (Last)             State
>>>>>  ------- ----                ----- ------             -----
>>>>>  service:omadvnfs01-nfs-a    omadvnfs01b.sec.jel.lc   started
>>>>>  service:omadvnfs01-nfs-b    omadvnfs01b.sec.jel.lc   started
>>>>>  service:omadvnfs01-nfs-c    omadvnfs01b.sec.jel.lc   started
>>>>>  service:omadvnfs01-nfs-h    omadvnfs01b.sec.jel.lc   started
>>>>>  service:omadvnfs01-nfs-i    omadvnfs01b.sec.jel.lc   started
>>>>>  service:postgresql          omadvnfs01b.sec.jel.lc   started
>>>>>
>>>>> [root@omadvnfs01a ~]# cman_tool nodes
>>>>> Node  Sts   Inc   Joined               Name
>>>>>    1   M   1892   2011-04-17 15:34:24  omadvnfs01a.sec.jel.lc
>>>>>    2   M   1896   2011-04-17 15:34:24  omadvnfs01b.sec.jel.lc
>>
>> Ok, started all the CMAN elements manually as you suggested. I
>> started them in order, as in the init script. Here's the only error
>> that I see. I can post the other debug messages if you think they'd
>> be useful, but this is the only one that stuck out to me.
>>
>> [root@omadvnfs01a ~]# /sbin/dlm_controld -D
>> 1303134840 /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
>> 1303134840 /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2
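
[Archive note: "opendir failed: 2" is errno 2, ENOENT; the
/sys/kernel/config/dlm tree that dlm_controld wants to walk does not
exist. That tree only appears once the dlm kernel module is loaded
and configfs is mounted on /sys/kernel/config, so a quick check on
the affected node is something like:]

    # is configfs mounted where dlm_controld expects it?
    mount | grep configfs

    # if not, load and mount it by hand (the cman init script
    # normally does this at startup)
    modprobe configfs
    mount -t configfs configfs /sys/kernel/config
    ls /sys/kernel/config/dlm/cluster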
>
> what does "lsmod | egrep -e 'configfs' -e 'dlm'" say?
>
>> 1303134840 set_ccs_options 480
>> 1303134840 cman: node 2 added
>> 1303134840 set_configfs_node 2 10.198.1.111 local 0
>> 1303134840 cman: node 3 added
>> 1303134840 set_configfs_node 3 10.198.1.110 local 1

[root@omadvnfs01a ~]# lsmod | egrep -e 'configfs' -e 'dlm'
lock_dlm    52065  0
gfs2       529037  1  lock_dlm
dlm        160065  5  lock_dlm
configfs    62045  2  dlm

[root@omadvnfs01b log]# lsmod | egrep -e 'configfs' -e 'dlm'
lock_dlm    52065  0
gfs2       524204  1  lock_dlm
dlm        160065  19 gfs,lock_dlm
configfs    62045  2  dlm

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
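
[Archive note: if dlm_controld does come up cleanly and LVM commands
still hang, the next things to confirm are that clvmd is actually
running on both nodes and that lvm.conf is set for cluster locking;
the version skew visible above (lvm2-cluster 2.02.74 on the CentOS
node versus 2.02.56 on the RHEL one) is also worth keeping in mind.
A minimal check, assuming the stock RHEL5/CentOS5 layout:]

    # clvmd needs cluster-wide locking (locking_type = 3) in lvm.conf
    grep locking_type /etc/lvm/lvm.conf

    # confirm the daemon is up on each node, then retry a scan
    service clvmd status
    vgscan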