On Mon, Apr 18, 2011 at 8:38 AM, Terry <td3201@xxxxxxxxx> wrote:
> On Mon, Apr 18, 2011 at 3:48 AM, Christine Caulfield
> <ccaulfie@xxxxxxxxxx> wrote:
>> On 17/04/11 21:52, Terry wrote:
>>>
>>> As a result of a strange situation where our licensing for storage
>>> dropped off, I need to join a centos 5.6 node to a now single-node
>>> cluster. I got it joined to the cluster, but I am having issues with
>>> clvmd: any lvm operations on both boxes hang, for example vgscan.
>>> I have increased debugging and I don't see any logs. The VGs aren't
>>> being populated in /dev/mapper. This WAS working right after I joined
>>> it to the cluster, and now it's not, for some unknown reason. Not sure
>>> where to take this at this point. I did find one weird startup log
>>> entry that I am not sure what it means yet:
>>>
>>> [root@omadvnfs01a ~]# dmesg | grep dlm
>>> dlm: no local IP address has been set
>>> dlm: cannot start dlm lowcomms -107
>>> dlm: Using TCP for communications
>>> dlm: connecting to 2
>>>
>>
>> That message usually means that dlm_controld has failed to start. Try
>> starting the cman daemons (groupd, dlm_controld) manually with the -D
>> switch and read the output, which might give some clues as to why it's
>> not working.
>>
>> Chrissie
>>
>
> Hi Chrissie,
>
> I thought of that, but I see dlm started on both nodes. See right below.
>
>>> [root@omadvnfs01a ~]# ps xauwwww | grep dlm
>>> root  5476  0.0  0.0  24736  760 ?  Ss  15:34  0:00 /sbin/dlm_controld
>>> root  5502  0.0  0.0      0    0 ?  S<  15:34  0:00 [dlm_astd]
>>> root  5503  0.0  0.0      0    0 ?  S<  15:34  0:00 [dlm_scand]
>>> root  5504  0.0  0.0      0    0 ?  S<  15:34  0:00 [dlm_recv]
>>> root  5505  0.0  0.0      0    0 ?  S<  15:34  0:00 [dlm_send]
>>> root  5506  0.0  0.0      0    0 ?  S<  15:34  0:00 [dlm_recoverd]
>>> root  5546  0.0  0.0      0    0 ?  S<  15:35  0:00 [dlm_recoverd]
>>>
>>> [root@omadvnfs01a ~]# lsmod | grep dlm
>>> lock_dlm    52065   0
>>> gfs2       529037   1 lock_dlm
>>> dlm        160065  17 lock_dlm
>>> configfs    62045   2 dlm
>>>
>>> centos server:
>>> [root@omadvnfs01a ~]# rpm -q cman rgmanager lvm2-cluster
>>> cman-2.0.115-68.el5
>>> rgmanager-2.0.52-9.el5.centos
>>> lvm2-cluster-2.02.74-3.el5_6.1
>>>
>>> [root@omadvnfs01a ~]# ls /dev/mapper/ | grep -v mpath
>>> control
>>> VolGroup00-LogVol00
>>> VolGroup00-LogVol01
>>>
>>> rhel server:
>>> [root@omadvnfs01b network-scripts]# rpm -q cman rgmanager lvm2-cluster
>>> cman-2.0.115-34.el5
>>> rgmanager-2.0.52-6.el5
>>> lvm2-cluster-2.02.56-7.el5_5.4
>>>
>>> [root@omadvnfs01b network-scripts]# ls /dev/mapper/ | grep -v mpath
>>> control
>>> vg_data01a-lv_data01a
>>> vg_data01b-lv_data01b
>>> vg_data01c-lv_data01c
>>> vg_data01d-lv_data01d
>>> vg_data01e-lv_data01e
>>> vg_data01h-lv_data01h
>>> vg_data01i-lv_data01i
>>> VolGroup00-LogVol00
>>> VolGroup00-LogVol01
>>> VolGroup02-lv_data00
>>>
>>> [root@omadvnfs01b network-scripts]# clustat
>>> Cluster Status for omadvnfs01 @ Sun Apr 17 15:44:52 2011
>>> Member Status: Quorate
>>>
>>> Member Name                 ID   Status
>>> ------ ----                 ---- ------
>>> omadvnfs01a.sec.jel.lc      1    Online, rgmanager
>>> omadvnfs01b.sec.jel.lc      2    Online, Local, rgmanager
>>>
>>> Service Name               Owner (Last)              State
>>> ------- ----               ----- ------              -----
>>> service:omadvnfs01-nfs-a   omadvnfs01b.sec.jel.lc    started
>>> service:omadvnfs01-nfs-b   omadvnfs01b.sec.jel.lc    started
>>> service:omadvnfs01-nfs-c   omadvnfs01b.sec.jel.lc    started
>>> service:omadvnfs01-nfs-h   omadvnfs01b.sec.jel.lc    started
>>> service:omadvnfs01-nfs-i   omadvnfs01b.sec.jel.lc    started
>>> service:postgresql         omadvnfs01b.sec.jel.lc    started
>>>
>>> [root@omadvnfs01a ~]# cman_tool nodes
>>> Node  Sts   Inc   Joined               Name
>>>    1   M   1892   2011-04-17 15:34:24  omadvnfs01a.sec.jel.lc
>>>    2   M   1896   2011-04-17 15:34:24  omadvnfs01b.sec.jel.lc
>>>

Ok, started all the CMAN elements manually as you suggested. I started
them in order, as in the init script.
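For the record, roughly this (a sketch going from the el5 cman init
script, not a verbatim transcript of my session; each -D daemon stays in
the foreground printing debug, so each one needs its own terminal):

# stop the stack the init scripts started (rgmanager and clvmd
# sit on top of cman, so they come down first)
service rgmanager stop
service clvmd stop
service cman stop

# then bring cman up by hand, in init-script order
ccsd
cman_tool join -w
groupd -D
fenced -D
dlm_controld -D
gfs_controld -D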
Here's the only error that I see. I can post the other debug messages
if you think they'd be useful, but this is the only one that stuck out
to me.

[root@omadvnfs01a ~]# /sbin/dlm_controld -D
1303134840 /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
1303134840 /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2
1303134840 set_ccs_options 480
1303134840 cman: node 2 added
1303134840 set_configfs_node 2 10.198.1.111 local 0
1303134840 cman: node 3 added
1303134840 set_configfs_node 3 10.198.1.110 local 1
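In case it helps, this is how I'm double-checking what dlm_controld
actually wrote into the configfs tree it complains about above (a rough
sketch: the nodeid/local files are the attributes the set_configfs_node
lines manipulate, and exact contents will differ per cluster):

# each cluster node should have a comms entry, and exactly one
# of them should report local=1 for this node
ls /sys/kernel/config/dlm/cluster/comms/
for c in /sys/kernel/config/dlm/cluster/comms/*; do
    echo "$c: nodeid=$(cat $c/nodeid) local=$(cat $c/local)"
done

# compare against cman's idea of the local node address
cman_tool status | grep -i address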