Hi,

I have a 2-node cluster and am trying to get GFS2 working on top of an iSCSI volume. Each node is a Xen virtual machine. I am currently unable to get clvmd working on the 2nd node.

clvmd starts fine on the 1st node:

[root@vm1 ~]# service clvmd start
Starting clvmd: [  OK  ]
Activating VGs:
  Logging initialised at Wed Mar 2 15:25:07 2011
  Set umask to 0077
  Finding all volume groups
  Finding volume group "PcbiHomesVG"
  Activated 1 logical volumes in volume group PcbiHomesVG
  1 logical volume(s) in volume group "PcbiHomesVG" now active
  Finding volume group "VolGroup00"
  2 logical volume(s) in volume group "VolGroup00" already active
  2 existing logical volume(s) in volume group "VolGroup00" monitored
  Activated 2 logical volumes in volume group VolGroup00
  2 logical volume(s) in volume group "VolGroup00" now active
  Wiping internal VG cache

[root@vm1 ~]# vgs
  Logging initialised at Wed Mar 2 15:25:12 2011
  Set umask to 0077
  Finding all volume groups
  Finding volume group "PcbiHomesVG"
  Finding volume group "VolGroup00"
  VG          #PV #LV #SN Attr   VSize VFree
  PcbiHomesVG   1   1   0 wz--nc 1.17T    0
  VolGroup00    1   2   0 wz--n- 4.66G    0
  Wiping internal VG cache

But when I try to start clvmd on the 2nd node, it hangs:

[root@vm2 ~]# service clvmd start
Starting clvmd: [  OK  ]
...hangs...

I see the following in vm2:/var/log/messages:

Mar 2 15:59:02 vm2 clvmd[2283]: Cluster LVM daemon started - connected to CMAN
Mar 2 16:01:36 vm2 kernel: INFO: task clvmd:2302 blocked for more than 120 seconds.
Mar 2 16:01:36 vm2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 2 16:01:36 vm2 kernel: clvmd D 0022a86125f49a6a 0 2302 1 2299 (NOTLB)
Mar 2 16:01:36 vm2 kernel: ffff880030cb7db8 0000000000000282 0000000000000000 0000000000000000
Mar 2 16:01:36 vm2 kernel: 0000000000000008 ffff880033e327e0 ffff880000033080 000000000001c2b2
Mar 2 16:01:36 vm2 kernel: ffff880033e329c8 ffffffff8029c48f
Mar 2 16:01:36 vm2 kernel: Call Trace:
Mar 2 16:01:36 vm2 kernel: [<ffffffff8029c48f>] autoremove_wake_function+0x0/0x2e
Mar 2 16:01:36 vm2 kernel: [<ffffffff802644cb>] __down_read+0x82/0x9a
Mar 2 16:01:36 vm2 kernel: [<ffffffff884f646d>] :dlm:dlm_user_request+0x2d/0x174
Mar 2 16:01:36 vm2 kernel: [<ffffffff8022d08d>] mntput_no_expire+0x19/0x89
Mar 2 16:01:36 vm2 kernel: [<ffffffff8041716d>] sys_sendto+0x14a/0x164
Mar 2 16:01:36 vm2 kernel: [<ffffffff884fd61f>] :dlm:device_write+0x2f5/0x5e5
Mar 2 16:01:36 vm2 kernel: [<ffffffff80217379>] vfs_write+0xce/0x174
Mar 2 16:01:36 vm2 kernel: [<ffffffff80217bb1>] sys_write+0x45/0x6e
Mar 2 16:01:36 vm2 kernel: [<ffffffff802602f9>] tracesys+0xab/0xb6
[...]

I also noticed that the clvmd init script is stuck waiting on the "vgscan" process it spawned:

   1  1655  1655  1655 ?           -1 Ss       0   0:00 /usr/sbin/sshd
1655  1801  1801  1801 ?           -1 Ss       0   0:00  \_ sshd: root@pts/0
1801  1803  1803  1803 pts/0     2187 Ss       0   0:00  |   \_ -bash
1803  2187  2187  1803 pts/0     2187 S+       0   0:00  |       \_ /bin/sh /sbin/service clvmd start
2187  2192  2187  1803 pts/0     2187 S+       0   0:00  |           \_ /bin/bash /etc/init.d/clvmd start
2192  2215  2187  1803 pts/0     2187 S+       0   0:00  |               \_ /usr/sbin/vgscan

Before starting clvmd, cman is started and both nodes are cluster members:

[root@vm1 ~]# cman_tool nodes
Node  Sts   Inc     Joined               Name
   1   M    544456  2011-03-02 15:24:31  172.16.50.32
   2   M    544468  2011-03-02 15:52:29  172.16.50.33

Note that I'm using manual fencing in this configuration.
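For reference, the fencing section of /etc/cluster/cluster.conf follows the standard fence_manual pattern. The sketch below is illustrative rather than a verbatim paste (cluster name and config_version are placeholders; the node names are the addresses cman_tool reports):

<?xml version="1.0"?>
<cluster name="pcbicluster" config_version="1">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="172.16.50.32" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <!-- manual fencing: an operator must run fence_ack_manual after a node failure -->
          <device name="human" nodename="172.16.50.32"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="172.16.50.33" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="human" nodename="172.16.50.33"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_manual" name="human"/>
  </fencedevices>
</cluster>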
Both nodes are running CentOS 5.5:

# uname -a
Linux vm2.pcbi.upenn.edu 2.6.18-194.32.1.el5xen #1 SMP Wed Jan 5 18:44:24 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

The following package versions are installed on each node:

cman-2.0.115-34.el5_5.4
cman-devel-2.0.115-34.el5_5.4
gfs2-utils-0.1.62-20.el5
lvm2-2.02.56-8.el5_5.6
lvm2-cluster-2.02.56-7.el5_5.4
rgmanager-2.0.52-6.el5.centos.8
system-config-cluster-1.0.57-3.el5_5.1

iptables is turned off on each node.

Does anyone know why clvmd hangs on the 2nd node?

Best,
--
Valeriu Mutu
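P.S. For completeness: clvmd needs clustered locking enabled in /etc/lvm/lvm.conf on both nodes (the lvmconf --enable-cluster script from lvm2-cluster sets this). The relevant setting, sketched from memory rather than pasted from my actual config, looks like:

global {
    # 3 = built-in clustered locking via clvmd (set by "lvmconf --enable-cluster")
    locking_type = 3
}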