I tried GFS2 on a two-node cluster using GNBD:

cfs1 - gnbd exports an IDE partition; GFS2 is mounted directly on that IDE partition.
cfs5 - gnbd imports the IDE partition; GFS2 is mounted on top of the gnbd device.

I'm using:
- RHEL4 distro
- Linux kernel 2.6.20.1 (from kernel.org)
- cluster-2.00.00 (from tarball)
- udev-094
- openais-0.80.2

Everything seems to be working fine, but when I mounted GFS2 on the 2nd node on top of the gnbd device, I got the dlm-related tracebacks below. In addition, the dlm_recvd and dlm_sendd processes are spinning the CPU on both boxes. Note that the mount itself succeeded and I can use the filesystem from both nodes. I know GFS2 is new, but does anyone have a solution to this problem?

I should also mention that I see a bunch of udev daemon failure messages. I'm guessing those are due to running this on the old RHEL4 distribution, but I'm not sure whether they contributed to the spinlock problem reported here.

Ultimately I want to run a bonnie test on this configuration, but I don't want to do that until the basic sanity of this GFS2 setup is established.

thanks,
Sridhar
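For reference, the setup was roughly as follows (the export name and mount point below are placeholders and the exact flags are from memory, so treat this as a sketch of the configuration rather than the literal commands):

    # on cfs1: serve the IDE partition over GNBD and mount GFS2 on it directly
    gnbd_serv
    gnbd_export -d /dev/hda9 -e hda9
    mount -t gfs2 /dev/hda9 /mnt/gfs2

    # on cfs5: import the export from cfs1 and mount GFS2 on the gnbd device
    gnbd_import -i cfs1
    mount -t gfs2 /dev/gnbd/hda9 /mnt/gfs2

The filesystem itself was created with the lock_dlm protocol and the cluster:fsname pair that shows up in the logs below (ciscogfs2:hda9).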
cfs1:

Mar 7 17:45:53 cfs1 kernel: BUG: spinlock already unlocked on CPU#1, dlm_recoverd/11046
Mar 7 17:45:53 cfs1 kernel: lock: cc6b68e4, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
Mar 7 17:45:53 cfs1 kernel: [<c01d3012>] _raw_spin_unlock+0x29/0x6b
Mar 7 17:45:53 cfs1 kernel: [<e0969912>] dlm_lowcomms_get_buffer+0x6c/0xe7 [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e09661df>] create_rcom+0x2d/0xb3 [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e09663b0>] dlm_rcom_status+0x5a/0x10b [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e096593e>] make_member_array+0x84/0x14c [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e0965a3d>] ping_members+0x37/0x6e [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e0966e01>] dlm_set_recover_status+0x14/0x24 [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e0965bd8>] dlm_recover_members+0x164/0x1a1 [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e0967b4e>] ls_recover+0x67/0x2c6 [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e0967e0a>] do_ls_recovery+0x5d/0x75 [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e0967e22>] dlm_recoverd+0x0/0x74 [dlm]
Mar 7 17:45:53 cfs1 kernel: [<e0967e7d>] dlm_recoverd+0x5b/0x74 [dlm]
Mar 7 17:45:53 cfs1 kernel: [<c012e4aa>] kthread+0x72/0x96
Mar 7 17:45:53 cfs1 kernel: [<c012e438>] kthread+0x0/0x96
Mar 7 17:45:53 cfs1 kernel: [<c010405f>] kernel_thread_helper+0x7/0x10

cfs5:

Mar 7 17:43:28 cfs5 gnbd_monitor[10552]: gnbd_monitor started. Monitoring device #0
Mar 7 17:43:28 cfs5 gnbd_recvd[10555]: gnbd_recvd started
Mar 7 17:43:28 cfs5 kernel: resending requests
Mar 7 17:43:33 cfs5 udevsend[10560]: starting udevd daemon
Mar 7 17:43:35 cfs5 udevsend[10560]: unable to connect to event daemon, try to call udev directly
Mar 7 17:45:51 cfs5 kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "ciscogfs2:hda9"
Mar 7 17:45:53 cfs5 udevsend[10598]: starting udevd daemon
Mar 7 17:45:53 cfs5 udevsend[10599]: starting udevd daemon
Mar 7 17:45:53 cfs5 udevsend[10610]: starting udevd daemon
Mar 7 17:45:53 cfs5 kernel: dlm: got connection from 1
Mar 7 17:45:53 cfs5 kernel: BUG: spinlock already unlocked on CPU#0, dlm_recvd/10593
Mar 7 17:45:53 cfs5 kernel: lock: c8d467e4, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
Mar 7 17:45:53 cfs5 kernel: [<c01d62d2>] _raw_spin_unlock+0x29/0x6b
Mar 7 17:45:53 cfs5 kernel: [<e0c080a2>] dlm_lowcomms_get_buffer+0x6c/0xe7 [dlm]
Mar 7 17:45:53 cfs5 kernel: [<e0c041f3>] create_rcom+0x2d/0xb3 [dlm]
Mar 7 17:45:53 cfs5 kernel: [<e0c044a4>] receive_rcom_status+0x2f/0x74 [dlm]
Mar 7 17:45:53 cfs5 kernel: [<e0c02dd6>] dlm_find_lockspace_global+0x3c/0x41 [dlm]
Mar 7 17:45:53 cfs5 kernel: [<e0c04bca>] dlm_receive_rcom+0xc1/0x17f [dlm]
Mar 7 17:45:53 cfs5 udevsend[10617]: starting udevd daemon
Mar 7 17:45:54 cfs5 udevsend[10626]: starting udevd daemon
Mar 7 17:45:54 cfs5 udevsend[10628]: starting udevd daemon
Mar 7 17:45:55 cfs5 udevsend[10598]: unable to connect to event daemon, try to call udev directly
Mar 7 17:45:55 cfs5 udevsend[10599]: unable to connect to event daemon, try to call udev directly
Mar 7 17:45:55 cfs5 udevsend[10610]: unable to connect to event daemon, try to call udev directly
Mar 7 17:45:55 cfs5 kernel: [<e0c04157>] dlm_process_incoming_buffer+0x148/0x1ad [dlm]
Mar 7 17:45:59 cfs5 udevsend[10617]: unable to connect to event daemon, try to call udev directly
Mar 7 17:46:00 cfs5 udevsend[10626]: unable to connect to event daemon, try to call udev directly
Mar 7 17:46:02 cfs5 udevsend[10628]: unable to connect to event daemon, try to call udev directly
Mar 7 17:46:05 cfs5 kernel: [<c012e862>] autoremove_wake_function+0x0/0x33
Mar 7 17:46:09 cfs5 kernel: [<c0146e7d>] __alloc_pages+0x61/0x2ad
Mar 7 17:46:10 cfs5 kernel: [<e0c0790a>] receive_from_sock+0x178/0x246 [dlm]
Mar 7 17:46:10 cfs5 kernel: [<e0c08470>] process_sockets+0x55/0x90 [dlm]
Mar 7 17:46:11 cfs5 kernel: [<e0c085c6>] dlm_recvd+0x0/0x69 [dlm]
Mar 7 17:46:11 cfs5 kernel: [<e0c08620>] dlm_recvd+0x5a/0x69 [dlm]
Mar 7 17:46:12 cfs5 kernel: [<c012e51a>] kthread+0x72/0x96
Mar 7 17:46:12 cfs5 kernel: [<c012e4a8>] kthread+0x0/0x96
Mar 7 17:46:13 cfs5 kernel: [<c010405f>] kernel_thread_helper+0x7/0x10
Mar 7 17:46:14 cfs5 kernel: =======================
Mar 7 17:46:14 cfs5 kernel: dlm: hda9: recover 1
Mar 7 17:46:15 cfs5 kernel: dlm: hda9: add member 1
Mar 7 17:46:15 cfs5 kernel: dlm: hda9: add member 2
Mar 7 17:46:16 cfs5 kernel: dlm: hda9: total members 2 error 0
Mar 7 17:46:17 cfs5 kernel: dlm: hda9: dlm_recover_directory
Mar 7 17:46:18 cfs5 kernel: dlm: hda9: dlm_recover_directory 12 entries
Mar 7 17:46:19 cfs5 kernel: GFS2: fsid=ciscogfs2:hda9.1: Joined cluster. Now mounting FS...
Mar 7 17:46:20 cfs5 kernel: dlm: hda9: recover 1 done: 348 ms
Mar 7 17:46:21 cfs5 kernel: GFS2: fsid=ciscogfs2:hda9.1: jid=1, already locked for use
Mar 7 17:46:22 cfs5 kernel: GFS2: fsid=ciscogfs2:hda9.1: jid=1: Looking at journal...
Mar 7 17:46:23 cfs5 kernel: GFS2: fsid=ciscogfs2:hda9.1: jid=1: Done

> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx
> [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Dan Merillat
> Sent: Wednesday, January 24, 2007 9:08 PM
> To: linux-kernel@xxxxxxxxxxxxxxx
> Cc: linux-cluster@xxxxxxxxxx
> Subject: 2.6.20-rc4 gfs2 bug
>
> Running 2.6.20-rc4 _WITH_ the following patch: (Shouldn't be the issue,
> but just in case, I'm listing it here)

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster