Hi,

On Wed, Apr 18, 2007 at 04:04:13PM +0100, Patrick Caulfield wrote:
> Jens Beyer wrote:
> >
> > I am using a vanilla 2.6.20.6 (same with 2.6.20.x).
> >
>
> Hmm, I'm not sure how that got left unfixed upstream
>
> Here's the patch:
>

The patch did fix one spinlock BUG; now I get another one:

[ 315.040936] BUG: spinlock already unlocked on CPU#1, dlm_recvd/14593
[ 315.040949] lock: ee108f64, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
[ 315.040964] [<c01d62ac>] _raw_spin_unlock+0x70/0x72
[ 315.040976] [<f0b63f09>] dlm_lowcomms_commit_buffer+0x2f/0x9a [dlm]
[ 315.040998] [<f0b5fb67>] send_rcom+0xa/0x12 [dlm]
...

This seems to be fixed in 2.6.21-rc6, from which I took:

--- fs/dlm/lowcomms-tcp.c.orig 2007-04-19 09:42:53.000000000 +0200
+++ fs/dlm/lowcomms-tcp.c 2007-04-19 09:43:23.000000000 +0200
@@ -748,6 +748,7 @@
 	struct connection *con = e->con;
 	int users;
 
+	spin_lock(&con->writequeue_lock);
 	users = --e->users;
 	if (users)
 		goto out;

But now it hangs during mount:

boxfe01:/home/jbe # mount -t gfs2 -v /dev/sdd1 /export/vol1
/sbin/mount.gfs2: mount /dev/sdd1 /export/vol1
/sbin/mount.gfs2: parse_opts: opts = "rw"
/sbin/mount.gfs2: clear flag 1 for "rw", flags = 0
/sbin/mount.gfs2: parse_opts: flags = 0
/sbin/mount.gfs2: parse_opts: extra = ""
/sbin/mount.gfs2: parse_opts: hostdata = ""
/sbin/mount.gfs2: parse_opts: lockproto = ""
/sbin/mount.gfs2: parse_opts: locktable = ""
/sbin/mount.gfs2: message to gfs_controld: asking to join mountgroup:
/sbin/mount.gfs2: write "join /export/vol1 gfs2 lock_dlm boxfe:clustervol1 rw /dev/sdd1"
/sbin/mount.gfs2: message from gfs_controld: response to join request:
/sbin/mount.gfs2: lock_dlm_join: read "0"
/sbin/mount.gfs2: message from gfs_controld: mount options:
/sbin/mount.gfs2: lock_dlm_join: read "hostdata=jid=1:id=262146:first=0"
/sbin/mount.gfs2: lock_dlm_join: hostdata: "hostdata=jid=1:id=262146:first=0"
/sbin/mount.gfs2: lock_dlm_join: extra_plus: "hostdata=jid=1:id=262146:first=0"

boxfe01:/home/jbe # dmesg | tail -15
[ 137.276428] GFS2 (built Apr 19 2007 09:15:21) installed
[ 137.285199] Lock_DLM (built Apr 19 2007 09:15:33) installed
[ 149.628806] drbd1: role( Secondary -> Primary )
[ 149.628827] drbd1: Writing meta data super block now.
[ 156.324500] GFS2: fsid=: Trying to join cluster "lock_dlm", "boxfe:clustervol1"
[ 156.397920] dlm: got connection from 2
[ 156.399738] dlm: clustervol1: recover 1
[ 156.399792] dlm: clustervol1: add member 2
[ 156.399796] dlm: clustervol1: add member 1
[ 156.400514] dlm: clustervol1: config mismatch: 32,0 nodeid 2: 11,0
[ 156.400519] dlm: clustervol1: ping_members aborted -22 last nodeid 2
[ 156.400523] dlm: clustervol1: total members 2 error -22
[ 156.400526] dlm: clustervol1: recover_members failed -22
[ 156.400529] dlm: clustervol1: recover 1 error -22
[ 156.404760] GFS2: fsid=boxfe:clustervol1.1: Joined cluster. Now mounting FS...

Regards,
Jens

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster