In my last several attempts, I have not been able to get my tests to run for more than 1 day. This time my test hung during mount in kcl_join_service().

My test mounts and unmounts several times in each test run. This time it hung on the 22nd test run. It looks like it was starting a 3-node test, where a gfs file system is mounted on all 3 nodes and the test then does a umount/mount on 1 node at a time. So this should have done a umount on cl031 and then hung on the mount on cl031, with cl030 and cl032 still having the gfs file system mounted.

The mount stack trace is:

mount         D C170EF9C     0  1557      1         12111  3932 (NOTLB)
f09b9c30 00000086 f1f4c580 c170ef9c 00002ca3 c2c39ae0 00000008 00000000
       e947e548 7b01b78b 00002ca3 f09b9c10 f1f4c580 00000000 c170f8c0 c170ef60
       00000000 00003ba8 7b01b9ea 00002ca3 c2c39ae0 c2c39c4c 00000000 00002ca3
Call Trace:
 [<c03ce814>] wait_for_completion+0xa4/0xe0
 [<f8ab6164>] kcl_join_service+0x154/0x180 [cman]
 [<f8890fff>] init_mountgroup+0x6f/0xc0 [lock_dlm]
 [<f88934b1>] lm_dlm_mount+0xa1/0xf0 [lock_dlm]
 [<f8812300>] lm_mount+0x140/0x230 [lock_harness]
 [<f9017f4d>] gfs_lm_mount+0x1fd/0x390 [gfs]
 [<f9024276>] fill_super+0x596/0x14c0 [gfs]
 [<f902533f>] gfs_get_sb+0x15f/0x1b0 [gfs]
 [<c0166ae8>] do_kern_mount+0x58/0xe0
 [<c017ce08>] do_new_mount+0x98/0xe0
 [<c017d4b5>] do_mount+0x165/0x1b0
 [<c017d8c7>] sys_mount+0x97/0x100
 [<c010323d>] sysenter_past_esp+0x52/0x75

A bunch of info is available here:

http://developer.osdl.org/daniel/GFS/test.11feb2005/

The bad news is that dumping a stack trace to a serial console causes nodes to be kicked out of the cluster, so some of the info was captured while nodes were being kicked out.

Any ideas on how to figure this out?

Daniel