On Mon, 2004-08-09 at 08:12, Michael Conrad Tadpol Tilstra wrote:
> On Fri, Aug 06, 2004 at 04:31:55PM -0700, micah nerren wrote:
> > So it appears to be specifically related to lock_gulm.
>
> hrms, so no pushing this off onto someone else. oh well. ;)
>
> > Anything else I should try?
>
> well, it still pretty much looks like a stack overflow. And looking at
> the calling tree, there is not much left to take out of the stacks. So
> I guess we'll have to try making the stack shorter.
>
> So, another patch. This still works on my intels, give it a go and
> lets see how it does on your opterons.
>
> > I really appreciate all your help in debugging this!
>
> np.

I tried the patch, it still crashes with the same oops.

However, I tried something I hadn't tried before which may shed some
light on this. I rebooted the system into UP mode, loaded the UP
modules, and did the mount of the file system. This time, no oops. It
still doesn't work, but the machine lives. The mount process simply
hangs. When I go to another terminal and kill the mount process, this
appears in the syslog:

lock_gulm: ERROR cm_login failed. -512
lock_gulm: ERROR Got a -512 trying to start the threads.
lock_gulm: fsid=hopkins:gfs01: Exiting gulm_mount with errors -512
GFS: can't mount proto = lock_gulm, table = hopkins:gfs01, hostdata =

So, does that shed some light onto things? Something specific to SMP and
lock_gulm. It still doesn't work in UP mode, but it does not oops.