On Wed, Jul 08, 2009 at 10:07:50AM -0700, David Daney wrote:

> The resume() implementation octeon_switch.S examines the saved
> cp0_status register. We were clobbering the entire pt_regs structure
> in kernel threads leading to random crashes.
>
> When switching away from a kernel thread, the saved cp0_status is
> examined and if bit 30 is set it is cleared and the CP2 state saved
> into the pt_regs structure. Since the kernel thread stack overlaid
> the pt_regs structure this resulted in a corrupt stack. When the
> kthread with the corrupt stack was resumed, it could crash if it used
> any of the data in the stack that was clobbered.
>
> We fix it by moving the kernel thread stack down so it doesn't overlay
> pt_regs.
>
> Differences from v1: Don't adjust the sp by an additional 32 bytes, it
> was not needed. Also fix up __KSTK_TOS and
> task_pt_regs.

Thanks for fixing and testing the issues I raised on IRC. Next I'm
wondering what impact the uninitialized state of the stack frame we
allocate may have. I think we're ok - but I need to stare at this for
a few more minutes.

  Ralf