On Thu, 2009-08-06 at 19:03 +0530, Sachin Sant wrote:
> Benjamin Herrenschmidt wrote:
> > Thanks. Since it's a memory corruption (or seems to be) however, it's
> > possible that the bisection will mislead you. I.e. the culprit could be
> > somewhere else, and the commit you'll find via bisection just happens to
> > move things around in the kernel in such a way that the corruption hits
> > that code path instead of another rarely used one.
> >
> > I would suggest using printk to print out the content of memory where
> > the code appears to have been smashed at different stages during boot
> > (maybe even in the initcalls loop in init/main.c) to try to point out
> > what appears to be causing the corruption.
> >
> By the time the machine is up and running, the particular memory location
> in question is already overwritten. So it seems the corruption occurs
> during boot.
>
> I added a few printks in the initcall debug code path. The output suggests
> that by the time the first initcall debug message is printed the code is
> already corrupted. Further debugging suggests that when start_kernel() is
> called, the code at address 0xc000000000600000 is already corrupted.
> About 28 bytes of code starting from that address are overwritten.
>
> I will try to add a few more debug statements to find the place where
> this corruption might be happening.

Is it always the exact same pattern at the exact same address? Or does
it change, and if so, how?

cheers
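
One way to answer the "same pattern, same address?" question is to compare the smashed region against a known-good copy of the same bytes and log the offset, length and contents of the damage on each boot. Below is a minimal userspace sketch of that comparison, not code from this thread. The names find_corruption(), dump_span() and struct corrupt_span are hypothetical, and it assumes a reference copy of the region is available (for example, extracted from the kernel image).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper type (not a kernel API): describes the damaged span. */
struct corrupt_span {
    size_t first;   /* offset of the first differing byte */
    size_t len;     /* bytes from first to last differing byte, inclusive */
};

/*
 * Compare a region against a known-good reference copy.
 * Returns 1 and fills *span if the buffers differ, 0 if they are identical.
 */
static int find_corruption(const unsigned char *ref, const unsigned char *cur,
                           size_t n, struct corrupt_span *span)
{
    size_t i, first = n, last = 0;

    for (i = 0; i < n; i++) {
        if (ref[i] != cur[i]) {
            if (first == n)
                first = i;      /* remember the first mismatch */
            last = i;           /* keep extending to the last mismatch */
        }
    }
    if (first == n)
        return 0;               /* no mismatch found */

    span->first = first;
    span->len = last - first + 1;
    return 1;
}

/* Print the damaged bytes in hex so dumps from different boots can be diffed. */
static void dump_span(const unsigned char *cur, const struct corrupt_span *span)
{
    size_t i;

    printf("corrupt: offset %zu, %zu bytes:", span->first, span->len);
    for (i = 0; i < span->len; i++)
        printf(" %02x", cur[span->first + i]);
    printf("\n");
}
```

In the kernel the printf() calls would become printk(), and the check could be called from several points in early boot to narrow down when the bytes first change. If the dumps from different boots are identical, the pattern itself may hint at who wrote it.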