On Wed, Jun 15, 2011 at 11:20 PM, Alexander Graf <agraf@xxxxxxx> wrote:
>
> On 16.06.2011, at 07:59, Linus Torvalds wrote:
>>
>> r26 has the value 0xc00090026236bbb0, and that "90" byte in the middle
>> there looks bogus. It's not a valid pointer any more, but if that "9"
>> had been a zero, it would have been.
>
> Please see my reply to Ben here.

Your reply to Ben seems to say that 0xc00000026236bbb0 wouldn't have
been a valid address, because you don't have that much memory.

But that's clearly not true. All the other registers have valid
pointers in them, and the stack pointer (r1) is c000000262987cd0, for
example. And that stack is clearly valid - if the kernel stack pointer
was corrupted, you'd never have gotten as far as reporting the oops.

So you may have only 8GB of RAM in that machine, but if so, there's
some empty unmapped physical space. Because clearly your RAM is _not_
limited to being mapped to below 0xc000000200000000.

To recap: I'm pretty sure the memory corruption is just the "90" byte.
The rest of the pointer looks too much like a pointer to be otherwise.

Whether that's due to a two-bit error (unlikely) or a wild byte write
(or 16-bit write with zeroes) is hard to say. USUALLY when we have
wild pointer errors, the corruption is more than just a few bits, but
it could have been something that sets a few bits in software, and
just sets them using a stale pointer.

> Yup, so let's keep this documented for now. Actually, the more I think
> about it the more it looks like simple random memory corruption by
> someone else in the kernel - and that's basically impossible to track
> and will give completely different bugs next time around :(.

We've had several bugs found by the pattern of the corruption, so I
wouldn't say "impossible to track". Even if the next time ends up
being a completely different oops (because the corruption happened in
a totally different kind of data structure), it might be possible that
there's that same "90" byte pattern, for example.

But it needs more than one bug report to see what the pattern is.
Usually it takes a _lot_ more..

                       Linus
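
The byte-level reasoning in the message can be checked mechanically. The following is a minimal sketch in Python, not anything taken from the thread: the helper name `diff_bits` and the variable names are hypothetical, and the "expected" value is simply the r26 value with the "9" zeroed, as the message suggests. It XORs the two values to show which bits and which byte differ, and compares the stack pointer against the 0xc000000200000000 boundary mentioned above.

```python
def diff_bits(observed: int, expected: int) -> dict:
    """Report which bits and bytes differ between two 64-bit values."""
    xor = (observed ^ expected) & 0xFFFFFFFFFFFFFFFF
    # Bit 0 is the least significant bit; byte 0 is the least significant byte.
    bit_positions = [i for i in range(64) if (xor >> i) & 1]
    byte_indices = sorted({i // 8 for i in bit_positions})
    return {"xor": xor, "bits": bit_positions, "bytes": byte_indices}


# Values quoted in the message.
r26_observed = 0xC00090026236BBB0   # value reported in the oops
r26_expected = 0xC00000026236BBB0   # assumption: same value with the "9" zeroed
r1_stack = 0xC000000262987CD0       # stack pointer, known to be valid
boundary = 0xC000000200000000       # boundary named in the message

result = diff_bits(r26_observed, r26_expected)
print(hex(result["xor"]))    # 0x900000000000
print(result["bits"])        # [44, 47] -> exactly two bits differ
print(result["bytes"])       # [5]      -> both bits sit in a single byte

# The valid stack pointer lies above the boundary, and so would the
# "repaired" r26 value.
print(r1_stack > boundary)       # True
print(r26_expected > boundary)   # True
```

Two differing bits confined to one byte is consistent with either explanation in the message (a two-bit flip or a stray byte write); the sketch only confirms how small the damage is, it cannot tell the two apart.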