On Mon, Mar 08, 2010 at 03:49:01PM +0100, Andrea Arcangeli wrote:
> On Mon, Mar 08, 2010 at 03:32:19PM +0200, Avi Kivity wrote:
> > It looks unrelated to kvm, though of course random memory corruption
> > cannot be ruled out.
> >
> > Is npt enabled on the host (cat /sys/module/kvm_amd/parameters/npt)?
> >
> > Andrea, any idea?
>
> Basically find_vma(vma->vm_mm, vma->vm_start) doesn't return "vma"
> despite "vma" is the one with the smaller vm_end where the comparison
> "vma->vm_start < vma->vm_end" is true (the next vma is null and the
> prev will have vma->vm_start == prev->vm_end, not <).
>
> The bug check looks right, it doesn't seem false positive and this
> bugcheck indicates that the vma rbtree is memory-corrupted somehow.
>
> so yes fiddling with npt on and off sounds a good start, if it's a bug

I can confirm it happens with npt both on and off. It also happens on a
Nehalem Xeon (it just happened).

> in shadow paging it's unlikely the exact same bug materializes with
> both npt and without. If the crash happens with npt on and off, then
> maybe it's not hypervisor related. Could also be bad RAM if it only

I doubt it is bad RAM! This machine has been working (without KVM) for
almost 2 years, and MCE does not report any problems on the host machine.
It also happens on two identical machines (Opteron) and now on the new
(5 days old) Intel Nehalem Xeon. All guests are running the same kernel.
It happens with a kernel I compiled myself and with the one from Debian
sid (both 2.6.32.9), and with the previous kernels I tried (2.6.31.12 and
2.6.27.45).

> happens on a single host and all other hosts are fine with same binary
> guest/host kernels (rbtree walk might stress the memory bus more than
> other operations). Said that vm_next being null (and if it's null,
> likely vm_next pointer has no ram bitflip) is a bit weird and not
> common scenario and this page fault seems triggered with procfs
> copy_user call which is non standard, so maybe this is a guest bug. It
> would be interesting to know what is the vm_start address, at the end
> there are stack, vdso and vsyscall areas.

I'll make it print vm_start on the next reboot.

-- 
Bruno Ribas - ribas@xxxxxxxxxxxx
http://www.inf.ufpr.br/ribas
C3SL: http://www.c3sl.ufpr.br
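
For reference, the invariant behind the bug check quoted above can be
modelled in userspace. This is only a sketch, not the kernel code: it
assumes a sorted, non-overlapping array in place of the kernel's rbtree,
and the names struct vma, model_find_vma() and vma_lookup_is_consistent()
are hypothetical. The one property carried over from the kernel is that
find_vma() returns the first area whose vm_end is above the address, or
NULL if there is none.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct vm_area_struct (assumption). */
struct vma {
    unsigned long vm_start;  /* first address inside the area */
    unsigned long vm_end;    /* first address past the area */
};

/*
 * Model of find_vma(): return the first area with addr < vm_end, or NULL.
 * Assumes areas[] is sorted by address and non-overlapping. The kernel
 * walks an rbtree instead; a linear scan keeps the sketch short.
 */
static struct vma *model_find_vma(struct vma *areas, size_t n,
                                  unsigned long addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr < areas[i].vm_end)
            return &areas[i];
    return NULL;
}

/*
 * The invariant under discussion: looking up an area's own vm_start must
 * return that same area. Returns 1 if it holds, 0 otherwise.
 */
static int vma_lookup_is_consistent(struct vma *areas, size_t n,
                                    struct vma *vma)
{
    assert(vma->vm_start < vma->vm_end);
    return model_find_vma(areas, n, vma->vm_start) == vma;
}
```

With an intact area list the check holds even when the previous area ends
exactly where this one starts (prev->vm_end == vma->vm_start), because the
comparison is strict. It fails once a neighbouring vm_end is corrupted so
that it reaches into the area, which is one way a damaged tree could
produce the mismatch described above.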