On Thu, May 12, 2016 at 10:20:09PM +0200, Marc Haber wrote:
> Hi David,
>
> On Sat, Apr 23, 2016 at 07:52:46PM +0100, Dr. David Alan Gilbert wrote:
> > Hmm, your problem does sound like bad hardware, but....
> > If you've got a nice reliable crash, can you try turning transparent huge pages
> > off on the host;
> >    echo never > /sys/kernel/mm/transparent_hugepage/enabled
>
> I must have missed this hint in the middle of the "your hardware is
> bad" avalanche that came over me.
>
> I spent two weeks bisecting "good" kernels since during the repeated
> reconfigurations, transparent huge pages got turned off in kernel
> configuration. After running each kernel for 24 hours, I eventually
> ended up with a working 4.5 kernel. The configuration diff was short,
> showing transparent huge pages, and - finally - upon re-reading the
> thread I found your hint.
>
> I have now the result that 4.5, 4.5.1 and 4.5.4 corrupt KVM guest
> memory reliably in the first hour of running under disk load, causing
> the VM to either drop dead in the water, or to read randomness from
> disk. Rebooting fixes the VM. This happens as soon as transparent huge
> pages are turned on in the host.
>
> Turning off transparent huge pages by echo never >
> /sys/kernel/mm/transparent_hugepage/enabled fixes the issue even
> without rebooting the host. Start up the VM again and it works just
> fine.
>
> Is this an issue in (a) transparent huge pages, (b) KVM or (c) qemu?
> Where should this issue be forwarded? Or do we just accept it and turn
> transparent huge pages off?

Could you test this:

http://lkml.kernel.org/r/1463070742-18401-1-git-send-email-aarcange@xxxxxxxxxx

?

-- 
 Kirill A. Shutemov
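
For anyone reproducing this, it may help to record which transparent huge page mode is active on the host before and after each test run. The snippet below is only a minimal sketch: the sysfs path is the one quoted above, while `thp_active_mode` is a hypothetical helper name introduced here. It assumes the kernel's usual format for that file, where the active mode is shown in square brackets (for example `always madvise [never]`).

```shell
#!/bin/sh
# Sysfs path taken from the quoted message.
THP_ENABLED=/sys/kernel/mm/transparent_hugepage/enabled

# Hypothetical helper: read the sysfs line on stdin and print only the
# bracketed (currently active) mode, e.g. "never".
thp_active_mode() {
    sed -n 's/.*\[\([a-z_]*\)].*/\1/p'
}

if [ -r "$THP_ENABLED" ]; then
    echo "THP mode: $(thp_active_mode < "$THP_ENABLED")"
else
    # The file is absent when the kernel was built without THP support.
    echo "THP sysfs file not readable: $THP_ENABLED"
fi
```

Changing the mode itself still needs root and the `echo never > ...` command from the quoted message; this sketch only reads the current state.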