RE: post-copy is broken?

> -----Original Message-----
> From: Andrea Arcangeli [mailto:aarcange@xxxxxxxxxx]
> Sent: Wednesday, April 27, 2016 10:48 PM
> To: Li, Liang Z
> Cc: Dr. David Alan Gilbert; Kirill A. Shutemov; kirill.shutemov@xxxxxxxxxxxxxxx;
> Amit Shah; qemu-devel@xxxxxxxxxx; quintela@xxxxxxxxxx; linux-mm@xxxxxxxxx
> Subject: Re: post-copy is broken?
> 
> Hello Liang,
> 
> On Mon, Apr 18, 2016 at 10:33:14AM +0000, Li, Liang Z wrote:
> > If THP is disabled, there are no failures.
> > And your test always passed, even when the real post-copy failed.
> >
> > In my env, the output of
> > 'cat /sys/kernel/mm/transparent_hugepage/enabled'  is:
> >
> >  [always] ...
> >
> 
> Can you test the fix?
> https://marc.info/?l=linux-mm&m=146175869123580&w=2
> 
> This was not a breakage in userfaultfd nor in postcopy. userfaultfd had no
> bugs and is fully rock solid and with zero chances of generating undetected
> memory corruption like it was happening in v4.5.
> 
> As I suspected, the same problem would have happened with any THP
> pmd_trans_huge split (swapping/inflating-balloon etc..). Postcopy just
> makes it easier to reproduce the problem because it does a scattered
> MADV_DONTNEED on the destination qemu guest memory for the pages
> redirtied during the last precopy pass that ran, or not transferred (to allow
> THP faults in destination qemu during precopy), just before starting the
> guest in the destination node.
> 
> Other reports of KVM memory corruption happening on v4.5 with THP
> enabled will also be taken care of by the above fix.
> 
> I hope I managed to fix this in time for v4.6 final (current is v4.6-rc5-69), so
> the only kernel where KVM must not be used with THP enabled will be v4.5.
> 
> On a side note, this MADV_DONTNEED trigger reminded me that as soon as the
> madvisev syscall is merged, loadvm_postcopy_ram_handle_discard should
> start using it to reduce the kernel enter/exit to just 1 (or a few madvisev
> calls in case we want to give a limit to the temporary buffer to avoid the
> risk of allocating too much temporary RAM for very large guests) to do the
> MADV_DONTNEED scattered zapping. Same thing in
> virtio_balloon_handle_output.
> 
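
For readers following the side note above, here is a minimal sketch of the
pattern it describes: scattered zapping done with one madvise(MADV_DONTNEED)
call, and so one kernel entry, per range. The helper name discard_scattered
and the use of struct iovec to describe the ranges are assumptions made for
illustration only; this is not the code in loadvm_postcopy_ram_handle_discard
or virtio_balloon_handle_output, and it does not show the proposed madvisev
interface.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): zap each range with its own
 * madvise(MADV_DONTNEED) call, i.e. one kernel entry per range.
 * Returns the number of madvise calls made, or -1 on the first failure.
 */
static long discard_scattered(const struct iovec *ranges, size_t count)
{
    long page = sysconf(_SC_PAGESIZE);
    long calls = 0;

    for (size_t i = 0; i < count; i++) {
        /* madvise requires a page-aligned start address. */
        assert(((uintptr_t)ranges[i].iov_base % (uintptr_t)page) == 0);

        if (madvise(ranges[i].iov_base, ranges[i].iov_len,
                    MADV_DONTNEED) != 0) {
            return -1;
        }
        calls++;
    }
    return calls;
}
```

With N scattered ranges this costs N syscalls, which is the overhead a
batched call would be meant to collapse into one or a few.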

I have tested the patch. The guest doesn't crash anymore, so I think the issue is fixed. Thanks!

Liang
> Thanks,
> Andrea
