On Mon, Jul 31, 2017 at 10:37:24AM +0200, Michal Hocko wrote:
> On Mon 31-07-17 16:23:26, ZhenweiPi wrote:
> > On 07/31/2017 03:51 PM, Michal Hocko wrote:
> >
> > >On Mon 31-07-17 15:41:49, Wei Wang wrote:
> > >>>On 07/31/2017 02:55 PM, Michal Hocko wrote:
> > >>>> >On Mon 31-07-17 12:13:33, Wei Wang wrote:
> > >>>>> >>Ballooned pages will be marked as MADV_DONTNEED by the hypervisor and
> > >>>>> >>shouldn't be given to the host ksmd to scan.
> > >>>> >Could you point me where this MADV_DONTNEED is done, please?
> > >>>
> > >>>Sure. It's done in the hypervisor when the balloon pages are received.
> > >>>
> > >>>Please see line 40 at
> > >>>https://github.com/qemu/qemu/blob/master/hw/virtio/virtio-balloon.c
> > >And one more thing. I am not very familiar with ksm. But how is
> > >MADV_DONTNEED even helping? This madvise is not sticky - aka it will
> > >unmap the range without leaving any note behind. AFAICS the only way
> > >to have a vma scanned is to have VM_MERGEABLE and that is an opt in:
> > >See Documentation/vm/ksm.txt
> > >"
> > >KSM only operates on those areas of address space which an application
> > >has advised to be likely candidates for merging, by using the madvise(2)
> > >system call: int madvise(addr, length, MADV_MERGEABLE).
> > >"
> > >
> > >So what exactly is going on here? The original patch looks highly
> > >suspicious as well. If somebody wants to make that memory mergeable then
> > >the user of that memory should zero them out.
> >
> > The kernel starts a kthread named "ksmd". ksmd scans the VM_MERGEABLE
> > memory and merges identical pages (identical means memcmp(page1,
> > page2, PAGESIZE) == 0).
> >
> > The guest cannot use ballooned pages, and these pages will not be accessed
> > for a long time. Kswapd on the host will swap these pages out and get more
> > free memory.
> >
> > Compared with swapping, KSM has better performance. Presently, pages in
> > the balloon device have random values, so they usually cannot be merged.
> > So enqueuing zero pages will resolve this problem.
> >
> > Because MADV_DONTNEED depends on host OS capability and hypervisor capability,
> > I prefer to enqueue zero pages to the balloon device, and made this patch.

I think you should have the hypervisor zero them out if it wants to, then.
Seems cleaner.

>
> So why exactly are we zeroing pages (and paying some cost for that) in
> the guest when we do not know what the host actually does with them?

I suspect this is some special hypervisor that somehow benefits from
this patch. It should just use a feature bit for its special needs, I
think. Michal is also exactly right that patches like this should come
with some performance numbers.

I'll post a patch adding virtio lists for mm/balloon_compaction.c so
that we notice when people tweak it like that.

> --
> Michal Hocko
> SUSE Labs
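
For readers following the thread, here is a minimal userspace sketch of the two points discussed above: KSM is opt-in through madvise(MADV_MERGEABLE), and ksmd treats pages as mergeable when their contents compare equal. The helper names pages_identical() and map_mergeable() are made up for illustration; they are not taken from the patch, the kernel, or QEMU. The sketch assumes Linux with glibc, and madvise() can fail with EINVAL on kernels built without CONFIG_KSM, so its result is reported rather than treated as fatal.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Merge criterion as described in the thread: two pages are "the same"
 * when their contents are byte-for-byte equal. Hypothetical helper. */
bool pages_identical(const void *a, const void *b, size_t page_size)
{
    assert(page_size > 0);
    return memcmp(a, b, page_size) == 0;
}

/* Hypothetical helper: map `len` bytes of anonymous memory and opt the
 * range into KSM scanning. Returns NULL if mmap() fails. If `advised`
 * is non-NULL, it is set to whether madvise(MADV_MERGEABLE) succeeded. */
void *map_mergeable(size_t len, bool *advised)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Without this call the range is never a candidate for ksmd. */
    bool ok = (madvise(p, len, MADV_MERGEABLE) == 0);
    if (advised)
        *advised = ok;
    return p;
}
```

Freshly mapped anonymous memory is zero-filled, so two such ranges compare equal and are candidates for merging. Once one of them holds leftover data, the comparison fails, which is the situation described above for balloon pages that were not zeroed.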