The first version can be found at:
        https://lkml.org/lkml/2020/4/12/42

Zeroing page content usually happens when pages are allocated with the
__GFP_ZERO flag. It is a time-consuming operation and makes populating
a large VMA very slow. This patch series introduces a new feature that
zeroes out free pages before they are allocated, which helps speed up
page allocation with __GFP_ZERO.

My original intention for adding this feature was to shorten VM
creation time when an SR-IOV device is attached. It works well: VM
creation time is reduced by up to about 90% (with THP on).

Creating a VM [64G RAM, 32 CPUs] with GPU passthrough
=====================================================
QEMU uses 4K pages, THP is off
                  round1      round2      round3
w/o this patch:    23.5s       24.7s       24.6s
w/ this patch:     10.2s       10.3s       11.2s
QEMU uses 4K pages, THP is on
                  round1      round2      round3
w/o this patch:    17.9s       14.8s       14.9s
w/ this patch:     1.9s        1.8s        1.9s
=====================================================

Obviously, it can do more than this. We can benefit from this feature
in the following cases:

Interactive scenarios
=====================
Shorten application launch time on desktops and mobile phones, which
helps improve the user experience. A test on a server [Intel(R) Xeon(R)
CPU E5-2620 v3 @ 2.40GHz] shows that zeroing 1GB of RAM in the kernel
takes about 200ms, while commonly used applications such as the Firefox
browser or Office consume 100 ~ 300 MB of RAM just after launch. With
free pages zeroed in advance, application launch time would be reduced
by about 20 ~ 60ms (can that be perceived visually?). Maybe we can use
this feature to speed up the launch of Android apps (I didn't do any
tests on Android).

Virtualization
==============
Speed up VM creation and shorten guest boot time, especially in the PCI
SR-IOV device passthrough scenario.
Compared with some of the paravirtualization solutions, it is easy to
deploy because it is transparent to the guest and can handle DMA
properly in the BIOS stage, which the paravirtualization solutions
can't handle well.

Improve guest performance when VIRTIO_BALLOON_F_REPORTING is used for
memory overcommit. The VIRTIO_BALLOON_F_REPORTING feature reports free
guest pages to the VMM, and the VMM unmaps the corresponding host pages
so they can be reclaimed. When the guest later allocates a page that
was just reclaimed, the host has to allocate a new page and zero it out
for the guest. In this case, pre-zeroed free pages help speed up the
fault-in process and reduce the performance impact.

Speed up kernel routines
========================
This can't be guaranteed because we don't pre-zero all the free pages,
but it is true in most cases. It can help speed up some important
system calls such as fork, which allocates zeroed pages for building
page tables, and it can speed up page fault handling, especially for
huge page faults. A POC of pre-zeroing free hugetlb pages has been
done.

Security
========
This is a weak version of "introduce init_on_alloc=1 and init_on_free=1
boot options", which zeroes out pages asynchronously. For users who
can't tolerate the overhead that 'init_on_alloc=1' or 'init_on_free=1'
brings, this feature provides another choice.

In the feedback on the first version, cache pollution was the main
concern of the mm folks. On the other hand, this feature is really
helpful for some use cases. Maybe we should let the user decide whether
to use it. So a switch is added under /sys; users who don't like the
feature can turn it off, or configure a large batch size to reduce
cache pollution.

To make the whole feature work, support for pre-zeroing free huge pages
must be added for hugetlbfs. I will send another patch for it.
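
For readers who want a concrete picture of the idea, here is a minimal
userspace sketch in C. It is not the code from this series: the names
(toy_page, prezero_batch, toy_alloc) and the fixed-size pool are
hypothetical, and it only models the two halves described above, a
background pass that zeroes free pages in batches and an allocation
path that skips the memset when it finds a page already known to be
zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096
#define TOY_NR_PAGES  8

/* Hypothetical stand-in for a free page plus a "known to be zero" flag. */
struct toy_page {
	unsigned char data[TOY_PAGE_SIZE];
	bool free;
	bool zeroed;
};

static struct toy_page pool[TOY_NR_PAGES];
static size_t zeroed_on_alloc;	/* memsets paid for in the allocation path */

/* Mark every page free and dirty, as if a previous user had released it. */
void pool_init(void)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		memset(pool[i].data, 0xAA, TOY_PAGE_SIZE);
		pool[i].free = true;
		pool[i].zeroed = false;
	}
	zeroed_on_alloc = 0;
}

/* Background pass: zero at most 'batch' dirty free pages, return the count. */
size_t prezero_batch(size_t batch)
{
	size_t done = 0;

	for (size_t i = 0; i < TOY_NR_PAGES && done < batch; i++) {
		if (pool[i].free && !pool[i].zeroed) {
			memset(pool[i].data, 0, TOY_PAGE_SIZE);
			pool[i].zeroed = true;
			done++;
		}
	}
	return done;
}

/*
 * Allocate one page. Prefer a free page whose state matches the request
 * (pre-zeroed for zero requests, dirty otherwise); fall back to the first
 * free page and zero it here only if the caller needs zeroes.
 */
struct toy_page *toy_alloc(bool want_zero)
{
	struct toy_page *pick = NULL;

	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		if (!pool[i].free)
			continue;
		if (!pick)
			pick = &pool[i];
		if (pool[i].zeroed == want_zero) {
			pick = &pool[i];
			break;
		}
	}
	if (!pick)
		return NULL;
	if (want_zero && !pick->zeroed) {
		memset(pick->data, 0, TOY_PAGE_SIZE);
		zeroed_on_alloc++;
	}
	pick->free = false;
	pick->zeroed = false;	/* the new owner may dirty it */
	return pick;
}

/* Return true if every byte of the page is zero. */
bool page_is_zero(const struct toy_page *p)
{
	for (size_t i = 0; i < TOY_PAGE_SIZE; i++)
		if (p->data[i] != 0)
			return false;
	return true;
}

#ifdef TOY_PREZERO_MAIN	/* build with -DTOY_PREZERO_MAIN to run the demo */
int main(void)
{
	pool_init();
	prezero_batch(4);		/* pre-zero half of the pool */
	for (int i = 0; i < 6; i++)
		toy_alloc(true);	/* six "__GFP_ZERO-like" requests */
	printf("zeroed in allocation path: %zu of 6\n", zeroed_on_alloc);
	return 0;
}
#endif
```

In this toy model only the requests that miss the pre-zeroed pages pay
for a memset at allocation time, which is also why the speedup cannot
be guaranteed when the background pass has not caught up.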

Liang Li (4):
  mm: let user decide page reporting option
  mm: pre zero out free pages to speed up page allocation for __GFP_ZERO
  mm: make page reporing worker works better for low order page
  mm: Add batch size for free page reporting

 drivers/virtio/virtio_balloon.c |   3 +
 include/linux/highmem.h         |  31 +++-
 include/linux/page-flags.h      |  16 +-
 include/linux/page_reporting.h  |   3 +
 include/trace/events/mmflags.h  |   7 +
 mm/Kconfig                      |  10 ++
 mm/Makefile                     |   1 +
 mm/huge_memory.c                |   3 +-
 mm/page_alloc.c                 |   4 +
 mm/page_prezero.c               | 266 ++++++++++++++++++++++++++++++++
 mm/page_prezero.h               |  13 ++
 mm/page_reporting.c             |  49 +++++-
 mm/page_reporting.h             |  16 +-
 13 files changed, 405 insertions(+), 17 deletions(-)
 create mode 100644 mm/page_prezero.c
 create mode 100644 mm/page_prezero.h

Cc: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>
Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
Signed-off-by: Liang Li <liliang324@xxxxxxxxx>
--
2.18.2