On Mon, Dec 21, 2020 at 8:25 AM Liang Li <liliang.opensource@xxxxxxxxx> wrote:
>
> The first version can be found at: https://lkml.org/lkml/2020/4/12/42
>
> Zero out the page content usually happens when allocating pages with
> the flag of __GFP_ZERO, this is a time consuming operation, it makes
> the population of a large vma area very slowly. This patch introduce
> a new feature for zero out pages before page allocation, it can help
> to speed up page allocation with __GFP_ZERO.
>
> My original intention for adding this feature is to shorten VM
> creation time when SR-IOV device is attached, it works good and the
> VM creation time is reduced by about 90%.
>
> Creating a VM [64G RAM, 32 CPUs] with GPU passthrough
> =====================================================
> QEMU use 4K pages, THP is off
>                   round1      round2      round3
> w/o this patch:    23.5s       24.7s       24.6s
> w/ this patch:     10.2s       10.3s       11.2s
>
> QEMU use 4K pages, THP is on
>                   round1      round2      round3
> w/o this patch:    17.9s       14.8s       14.9s
> w/ this patch:      1.9s        1.8s        1.9s
> =====================================================
>
> Obviously, it can do more than this. We can benefit from this feature
> in the following case:

So I am not sure page reporting is the best thing to base this page
zeroing setup on. The idea with page reporting is to essentially act as
a leaky bucket and allow the guest to slowly drop memory it isn't using,
so that if it needs to reinflate it won't clash with the applications
that need memory. What you are doing here seems far more aggressive, in
that you are going down to low-order pages and sleeping instead of
rescheduling for the next time interval.

Also, I am not sure your SR-IOV creation time test is a good
justification for this extra overhead. With your patches applied, all
you are doing is making use of the free time before the test to do the
page zeroing instead of doing it during your test.
As such, your CPU overhead prior to running the test would be higher,
and you haven't captured that information.

One thing I would be interested in seeing is what load this adds when
you are running simple memory allocation/free type tests on the system.
For example, it might be useful to see what the will-it-scale
page_fault1 tests look like with this patch applied versus not applied.
I suspect it would add some overhead, since you have to spend a lot of
time scanning all the pages.
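
To make the kind of test I mean concrete, the core of a page_fault1-style
loop can be approximated as below. This is only a rough sketch and not
the will-it-scale source: the helper names fault_in_region() and
time_fault_loop(), and whatever region size and iteration count you pass
in, are assumptions for illustration.

```c
/* Sketch of a page_fault1-style loop: map anonymous memory, write to
 * every page so the kernel has to supply a zeroed page, then unmap. */
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: fault in `len` bytes once.
 * Returns the number of pages touched, or -1 if mmap() fails. */
long fault_in_region(size_t len)
{
	long page_size = sysconf(_SC_PAGESIZE);
	assert(page_size > 0);

	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	long touched = 0;
	for (size_t off = 0; off < len; off += (size_t)page_size) {
		p[off] = 1;	/* first write to the page triggers the fault */
		touched++;
	}

	munmap(p, len);
	return touched;
}

/* Hypothetical helper: repeat the map/touch/unmap cycle `iterations`
 * times. Returns elapsed wall-clock seconds, or -1.0 on mmap failure. */
double time_fault_loop(size_t len, int iterations)
{
	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < iterations; i++)
		if (fault_in_region(len) < 0)
			return -1.0;
	clock_gettime(CLOCK_MONOTONIC, &end);

	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling something like time_fault_loop(128UL << 20, 100) on an otherwise
idle system, once with and once without the patch applied, would give a
first rough comparison. It does not replace running the real benchmark
across multiple tasks.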