> > =====================================================
> > QEMU use 4K pages, THP is off
> >                   round1      round2      round3
> > w/o this patch:    23.5s       24.7s       24.6s
> > w/ this patch:     10.2s       10.3s       11.2s
> >
> > QEMU use 4K pages, THP is on
> >                   round1      round2      round3
> > w/o this patch:    17.9s       14.8s       14.9s
> > w/ this patch:      1.9s        1.8s        1.9s
> > =====================================================
>
> The cost of zeroing pages has to be paid somewhere.  You've successfully
> moved it out of this path that you can measure.  So now you've put it
> somewhere that you're not measuring.  Why is this a win?

Whether it is a win depends on its effect. In our case it solves the
issue we faced, so we consider it a win for us. Others who don't have
that issue may see a different result, and they may be affected by the
side effects of this feature. I think this is the concern behind your
question, right?

I will run more tests and provide more benchmark data.

> > Speed up kernel routine
> > =======================
> > This can’t be guaranteed because we don’t pre zero out all the free pages,
> > but is true for most case. It can help to speed up some important system
> > call just like fork, which will allocate zero pages for building page
> > table. And speed up the process of page fault, especially for huge page
> > fault. The POC of Hugetlb free page pre zero out has been done.
>
> Try kernbench with and without your patch.

OK. Thanks for your suggestion!

Liang