Re: [RFC v2 PATCH 0/4] speed up page allocation for __GFP_ZERO

Liang Li <liliang324@xxxxxxxxx> · Tue, 22 Dec 2020 22:42:13 +0800

> > =====================================================
> > QEMU use 4K pages, THP is off
> >                   round1      round2      round3
> > w/o this patch:    23.5s       24.7s       24.6s
> > w/ this patch:     10.2s       10.3s       11.2s
> >
> > QEMU use 4K pages, THP is on
> >                   round1      round2      round3
> > w/o this patch:    17.9s       14.8s       14.9s
> > w/ this patch:     1.9s        1.8s        1.9s
> > =====================================================
>
> The cost of zeroing pages has to be paid somewhere.  You've successfully
> moved it out of this path that you can measure.  So now you've put it
> somewhere that you're not measuring.  Why is this a win?

Whether it is a win depends on its effect. In our case it solves the
issue we faced, so it can be considered a win for us.
If others don't have the issue we faced, the result will be different;
they may instead be affected by the side effects of this feature. I
think that is the concern behind your question, right? I will run more
tests and provide more benchmark data.

> > Speed up kernel routine
> > =======================
> > This can’t be guaranteed because we don’t pre zero out all the free pages,
> > but is true for most case. It can help to speed up some important system
> > call just like fork, which will allocate zero pages for building page
> > table. And speed up the process of page fault, especially for huge page
> > fault. The POC of Hugetlb free page pre zero out has been done.
>
> Try kernbench with and without your patch.

OK. Thanks for your suggestion!

Liang