Re: [PATCH v2 0/5] variable-order, large folios for anonymous memory

Yu Zhao <yuzhao@xxxxxxxxxx> · Tue, 4 Jul 2023 01:11:13 -0600

On Tue, Jul 4, 2023 at 12:22 AM Yin, Fengwei <fengwei.yin@xxxxxxxxx> wrote:
>
> On 7/4/2023 10:18 AM, Yu Zhao wrote:
> > On Mon, Jul 3, 2023 at 7:53 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
> >>
> >> Hi All,
> >>
> >> This is v2 of a series to implement variable order, large folios for anonymous
> >> memory. The objective of this is to improve performance by allocating larger
> >> chunks of memory during anonymous page faults. See [1] for background.
> >
> > Thanks for the quick response!
> >
> >> I've significantly reworked and simplified the patch set based on comments from
> >> Yu Zhao (thanks for all your feedback!). I've also renamed the feature to
> >> VARIABLE_THP, on Yu's advice.
> >>
> >> The last patch is for arm64 to explicitly override the default
> >> arch_wants_pte_order() and is intended as an example. If this series is accepted
> >> I suggest taking the first 4 patches through the mm tree and the arm64 change
> >> could be handled through the arm64 tree separately. Neither has any build
> >> dependency on the other.
> >>
> >> The one area where I haven't followed Yu's advice is in the determination of the
> >> size of folio to use. It was suggested that I have a single preferred large
> >> order, and if it doesn't fit in the VMA (due to exceeding VMA bounds, or there
> >> being existing overlapping populated PTEs, etc) then fallback immediately to
> >> order-0. It turned out that this approach caused a performance regression in the
> >> Speedometer benchmark.
> >
> > I suppose it's regression against the v1, not the unpatched kernel.
> From the performance data Ryan shared, it's against unpatched kernel:
>
> Speedometer 2.0:
>
> | kernel                         |   runs_per_min |
> |:-------------------------------|---------------:|
> | baseline-4k                    |           0.0% |
> | anonfolio-lkml-v1              |           0.7% |
> | anonfolio-lkml-v2-simple-order |          -0.9% |
> | anonfolio-lkml-v2              |           0.5% |

I see. Thanks.

A couple of questions:
1. Do we have a stddev?
2. Do we have a theory why it regressed?
Assuming no bugs, I don't see how a real regression could happen --
falling back to order-0 isn't different from the original behavior.
Ryan, could you `perf record` and `cat /proc/vmstat` and share them?