Re: [PATCH v9 0/6] optimize memblock_next_valid_pfn and early_pfn_valid on arm and arm64

Pavel Tatashin <pasha.tatashin@xxxxxxxxxx> · Mon, 2 Jul 2018 08:06:50 -0400

On Mon, Jul 2, 2018 at 7:40 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Fri 29-06-18 10:29:17, Jia He wrote:
> > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> > where possible") tried to optimize the loop in memmap_init_zone(). But
> > there is still some room for improvement.
>
> It would be great to briefly describe those optimizations from a
> high-level POV.
>
> >
> > Patch 1 introduces a new config option to make the code more generic
> > Patch 2 keeps memblock_next_valid_pfn() on arm and arm64
> > Patch 3 optimizes memblock_next_valid_pfn()
> > Patches 4~6 optimize early_pfn_valid()
> >
> > As for the performance improvement, after this set, I can see the time
> > overhead of memmap_init() is reduced from 27956us to 13537us in my
> > armv8a server(QDF2400 with 96G memory, pagesize 64k).
>
> So this is 13ms saving when booting 96G machine. Is this really worth
> the additional code? Are there any other benefits?

While a saving of 0.0144s for 96G is definitely small, I think the time
is proportional to the number of pages, since memmap_init() loops
through all the pages. If base pages were changed from 64K to 4K, I bet
the saving would increase 16 times: 0.23s on the given machine, or in
other words roughly 2.5s per 1T of memory.
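
A back-of-the-envelope sketch of that extrapolation, in case anyone wants
to plug in other numbers. It assumes the cost of memmap_init() scales
linearly with the number of base pages, which has not been measured here.
The function and constant names are made up for illustration only; the
timings are the ones from the cover letter.

```python
# Rough extrapolation of the memmap_init() saving reported in the cover letter.
# ASSUMPTION: cost scales linearly with the number of base pages.
# All names below are illustrative, not kernel identifiers.

BEFORE_US = 27956          # memmap_init() time before the series, in microseconds
AFTER_US = 13537           # memmap_init() time after the series, in microseconds
MEASURED_MEM_GIB = 96      # memory size of the test machine
MEASURED_PAGE_KIB = 64     # base page size of the test machine


def saving_seconds(page_kib, mem_gib):
    """Scale the measured saving to another base page size and memory size."""
    measured = (BEFORE_US - AFTER_US) / 1e6
    # Smaller pages mean proportionally more pages to initialize.
    page_factor = MEASURED_PAGE_KIB / page_kib
    # More memory means proportionally more pages as well.
    mem_factor = mem_gib / MEASURED_MEM_GIB
    return measured * page_factor * mem_factor


if __name__ == "__main__":
    for page_kib, mem_gib in [(64, 96), (4, 96), (4, 1024)]:
        print(f"{page_kib}K pages, {mem_gib}G: "
              f"{saving_seconds(page_kib, mem_gib):.4f}s saved")
```

With 4K pages this gives about 0.23s for 96G and a bit under 2.5s for 1T,
under the linear-scaling assumption above.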

I agree that a high-level description of the optimizations is needed,
along with an explanation of why they would not work on other arches
that support memblock.

Pavel