On Mon, Jul 2, 2018 at 7:40 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Fri 29-06-18 10:29:17, Jia He wrote:
> > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> > where possible") tried to optimize the loop in memmap_init_zone(). But
> > there is still some room for improvement.
>
> It would be great to shortly describe those optimization from high level
> POV.
>
> >
> > Patch 1 introduce new config to make codes more generic
> > Patch 2 remain the memblock_next_valid_pfn on arm and arm64
> > Patch 3 optimizes the memblock_next_valid_pfn()
> > Patch 4~6 optimizes the early_pfn_valid()
> >
> > As for the performance improvement, after this set, I can see the time
> > overhead of memmap_init() is reduced from 27956us to 13537us in my
> > armv8a server(QDF2400 with 96G memory, pagesize 64k).
>
> So this is 13ms saving when booting 96G machine. Is this really worth
> the additional code? Are there any other benefits?

While 0.0144s for 96G is definitely small, I think the time is
proportional to the number of pages since memmap_init() loops through
all the pages. If base pages were changed to 4K, I bet the time would
increase 16 times: 0.23s on given machine, in other words around 2s per
1T of memory.

I agree, a high level description of optimization is needed, and also
an explanation of why it would not work on other arches that support
memblock.

Pavel
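
The scaling estimate in the reply can be reproduced with a short calculation. This is only a sketch: it uses the two timings quoted in the thread and assumes, as the reply does, that the time saved grows linearly with the number of pages. The function and constant names are made up for illustration and do not come from the patch set.

```python
# Timings quoted in the thread: memmap_init() on a 96G machine with 64K pages.
BEFORE_US = 27956
AFTER_US = 13537
MEM_GIB = 96


def estimate_saving(page_kib_from=64, page_kib_to=4):
    """Scale the measured saving to a smaller base page size.

    Assumption (not measured): the saving is proportional to the
    number of pages that memmap_init() has to walk.
    """
    saved_s = (BEFORE_US - AFTER_US) / 1e6       # measured saving, seconds
    factor = page_kib_from // page_kib_to        # 64K -> 4K means 16x more pages
    scaled_s = saved_s * factor                  # projected saving on the same machine
    per_tib_s = scaled_s * 1024 / MEM_GIB        # projected saving per 1T of memory
    return saved_s, factor, scaled_s, per_tib_s


saved_s, factor, scaled_s, per_tib_s = estimate_saving()
print(f"measured saving at 64K pages: {saved_s:.4f}s")
print(f"page count factor for 4K pages: {factor}x")
print(f"projected saving at 4K pages: {scaled_s:.2f}s on {MEM_GIB}G")
print(f"projected saving per 1T: {per_tib_s:.2f}s")
```

Under that linear assumption the numbers work out to about 0.014s measured, about 0.23s projected for 4K pages on the same machine, and roughly 2.5s per 1T, which is in the same range as the rough figure given in the reply.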