On Tue, 2019-01-08 at 21:04 +0100, Michal Hocko wrote:
> On Tue 08-01-19 10:40:18, Alexander Duyck wrote:
> > On Fri, 2019-01-04 at 10:31 +0530, Arun KS wrote:
> > > When freeing pages are done with higher order, time spent on coalescing
> > > pages by buddy allocator can be reduced. With section size of 256MB, hot
> > > add latency of a single section shows improvement from 50-60 ms to less
> > > than 1 ms, hence improving the hot add latency by 60 times. Modify
> > > external providers of online callback to align with the change.
> > >
> > > Signed-off-by: Arun KS <arunks@xxxxxxxxxxxxxx>
> > > Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> > > Reviewed-by: Oscar Salvador <osalvador@xxxxxxx>
> >
> > After running into my initial issue I actually had a few more questions
> > about this patch.
> >
> > > [...]
> > > +static int online_pages_blocks(unsigned long start, unsigned long nr_pages)
> > > +{
> > > +	unsigned long end = start + nr_pages;
> > > +	int order, ret, onlined_pages = 0;
> > > +
> > > +	while (start < end) {
> > > +		order = min(MAX_ORDER - 1,
> > > +			get_order(PFN_PHYS(end) - PFN_PHYS(start)));
> > > +
> > > +		ret = (*online_page_callback)(pfn_to_page(start), order);
> > > +		if (!ret)
> > > +			onlined_pages += (1UL << order);
> > > +		else if (ret > 0)
> > > +			onlined_pages += ret;
> > > +
> > > +		start += (1UL << order);
> > > +	}
> > > +	return onlined_pages;
> > >  }
> > >
> >
> > Should the limit for this really be MAX_ORDER - 1 or should it be
> > pageblock_order? In some cases this will be the same value, but I seem
> > to recall that for x86 MAX_ORDER can be several times larger than
> > pageblock_order.
>
> Does it make any difference when we are in fact trying to online nr_pages
> and we clamp to it properly?

I'm not entirely sure if it does or not. What I notice looking through
the code though is that there are a number of checks for the pageblock
migrate type.
There end up being checks in __free_one_page, free_one_page, and
__free_pages_ok all related to this. It might be moot since we are
starting with an offline section, but I just brought this up because I
know in the case of deferred page init we were limiting ourselves to
pageblock_order and I wasn't sure if there was some specific reason for
doing that.

> > >  static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> > >  			void *arg)
> > >  {
> > > -	unsigned long i;
> > >  	unsigned long onlined_pages = *(unsigned long *)arg;
> > > -	struct page *page;
> > >
> > >  	if (PageReserved(pfn_to_page(start_pfn)))
> >
> > I'm not sure we even really need this check. Getting back to the
> > discussion I have been having with Michal in regards to the need for
> > the DAX pages to not have the reserved bit cleared I was originally
> > wondering if we could replace this check with a call to
> > online_section_nr since the section shouldn't be online until we set
> > the bit below in online_mem_sections.
> >
> > However after doing some further digging it looks like this could
> > probably be dropped entirely since we only call this function from
> > online_pages and that function is only called by memory_block_action if
> > pages_correctly_probed returns true. However pages_correctly_probed
> > should return false if any of the sections contained in the page range
> > is already online.
>
> Yes you are right but I guess it would be better to address in a
> separate patch that deals with PageReserved manipulation in general.
> I do not think we want to remove the check silently. People who might be
> interested in backporting this for whatever reason might scratch their
> head why the test is not needed anymore.

Yeah I am already working on that, it is what led me to review this
patch. Just thought I would bring it up since it would make it possible
to essentially reduce the size and/or need for a new function.
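
To make the MAX_ORDER - 1 versus pageblock_order question above a bit more concrete, here is a rough userspace sketch of the clamping loop from the quoted online_pages_blocks. It is not kernel code: get_order_sketch and walk_blocks are hypothetical stand-ins, and the 4 KiB page size, MAX_ORDER of 11, and the order-9 pageblock used in the note below are assumptions for illustration only.

```c
#include <assert.h>
#include <stdio.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define MAX_ORDER  11	/* assumption: value commonly used on x86 at the time */

/*
 * Userspace stand-in for get_order(): smallest order such that
 * (1 << order) pages cover at least 'size' bytes (rounds up).
 */
static int get_order_sketch(unsigned long long size)
{
	unsigned long long pages =
		(size + (1ULL << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
	int order = 0;

	while ((1ULL << order) < pages)
		order++;
	return order;
}

/*
 * Walk [start, start + nr_pages) in chunks clamped to max_order, the
 * same way the quoted loop does. Returns the number of pages covered
 * and stores the number of chunks (callback invocations) in *chunks.
 * Because the order is rounded up, this assumes nr_pages is a multiple
 * of (1 << max_order), as it is for a whole section.
 */
static unsigned long walk_blocks(unsigned long start, unsigned long nr_pages,
				 int max_order, unsigned long *chunks)
{
	unsigned long end = start + nr_pages;
	unsigned long covered = 0;

	*chunks = 0;
	while (start < end) {
		int order = get_order_sketch(
			(unsigned long long)(end - start) << PAGE_SHIFT);

		if (order > max_order)
			order = max_order;

		covered += 1UL << order;
		start += 1UL << order;
		(*chunks)++;
	}
	return covered;
}
```

Under these assumptions a 256MB section is 65536 pages, so walk_blocks(0, 65536, MAX_ORDER - 1, &chunks) makes 64 callback invocations, while clamping to an order of 9 makes 128. Either way the same pages are covered; only the granularity handed to the free path changes.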