On 14.08.20 19:31, Minchan Kim wrote:
> There is a need for special HW that requires bulk allocation of
> high-order pages, e.g., 4800 * order-4 pages.
>
> To meet the requirement, one option is using a CMA area, because the
> page allocator with compaction easily fails to meet the requirement
> under memory pressure and is too slow for 4800 allocations. However,
> CMA also has the following drawback:
>
> * 4800 order-4 cma_alloc calls are too slow
>
> To avoid the slowness, we could try to allocate 300M of contiguous
> memory at once and then split it into order-4 chunks.
> The problem with this approach is that the CMA allocation fails if
> even one page in the range cannot be migrated out, which happens
> easily with fs writes under memory pressure.

Why not choose a value in between? Like trying to allocate order
MAX_ORDER - 1 chunks and splitting them. That would already heavily
reduce the call frequency. I don't see a real need for a completely new
range allocator function for this special case yet.
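Concretely, something like the following (a sketch only -- the helper
names and GFP flags here are made up, and it relies on the existing
split_page(), so each order-4 chunk later has to be freed page by
page):

#include <linux/gfp.h>
#include <linux/mm.h>

/*
 * Sketch: carve order-4 chunks out of order MAX_ORDER - 1 blocks from
 * the buddy. With the default MAX_ORDER of 11, one order-10 block
 * yields 64 chunks, so 4800 chunks take only 75 alloc_pages() calls.
 */
static int alloc_order4_chunks(struct page **chunks, int nr_chunks)
{
        const unsigned int block_order = MAX_ORDER - 1;
        const unsigned int chunks_per_block = 1 << (block_order - 4);
        int got = 0;

        while (got < nr_chunks) {
                struct page *block = alloc_pages(GFP_KERNEL | __GFP_NOWARN,
                                                 block_order);
                unsigned int i, j;

                if (!block)
                        break;  /* caller decides how to retry / fall back */

                /* Make all 1 << block_order pages independent order-0 pages. */
                split_page(block, block_order);

                for (i = 0; i < chunks_per_block; i++) {
                        /* each chunk is 16 physically contiguous pages */
                        struct page *chunk = block + (i << 4);

                        if (got < nr_chunks) {
                                chunks[got++] = chunk;
                        } else {
                                /* free the unused tail of the block */
                                for (j = 0; j < 16; j++)
                                        __free_page(chunk + j);
                        }
                }
        }
        return got;
}

/* split_page() made the pages independent, so free a chunk page by page. */
static void free_order4_chunk(struct page *chunk)
{
        unsigned int i;

        for (i = 0; i < 16; i++)
                __free_page(chunk + i);
}

That reduces 4800 calls to 75 while still handing out physically
contiguous order-4 ranges.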
> To solve these issues, this patch introduces alloc_pages_bulk:
>
>   int alloc_pages_bulk(unsigned long start, unsigned long end,
>                        unsigned int migratetype, gfp_t gfp_mask,
>                        unsigned int order, unsigned int nr_elem,
>                        struct page **pages);
>
> It will investigate the range [start, end) and migrate movable pages
> out of it by best effort (in upcoming patches) to create free pages
> of the requested order.
>
> The allocated pages are returned via the pages parameter. The return
> value represents how many pages of the requested order we got; it can
> be less than the nr_elem the user requested.
>
> /**
>  * alloc_pages_bulk() -- tries to allocate high order pages
>  * by batch from given range [start, end)
>  * @start:       start PFN to allocate
>  * @end:         one-past-the-last PFN to allocate
>  * @migratetype: migratetype of the underlying pageblocks (either
>  *               #MIGRATE_MOVABLE or #MIGRATE_CMA). All pageblocks
>  *               in range must have the same migratetype and it must
>  *               be either of the two.
>  * @gfp_mask:    GFP mask to use during compaction
>  * @order:       page order requested
>  * @nr_elem:     the number of high-order pages to allocate
>  * @pages:       page array pointer to store allocated pages (must
>  *               have space for at least nr_elem elements)
>  *
>  * The PFN range does not have to be pageblock or MAX_ORDER_NR_PAGES
>  * aligned. The PFN range must belong to a single zone.
>  *
>  * Return: the number of pages allocated on success or negative error
>  *         code. The allocated pages should be freed using __free_pages.
>  */
>
> The test does 4800 order-4 allocations (i.e., 300MB in total) under a
> kernel build workload. System RAM size is 1.5GB and the CMA area is
> 500M.
>
> Using CMA to allocate the 300M, all of 10 trials failed, with large
> latency (up to several seconds).
>
> With this alloc_pages_bulk API, 7 out of 10 trials succeeded in all
> 4800 allocations. The remaining 3 trials allocated 4799, 4789 and
> 4799 pages. All trials completed within 300ms.
>
> This patchset is against next-20200813.
>
> Minchan Kim (7):
>   mm: page_owner: split page by order
>   mm: introduce split_page_by_order
>   mm: compaction: deal with upcoming high-order page splitting
>   mm: factor __alloc_contig_range out
>   mm: introduce alloc_pages_bulk API
>   mm: make alloc_pages_bulk best effort
>   mm/page_isolation: avoid drain_all_pages for alloc_pages_bulk
>
>  include/linux/gfp.h            |   5 +
>  include/linux/mm.h             |   2 +
>  include/linux/page-isolation.h |   1 +
>  include/linux/page_owner.h     |  10 +-
>  mm/compaction.c                |  64 +++++++----
>  mm/huge_memory.c               |   2 +-
>  mm/internal.h                  |   5 +-
>  mm/page_alloc.c                | 198 ++++++++++++++++++++++++++-------
>  mm/page_isolation.c            |  10 +-
>  mm/page_owner.c                |   7 +-
>  10 files changed, 230 insertions(+), 74 deletions(-)

--
Thanks,

David / dhildenb