Re: [PATCH 1/4] mm: introduce cma_alloc_bulk API

David Hildenbrand <david@xxxxxxxxxx> · Mon, 23 Nov 2020 15:15:37 +0100

On 17.11.20 19:19, Minchan Kim wrote:
> Some special HW requires bulk allocation of high-order pages:
> for example, 4800 * order-4 pages as a minimum, and sometimes
> more.
> 
> To meet the requirement, one option is to reserve a 300M CMA area
> and request the whole 300M as contiguous memory. However, that
> doesn't work if even one of the pages in the range is long-term
> pinned, directly or indirectly. The other option is to repeatedly
> request a higher order (e.g., 2M) than the required order (64K)
> until the driver has gathered the necessary amount of memory.
> This approach makes the allocation very slow because cma_alloc
> itself is slow, and it can get stuck on one of the pageblocks if
> it encounters an unmigratable page.
> 
> To solve the issue, this patch introduces cma_alloc_bulk.
> 
> 	int cma_alloc_bulk(struct cma *cma, unsigned int align,
> 		gfp_t gfp_mask, unsigned int order, size_t nr_requests,
> 		struct page **page_array, size_t *nr_allocated);
> 
> Most parameters are the same as for cma_alloc, but it additionally
> takes an array in which to store the allocated pages. Unlike
> cma_alloc, it skips a pageblock that has an unmovable page without
> waiting/stopping, so the API continues to scan other pageblocks
> to find pages of the requested order.
> 
> cma_alloc_bulk is a best-effort approach in that, unlike cma_alloc,
> it skips pageblocks that have unmovable pages. It doesn't need to
> be perfect from the beginning at the cost of performance. Thus, the
> API takes a gfp_t to support __GFP_NORETRY, which is propagated into
> alloc_contig_page to avoid the high-overhead functions that increase
> the CMA allocation success ratio (e.g., migration retrial, PCP and
> LRU draining per pageblock), at the cost of a lower allocation
> success ratio. If the caller couldn't allocate enough pages with
> __GFP_NORETRY, it can call again without __GFP_NORETRY to increase
> the success ratio, if it is okay to pay the overhead.

I'm not a fan of tying __GFP_NORETRY to PCP and LRU draining.
Also, gfp flags apply mostly to compaction (e.g., how to allocate free
pages for migration), so this seems a little wrong.

Can we instead introduce

enum alloc_contig_mode {
	/*
	 * Normal mode:
	 *
	 * Retry page migration 5 times, ... TBD
	 *
	 */
	ALLOC_CONTIG_NORMAL = 0,
	/*
	 * Fast mode: e.g., used for bulk allocations.
	 *
	 * Don't retry page migration if it fails, don't drain PCP
	 * lists, don't drain LRU.
	 */
	ALLOC_CONTIG_FAST,
};

This could be extended by ALLOC_CONTIG_HARD in the future, to be used
e.g. by virtio-mem (disable PCP, retry a couple more times) ...
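
As a rough userspace sketch of how such a mode could be translated into
the individual knobs (this is not existing kernel code: struct
contig_policy and contig_policy_for() are hypothetical names, and the
assumption that normal mode drains PCP and LRU is inferred from the
fast-mode comment above, with the details still TBD):

```c
#include <stdbool.h>

/* The two modes proposed above. */
enum alloc_contig_mode {
	ALLOC_CONTIG_NORMAL = 0,
	ALLOC_CONTIG_FAST,
};

/* Hypothetical knob set; not an existing kernel structure. */
struct contig_policy {
	unsigned int migrate_retries;	/* retries after a failed migration */
	bool drain_pcp;			/* drain per-cpu page lists */
	bool drain_lru;			/* drain LRU caches */
};

/* Map a mode to its knobs so that callers never pass gfp flags for this. */
static struct contig_policy contig_policy_for(enum alloc_contig_mode mode)
{
	switch (mode) {
	case ALLOC_CONTIG_FAST:
		/* Fast: no migration retry, no PCP draining, no LRU draining. */
		return (struct contig_policy){
			.migrate_retries = 0,
			.drain_pcp = false,
			.drain_lru = false,
		};
	case ALLOC_CONTIG_NORMAL:
	default:
		/* Normal: retry migration 5 times; draining is assumed here. */
		return (struct contig_policy){
			.migrate_retries = 5,
			.drain_pcp = true,
			.drain_lru = true,
		};
	}
}
```

A bulk caller could then try ALLOC_CONTIG_FAST first and fall back to
ALLOC_CONTIG_NORMAL only if it did not get enough pages, without
overloading the meaning of the gfp flags.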

-- 
Thanks,

David / dhildenb