On Tue, Dec 15, 2020 at 01:43:41PM -0800, Roman Gushchin wrote:
> On Tue, Dec 15, 2020 at 11:36:23PM +0200, Mike Rapoport wrote:
> > Hi Roman,
> >
> > On Tue, Dec 15, 2020 at 11:36:15AM -0800, Roman Gushchin wrote:
> > > Currently cma areas without a fixed base address are allocated
> > > close to the end of the node. This placement is sub-optimal because
> > > of how the compaction works: it effectively moves pages into
> > > the cma area. In particular, it often brings in hot executable pages,
> > > even if there is a plenty of free memory on the machine.
> > > This results in more cma allocation failures.
> > >
> > > Instead let's place cma areas close to the beginning of a node.
> > > Cma first tries to start with highmem_start, so we shouldn't mess
> > > up with DMA32. In this case the compaction will help to free cma
> > > areas, resulting in better cma allocation success rates.
> > >
> > > Signed-off-by: Roman Gushchin <guro@xxxxxx>
> > > ---
> > >  include/linux/memblock.h |  5 +++--
> > >  mm/cma.c                 |  4 ++--
> > >  mm/memblock.c            | 26 +++++++++++++++-----------
> > >  3 files changed, 20 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> > > index 9c5cc95c7cee..698188066450 100644
> > > --- a/include/linux/memblock.h
> > > +++ b/include/linux/memblock.h
> > > @@ -384,8 +384,9 @@ static inline int memblock_get_region_node(const struct memblock_region *r)
> > >  phys_addr_t memblock_phys_alloc_range(phys_addr_t size, phys_addr_t align,
> > >                                        phys_addr_t start, phys_addr_t end);
> > >  phys_addr_t memblock_alloc_range_nid(phys_addr_t size,
> > > -                                      phys_addr_t align, phys_addr_t start,
> > > -                                      phys_addr_t end, int nid, bool exact_nid);
> > > +                                     phys_addr_t align, phys_addr_t start,
> > > +                                     phys_addr_t end, int nid, bool exact_nid,
> > > +                                     bool bottom_up);
> > >  phys_addr_t memblock_phys_alloc_try_nid(phys_addr_t size, phys_addr_t align, int nid);
> > >
> > >  static inline phys_addr_t memblock_phys_alloc(phys_addr_t size,
> > > diff --git a/mm/cma.c b/mm/cma.c
> > > index 20c4f6f40037..1b42be6d059b 100644
> > > --- a/mm/cma.c
> > > +++ b/mm/cma.c
> > > @@ -332,13 +332,13 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
> > >                   */
> > >                  if (base < highmem_start && limit > highmem_start) {
> > >                          addr = memblock_alloc_range_nid(size, alignment,
> > > -                                        highmem_start, limit, nid, true);
> > > +                                        highmem_start, limit, nid, true, true);
> > >                          limit = highmem_start;
> > >                  }
> > >
> > >                  if (!addr) {
> > >                          addr = memblock_alloc_range_nid(size, alignment, base,
> > > -                                        limit, nid, true);
> > > +                                        limit, nid, true, true);
> > >                          if (!addr) {
> > >                                  ret = -ENOMEM;
> > >                                  goto err;
> > > diff --git a/mm/memblock.c b/mm/memblock.c
> > > index b8b7be0561c4..c334b401fe16 100644
> > > --- a/mm/memblock.c
> > > +++ b/mm/memblock.c
> > > @@ -272,6 +272,7 @@ __memblock_find_range_top_down(phys_addr_t start, phys_addr_t end,
> > >   *       %MEMBLOCK_ALLOC_ACCESSIBLE
> > >   * @nid: nid of the free area to find, %NUMA_NO_NODE for any node
> > >   * @flags: pick from blocks based on memory attributes
> > > + * @bottom_up: force bottom-up allocation
> >
> > Why wouldn't you use memblock_set_bottom_up() around the allocations in
> > CMA, e.g.
> >
> >         bool bottom_up = memblock_bottom_up();
> >
> >         if (!bottom_up)
> >                 memblock_set_bottom_up(true);
> >
> >         /* allocate memory */
> >
> >         memblock_set_bottom_up(bottom_up);
>
> Hi Mike!
>
> Wouldn't it open a possibility for a race? If somebody else is doing an allocation
> in parallel, their allocation could become affected.

This happens a lot earlier than we can have concurrency, so there is no such
possibility.

> Thanks!
>

--
Sincerely yours,
Mike.
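
For illustration only, here is a user-space sketch of the save/set/restore pattern suggested above. It is not kernel code: every model_* name, the 1024-unit range and the bump allocator are hypothetical stand-ins for memblock_bottom_up(), memblock_set_bottom_up() and the real allocator. It assumes a single thread of execution, which is the situation described above for early boot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for memblock's global allocation direction. */
static bool model_bottom_up;

static bool model_memblock_bottom_up(void)
{
	return model_bottom_up;
}

static void model_memblock_set_bottom_up(bool enable)
{
	model_bottom_up = enable;
}

/* Toy range [0, MODEL_END) with one bump pointer at each end. */
#define MODEL_END 1024u
#define MODEL_FAIL ((size_t)-1)

static size_t model_low;
static size_t model_high = MODEL_END;

/* Returns the start offset of the block, or MODEL_FAIL if it does not fit. */
static size_t model_alloc(size_t size)
{
	size_t addr;

	if (size == 0 || size > model_high - model_low)
		return MODEL_FAIL;

	if (model_memblock_bottom_up()) {
		addr = model_low;	/* take from the start of the range */
		model_low += size;
		return addr;
	}

	model_high -= size;		/* take from the end of the range */
	return model_high;
}

/* Save the current direction, force bottom-up for one allocation, restore. */
static size_t model_cma_reserve(size_t size)
{
	bool bottom_up = model_memblock_bottom_up();
	size_t addr;

	if (!bottom_up)
		model_memblock_set_bottom_up(true);

	addr = model_alloc(size);

	model_memblock_set_bottom_up(bottom_up);
	assert(model_memblock_bottom_up() == bottom_up);
	return addr;
}
```

In this model an ordinary model_alloc() call keeps taking space from the end of the range, while model_cma_reserve() places its block at the start and leaves the global direction as it found it. If two callers could interleave, the flag flip would be visible to the other caller, which is the race asked about above.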