> -----Original Message-----
> From: Mike Kravetz [mailto:mike.kravetz@xxxxxxxxxx]
> Sent: Wednesday, September 9, 2020 6:29 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>;
> linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-mips@xxxxxxxxxxxxxxx
> Cc: Roman Gushchin <guro@xxxxxx>; Joonsoo Kim <js1304@xxxxxxxxx>; Rik
> van Riel <riel@xxxxxxxxxxx>; Aslan Bakirov <aslan@xxxxxx>; Michal Hocko
> <mhocko@xxxxxxxxxx>; Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Subject: Re: [RFC PATCH] cma: make number of CMA areas dynamic, remove
> CONFIG_CMA_AREAS
>
> On 9/3/20 6:58 PM, Song Bao Hua (Barry Song) wrote:
> >
> >> -----Original Message-----
> >> From: Mike Kravetz [mailto:mike.kravetz@xxxxxxxxxx]
> >> Sent: Thursday, September 3, 2020 3:02 PM
> >> To: linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-mips@xxxxxxxxxxxxxxx
> >> Cc: Roman Gushchin <guro@xxxxxx>; Song Bao Hua (Barry Song)
> >> <song.bao.hua@xxxxxxxxxxxxx>; Joonsoo Kim <js1304@xxxxxxxxx>; Rik van
> >> Riel <riel@xxxxxxxxxxx>; Aslan Bakirov <aslan@xxxxxx>; Michal Hocko
> >> <mhocko@xxxxxxxxxx>; Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Mike
> >> Kravetz <mike.kravetz@xxxxxxxxxx>
> >> Subject: [RFC PATCH] cma: make number of CMA areas dynamic, remove
> >> CONFIG_CMA_AREAS
> >>
> >> The number of distinct CMA areas is limited by the constant
> >> CONFIG_CMA_AREAS. In most environments, this was set to a default
> >> value of 7. Not too long ago, support was added to allocate hugetlb
> >> gigantic pages from CMA. More recent changes to make dma_alloc_coherent
> >> NUMA-aware on arm64 added more potential users of CMA areas. Along
> >> with the dma_alloc_coherent changes, the default value of CMA_AREAS
> >> was bumped up to 19 if NUMA is enabled.
> >>
> >> It seems that the number of CMA users is likely to grow. Instead of
> >> using a static array for cma areas, use a simple linked list. These
> >> areas are used before normal memory allocators, so use the memblock
> >> allocator.
> >
> > Hello Mike, It seems it is a good idea. Thanks for addressing this.
> >
> > I was focusing on per-numa cma feature in my patchset and I didn't take care
> > of this while I thought we should do something for the number of cma areas.
> >
>
> Thanks for taking a look.
>
> One area where I could use some help is testing/verifying on arm. See the
> changes to arch/arm/mm/dma-mapping.c. I have tested the generic changes on
> my x86 platform, but do not have an arm platform for easy testing.
>
> >>  void __init dma_contiguous_early_fixup(phys_addr_t base, unsigned long
> >> size)
> >>  {
> >> -	dma_mmu_remap[dma_mmu_remap_num].base = base;
> >> -	dma_mmu_remap[dma_mmu_remap_num].size = size;
> >> -	dma_mmu_remap_num++;
> >> +	struct dma_contig_early_reserve *d;
> >> +
> >> +	d = memblock_alloc(sizeof(struct dma_contig_early_reserve),
> >
> > sizeof(*d)?
>
> Yes. thanks.
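
Just to make sure we are reading this the same way, below is roughly how I
picture that helper with the sizeof(*d) change folded in. This is only a
sketch from my side -- the quoted hunk is cut off above, so the list head
name (dma_mmu_remap_list), the list_add() call and the sizeof(long)
alignment are my guesses, not necessarily what your patch does:

	/*
	 * Entries are recorded before the buddy allocator is up, so they
	 * come from memblock and are linked into a simple list instead of
	 * the old fixed-size dma_mmu_remap[] array.
	 */
	struct dma_contig_early_reserve {
		phys_addr_t base;
		unsigned long size;
		struct list_head list;		/* added only for this sketch */
	};

	static LIST_HEAD(dma_mmu_remap_list);	/* hypothetical name */

	void __init dma_contiguous_early_fixup(phys_addr_t base, unsigned long size)
	{
		struct dma_contig_early_reserve *d;

		d = memblock_alloc(sizeof(*d), sizeof(long));
		if (!d)
			return;

		d->base = base;
		d->size = size;
		list_add(&d->list, &dma_mmu_remap_list);
	}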
>
> >> @@ -172,15 +173,14 @@ int __init cma_init_reserved_mem(phys_addr_t
> >> base, phys_addr_t size,
> >>  	struct cma *cma;
> >>  	phys_addr_t alignment;
> >>
> >> -	/* Sanity checks */
> >> -	if (cma_area_count == ARRAY_SIZE(cma_areas)) {
> >> -		pr_err("Not enough slots for CMA reserved regions!\n");
> >> -		return -ENOSPC;
> >> -	}
> >> +	/* Do not attempt allocations after memblock allocator is torn down */
> >> +	if (slab_is_available())
> >> +		return -EINVAL;
> >>
> >>  	if (!size || !memblock_is_region_reserved(base, size))
> >>  		return -EINVAL;
> >>
> >> +
> >
> > Is this empty line relevant?
>
> No, added by mistake.
>
> >> @@ -192,12 +192,17 @@ int __init cma_init_reserved_mem(phys_addr_t
> >> base, phys_addr_t size,
> >>  	if (ALIGN(base, alignment) != base || ALIGN(size, alignment) != size)
> >>  		return -EINVAL;
> >>
> >> +	cma = memblock_alloc(sizeof(struct cma), sizeof(long));
> >
> > sizeof(*cma)?
>
> Yes, thanks.
>
> > It seems we are going to write cma->count, order_per_bit, debugfs fields.
> > To avoid false sharing of the cacheline of struct cma, it is better to align with
> > SMP_CACHE_BYTES.
> >
> > On the other hand, it seems we are unlikely to write the cma
>
> I thought about using SMP_CACHE_BYTES, but the structures are simply defined
> as an array today. This should not be any worse. I do not believe access
> to the structures is performance sensitive.

That depends on how often people will read and write the cma structure
indirectly via the dma_alloc/free_coherent() APIs, especially from multiple
CPUs. Anyway, we don't have benchmark data to check whether this will really
be a problem, so I am OK with either SMP_CACHE_BYTES or sizeof(long) as the
alignment.

>
> Thanks,
> --
> Mike Kravetz

Thanks
Barry
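
P.S. to make the alignment point concrete: inside cma_init_reserved_mem(),
where cma is already declared, the two options we discussed only differ in
the second argument passed to memblock_alloc(). The snippet below is just an
illustration from my side, not code taken from the patch:

	/*
	 * Packs the structs back to back; two adjacent struct cma
	 * allocations may share a cacheline, so concurrent writers of
	 * neighbouring areas could false-share.
	 */
	cma = memblock_alloc(sizeof(*cma), sizeof(long));

	/*
	 * Starts each struct cma on its own cacheline boundary, which
	 * should keep neighbouring areas off the same cacheline at the
	 * cost of a little padding.
	 */
	cma = memblock_alloc(sizeof(*cma), SMP_CACHE_BYTES);

If write traffic through dma_alloc/free_coherent() ever shows up in profiles,
the SMP_CACHE_BYTES variant would be the safer default, but as said above I
am fine with either until we have numbers.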