On 9/16/20 2:14 AM, Song Bao Hua (Barry Song) wrote:
>>> -----Original Message-----
>>> From: Mike Kravetz [mailto:mike.kravetz@xxxxxxxxxx]
>>> Sent: Wednesday, September 16, 2020 8:57 AM
>>> To: linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
>>> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-mips@xxxxxxxxxxxxxxx
>>> Cc: Roman Gushchin <guro@xxxxxx>; Song Bao Hua (Barry Song)
>>> <song.bao.hua@xxxxxxxxxxxxx>; Mike Rapoport <rppt@xxxxxxxxxx>; Joonsoo
>>> Kim <js1304@xxxxxxxxx>; Rik van Riel <riel@xxxxxxxxxxx>; Aslan Bakirov
>>> <aslan@xxxxxx>; Michal Hocko <mhocko@xxxxxxxxxx>; Andrew Morton
>>> <akpm@xxxxxxxxxxxxxxxxxxxx>; Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>>> Subject: [PATCH] cma: make number of CMA areas dynamic, remove
>>> CONFIG_CMA_AREAS
>>>
>>> The number of distinct CMA areas is limited by the constant
>>> CONFIG_CMA_AREAS. In most environments, this was set to a default
>>> value of 7. Not too long ago, support was added to allocate hugetlb
>>> gigantic pages from CMA. More recent changes to make dma_alloc_coherent
>>> NUMA-aware on arm64 added more potential users of CMA areas. Along
>>> with the dma_alloc_coherent changes, the default value of CMA_AREAS
>>> was bumped up to 19 if NUMA is enabled.
>>>
>>> It seems that the number of CMA users is likely to grow. Instead of
>>> using a static array for cma areas, use a simple linked list. These
>>> areas are used before normal memory allocators, so use the memblock
>>> allocator.
>>>
>>> Acked-by: Roman Gushchin <guro@xxxxxx>
>>> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>>> ---
>>> rfc->v1
>>> - Made minor changes suggested by Song Bao Hua (Barry Song)
>>> - Removed check for late calls to cma_init_reserved_mem that was part
>>>   of RFC.
>>> - Added ACK from Roman Gushchin
>>> - Still in need of arm testing
>>
>> Unfortunately, the test result on my arm64 board is negative, Linux can't
>> boot after applying this patch.
>>
>> I guess we have to hold on this patch for a while till this is fixed. BTW,
>> Mike, do you have a qemu-based arm64 numa system to debug? It is very easy
>> to reproduce, we don't need to use hugetlb_cma and pernuma_cma. Just the
>> default cma will make the boot hang.
>
> Hi Mike,
> I spent some time on debugging the boot issue and sent a patch here:
> https://lore.kernel.org/linux-mm/20200916085933.25220-1-song.bao.hua@xxxxxxxxxxxxx/
> All details and knic oops can be found there.
> pls feel free to merge my patch into your v2 if you want. And we probably
> need ack from arm maintainers.
>
> Also, +Will,
>
> Hi Will, the whole story is that Mike tried to remove the cma array with
> CONFIG_CMA_AREAS and moved to use memblock_alloc() to allocate cma area,
> so that the number of cma areas could be dynamic. It turns out it causes
> a kernel panic on arm64 during system boot as the returned address from
> memblock_alloc is invalid before paging_init() is done on arm64.
>

Thank you!

Based on your analysis, I am concerned that other architectures may also
have issues.

Andrew, I suggest we remove this patch from your tree. I will audit all
architectures which enable CMA and look for similar issues there. I will
then merge Barry's patch into a V2 along with any other arch-specific
changes.

--
Mike Kravetz