On Sat, Jan 18, 2025 at 12:00 PM Juan Yescas <jyescas@xxxxxxxxxx> wrote:
>
> + iamjoonsoo.kim@xxxxxxx
> + quic_charante@xxxxxxxxxxx
>
> On Fri, Jan 17, 2025 at 2:52 PM Juan Yescas <jyescas@xxxxxxxxxx> wrote:
> >
> > +Suren Baghdasaryan
> > +Kalesh Singh
> > +T.J. Mercier
> > +Isaac Manjarres
> >
> > On Fri, Jan 17, 2025 at 2:51 PM Juan Yescas <jyescas@xxxxxxxxxx> wrote:
> > >
> > > Hi Linux memory team,
> > >
> > > When drivers reserve CMA memory on 16KiB kernels, the minimum
> > > alignment is 32 MiB, as per CMA_MIN_ALIGNMENT_BYTES. However, on
> > > 4KiB kernels, the CMA alignment is 4 MiB.
> > >
> > > This forces drivers to reserve more memory on 16KiB kernels, even
> > > if they only require 4 MiB or 8 MiB:
> > >
> > > reserved-memory {
> > >     #address-cells = <2>;
> > >     #size-cells = <2>;
> > >     ranges;
> > >
> > >     tpu_cma_reserve: tpu_cma_reserve {
> > >         compatible = "shared-dma-pool";
> > >         reusable;
> > >         size = <0x0 0x2000000>; /* 32 MiB */
> > >     };
> > > };
> > >
> > > One workaround to continue using 4 MiB alignment is:
> > >
> > > - Disable CONFIG_TRANSPARENT_HUGEPAGE so the buddy allocator does
> > >   NOT have to allocate huge pages (32 MiB with 16KiB pages)
> > > - Set ARCH_FORCE_MAX_ORDER for ARM64_16K_PAGES to "8" instead of
> > >   "11", so CMA_MIN_ALIGNMENT_BYTES is equal to 4 MiB
> > >
> > > config ARCH_FORCE_MAX_ORDER
> > >     int
> > >     default "13" if ARM64_64K_PAGES
> > >     default "8" if ARM64_16K_PAGES
> > >     default "10"
> > >
> > > #define MAX_PAGE_ORDER CONFIG_ARCH_FORCE_MAX_ORDER    // 8
> > > #define pageblock_order MAX_PAGE_ORDER                // 8
> > > #define pageblock_nr_pages (1UL << pageblock_order)   // 256
> > > #define CMA_MIN_ALIGNMENT_PAGES pageblock_nr_pages    // 256
> > > #define CMA_MIN_ALIGNMENT_BYTES (PAGE_SIZE * CMA_MIN_ALIGNMENT_PAGES)
> > >                             // 16384 * 256 = 4194304 = 4 MiB
> > >
> > > After compiling the kernel with these changes, the kernel boots
> > > without warnings and the memory is reserved:
> > >
> > > [    0.000000] Reserved memory: created CMA
> > > memory pool at 0x000000007f800000, size 8 MiB
> > > [    0.000000] OF: reserved mem: initialized node tpu_cma_reserve,
> > > compatible id shared-dma-pool
> > > [    0.000000] OF: reserved mem:
> > > 0x000000007f800000..0x000000007fffffff (8192 KiB) map reusable
> > > tpu_cma_reserve
> > >
> > > # uname -a
> > > Linux buildroot 6.12.9-dirty
> > > # zcat /proc/config.gz | grep ARM64_16K
> > > CONFIG_ARM64_16K_PAGES=y
> > > # zcat /proc/config.gz | grep TRANSPARENT_HUGE
> > > CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> > > # CONFIG_TRANSPARENT_HUGEPAGE is not set
> > > # cat /proc/pagetypeinfo
> > > Page block order: 8
> > > Pages per block:  256
> > >
> > > Free pages count per migrate type at order      0    1    2    3    4    5    6    7    8
> > > Node 0, zone DMA, type    Unmovable             1    1   13    6    5    2    0    0    1
> > > Node 0, zone DMA, type      Movable             9   16   19   13   13    5    2    0  182
> > > Node 0, zone DMA, type  Reclaimable             0    1    0    1    1    0    0    1    0
> > > Node 0, zone DMA, type   HighAtomic             0    0    0    0    0    0    0    0    0
> > > Node 0, zone DMA, type          CMA             1    0    0    0    0    0    0    0   49
> > > Node 0, zone DMA, type      Isolate             0    0    0    0    0    0    0    0    0
> > > Number of blocks type  Unmovable  Movable  Reclaimable  HighAtomic  CMA  Isolate
> > > Node 0, zone DMA               6      199            1           0   50        0
> > >
> > > However, with this workaround, we can't use transparent huge pages.

I don't think this is accurate. You can still use mTHP with a size equal
to or smaller than 4 MiB, right?

By the way, what specific regression have you observed when reserving a
larger size like 32 MiB? For CMA, the over-reserved memory is still
available to the system for movable folios. 28 MiB doesn't seem
significant enough to cause a noticeable regression, does it?

> > > Is the CMA_MIN_ALIGNMENT_BYTES alignment requirement only there to
> > > support huge pages?
> > > Is there another option to reduce the CMA_MIN_ALIGNMENT_BYTES alignment?
> > >
> > > Thanks
> > > Juan

Thanks
Barry