Problem:
=======
On arm64, block and section mappings are supported for building page tables. However, the whole linear mapping is currently forced to use base page mapping if CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled and the crashkernel kernel parameter is set. This lengthens the linear mapping process during bootup and causes severe performance degradation at run time.

Root cause:
==========
On arm64, crashkernel reservation relies on knowing the upper limit of the low memory zone, because it needs to reserve memory in that zone so that devices' DMA addressing in the kdump kernel can be satisfied. However, the limit varies on arm64, and it can only be decided once bootmem_init() is called.

We also need to map the crashkernel region with base page granularity when doing the linear mapping, because kdump needs to protect the crashkernel region via set_memory_valid(,0) after the kdump kernel is loaded. However, arm64 does not handle splitting an already built block or section mapping well, due to some CPU restriction [1]. And unfortunately, the linear mapping is done before bootmem_init().

To resolve the above conflict on arm64, the compromise is to force base page mapping for the entire linear mapping if crashkernel is set and CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled. Hence performance is sacrificed.

Solution:
=========
To fix the problem, we should always take 4G as the crashkernel low memory end when CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled. With this, we don't need to defer the crashkernel reservation until bootmem_init() is called to set arm64_dma_phys_limit. As soon as memblock init is done, we can conclude what the upper limit of the low memory zone is.

1) Both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, or memblock_start_of_DRAM() > 4G:
   limit = PHYS_ADDR_MAX+1 (corner cases)
2) CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled:
   limit = 4G (generic case)

Justification:
==============
In fact, the kdump kernel doesn't need to cover all peripherals' addressing bits. Only the device taken as the dump target needs to be taken care of, and only its addressing bits need to be satisfied. Currently there are two kinds of dumping: to a local storage disk, or through a network card to a remote storage server. This means only the storage disk or network card used as the dump target needs to be considered when checking whether the addressing bits are satisfied.

To save memory, we usually generate a kdump-specific initramfs that includes only the kernel modules necessary for the dump target devices. All other unnecessary kernel modules are excluded, and their corresponding devices won't be initialized during kdump kernel bootup.

So far, only Raspberry Pi 4 has some peripherals which can only address a 30-bit memory range, as reported in [2]. Devices on all other arm64 systems can address a 32-bit memory range. So by always taking 4G as the crashkernel low memory end, the only risk is that RPi4 has a storage disk or network card which can't address a 32-bit memory range, because such a device could be set as the dump target.

Even if RPi4 truly has storage devices or network cards which can only address a 30-bit memory range, it should be a corner case. We can document it, since crashkernel is mostly used as a feature on servers. Besides, RPi4 can still use crashkernel=xM@yM to specify a location that satisfies its addressing limit if it really has that kind of storage device or network card and kdump is expected.
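
The two cases for the low memory limit listed under "Solution" can be sketched in plain C as below. This is only an illustrative user-space sketch, not the code in the patches: sketch_crash_low_limit(), SKETCH_SZ_4G and SKETCH_NO_LIMIT are hypothetical names, the two booleans stand in for the CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 options, and dram_start stands in for the value of memblock_start_of_DRAM().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel definitions; not the kernel's own. */
#define SKETCH_SZ_4G    (UINT64_C(1) << 32)
/* Stands in for "PHYS_ADDR_MAX+1", which would wrap to 0 in 64 bits. */
#define SKETCH_NO_LIMIT UINT64_MAX

/*
 * Return the upper limit of the low memory zone used for the crashkernel
 * reservation, following the two cases described in the cover letter.
 */
uint64_t sketch_crash_low_limit(bool zone_dma, bool zone_dma32,
                                uint64_t dram_start)
{
    /* Case 1 (corner cases): no DMA zone, or DRAM starts above 4G. */
    if ((!zone_dma && !zone_dma32) || dram_start > SKETCH_SZ_4G)
        return SKETCH_NO_LIMIT;

    /* Case 2 (generic case): at least one DMA zone is enabled. */
    return SKETCH_SZ_4G;
}

/* Usage example: exercises one input per case. */
void sketch_crash_low_limit_selfcheck(void)
{
    /* DMA zones enabled, DRAM starting at 2G: the limit is 4G. */
    assert(sketch_crash_low_limit(true, true, UINT64_C(0x80000000))
           == SKETCH_SZ_4G);
    /* Both DMA zones disabled: no low memory limit applies. */
    assert(sketch_crash_low_limit(false, false, 0) == SKETCH_NO_LIMIT);
}
```

The point of the sketch is that every input is already known once memblock init is done, so nothing in the decision has to wait for bootmem_init().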

[1]
https://lore.kernel.org/all/YrIIJkhKWSuAqkCx@xxxxxxx/T/#u

[2]
[PATCH v6 0/4] Raspberry Pi 4 DMA addressing support
https://lore.kernel.org/linux-arm-kernel/20190911182546.17094-1-nsaenzjulienne@xxxxxxx/T/

======
Question to Nicolas:

Hi Nicolas,

In the cover letter of the patchset [2], you said that RPi4 has peripherals which can only address a 30-bit range. In the sentence below, do you mean that "the PCIe, V3D, GENET" can't address a 32-bit range, or that they have a wider view of the address space, the same as the 40-bit DMA channels? I am confused about that.

Also, can the storage device or network card on RPi4 address a 30-bit range or a 32-bit range? Do we have a document for that, or do you happen to know?

"""
The new Raspberry Pi 4 has up to 4GB of memory but most peripherals can
only address the first GB: their DMA address range is
0xc0000000-0xfc000000 which is aliased to the first GB of physical
memory 0x00000000-0x3c000000. Note that only some peripherals have these
limitations: the PCIe, V3D, GENET, and 40-bit DMA channels have a wider
view of the address space by virtue of being hooked up trough a second
interconnect.
"""

Baoquan He (2):
  arm64, kdump: enforce to take 4G as the crashkernel low memory end
  arm64: remove unneed defer_reserve_crashkernel() and crash_mem_map

 arch/arm64/include/asm/memory.h |  5 ----
 arch/arm64/mm/init.c            | 24 ++++++++-------
 arch/arm64/mm/mmu.c             | 53 ++++++++++++++-------------------
 3 files changed, 36 insertions(+), 46 deletions(-)

base-commit: 10d4879f9ef01cc6190fafe4257d06f375bab92c
-- 
2.34.1