This series presents one of the previously discussed approaches to
re-enable HugeTLB Vmemmap Optimization (HVO) on arm64.

HVO was disabled by commit 060a2c92d1b6 ("arm64: mm: hugetlb: Disable
HUGETLB_PAGE_OPTIMIZE_VMEMMAP") due to the following reason:

  This is deemed UNPREDICTABLE by the Arm architecture without a
  break-before-make sequence (make the PTE invalid, TLBI, write the
  new valid PTE). However, such sequence is not possible since the
  vmemmap may be concurrently accessed by the kernel.

Other approaches that have been discussed include:
  A. Handle kernel PF while doing BBM [1],
  B. Use stop_machine() while doing BBM [2], and,
  C. Enable FEAT_BBM level 2 and keep the memory contents at the old
     and new output addresses unchanged to avoid BBM (D8.16.1-2) [3].

A quick comparison between this approach (D) and the above approaches:

--+------------------------------+-----------------------------+
  |              Pro             |             Con             |
--+------------------------------+-----------------------------+
A | Low latency, h/w independent | Predictability concerns [4] |
B | Predictable, h/w independent | High latency                |
C | Predictable, low latency     | H/w dependent, complex      |
D | Predictable, h/w independent | Medium latency              |
--+------------------------------+-----------------------------+

This approach is being tested for Google's production systems, where
the "con" above is generally acceptable, making it the preferred
tradeoff for our use cases:

+------------------------------+------------+----------+--------+
| HugeTLB operations           | Before [0] | After    | Change |
+------------------------------+------------+----------+--------+
| Alloc 600 1GB                | 0m3.526s   | 0m3.779s | +7%    |
| Free 600 1GB                 | 0m0.880s   | 0m0.940s | +7%    |
| Demote 600 1GB to 307200 2MB | 0m1.575s   | 0m5.132s | +226%  |
| Free 307200 2MB              | 0m0.946s   | 0m4.456s | +371%  |
+------------------------------+------------+----------+--------+

[0] For comparison purposes, "Before" only includes the last patch in
    the series, i.e.,
    CONFIG_ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP=y.

[1] https://lore.kernel.org/20240113094436.2506396-1-sunnanyong@xxxxxxxxxx/
[2] https://lore.kernel.org/ZbKjHHeEdFYY1xR5@xxxxxxx/
[3] https://lore.kernel.org/Zo68DP6siXfb6ZBR@xxxxxxx/
[4] https://lore.kernel.org/20240326125409.GA9552@willie-the-truck/

Yu Zhao (6):
  mm/hugetlb_vmemmap: batch update PTEs
  mm/hugetlb_vmemmap: add arch-independent helpers
  irqchip/gic-v3: support SGI broadcast
  arm64: broadcast IPIs to pause remote CPUs
  arm64: pause remote CPUs to update vmemmap
  arm64: select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP

 arch/arm64/Kconfig               |   1 +
 arch/arm64/include/asm/pgalloc.h |  69 ++++++++
 arch/arm64/include/asm/smp.h     |   3 +
 arch/arm64/kernel/smp.c          |  92 ++++++++++-
 drivers/irqchip/irq-gic-v3.c     |  20 ++-
 include/linux/mm_types.h         |   7 +
 mm/hugetlb_vmemmap.c             | 262 +++++++++++++++++++++----------
 7 files changed, 360 insertions(+), 94 deletions(-)


base-commit: 42f7652d3eb527d03665b09edac47f85fb600924
--
2.47.0.rc1.288.g06298d1525-goog
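
For readers unfamiliar with the break-before-make (BBM) sequence quoted
above (make the PTE invalid, TLBI, write the new valid PTE), the three
steps can be sketched in user space as below. This is only an
illustrative model, not code from this series: sim_pte_t,
SIM_PTE_VALID, sim_tlb_invalidate() and sim_bbm_update() are
hypothetical names, and the "TLBI" is a counter rather than a real
invalidation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a "PTE" is a 64-bit word; bit 0 marks it valid. */
typedef uint64_t sim_pte_t;
#define SIM_PTE_VALID 1ULL

/* Counts simulated TLB invalidations (stand-in for TLBI plus barriers). */
static unsigned int sim_tlbi_count;

static void sim_tlb_invalidate(void)
{
	sim_tlbi_count++;
}

/*
 * Break-before-make: never switch directly from one valid entry to
 * another. The output address is assumed to be aligned so that bit 0
 * is free for the valid flag.
 */
static void sim_bbm_update(volatile sim_pte_t *ptep, uint64_t new_out_addr)
{
	assert((new_out_addr & SIM_PTE_VALID) == 0);

	*ptep = 0;                            /* 1. break: make the PTE invalid */
	sim_tlb_invalidate();                 /* 2. TLBI */
	*ptep = new_out_addr | SIM_PTE_VALID; /* 3. make: write the new valid PTE */
}
```

Between steps 1 and 3 the entry is invalid, so any concurrent access
through it would fault. That window is why a plain BBM sequence is not
possible on the vmemmap while other CPUs may be accessing it.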