Hi, playing with anonymous CONT hugetlb pages on aarch64, I stumbled over the following VM_BUG_ON: [ 124.770288] ------------[ cut here ]------------ [ 124.774899] kernel BUG at mm/mmu_gather.c:70! [ 124.779244] Internal error: Oops - BUG: 0 [#1] SMP [ 124.784022] Modules linked in: mlx4_ib ib_uverbs ib_core mlx4_en rfkill vfat fat acpi_ipmi joydev ipmi_ssif igb mlx4_core ipmi_devintf ipmi_msghandler cppc_cpufreq fuse zram ip_tables xfs uas usb_storage dwc3 ulpi ast udc_core i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm crct10dif_ce drm ghash_ce sbsa_gwdt i2c_xgene_slimpro xgene_hwmon ahci_platform gpio_dwapb xhci_plat_hcd [ 124.823045] CPU: 16 PID: 1160 Comm: test Not tainted 5.16.11-200.fc35.aarch64 #1 [ 124.830428] Hardware name: Lenovo HR350A 7X35CTO1WW /HR350A , BIOS hve104r-1.15 02/26/2021 [ 124.840240] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 124.847189] pc : __tlb_remove_page_size+0x88/0xe4 [ 124.851885] lr : __unmap_hugepage_range+0x260/0x504 [ 124.856751] sp : ffff80000f6f3ae0 [ 124.860053] x29: ffff80000f6f3ae0 x28: ffff00080b639d24 x27: ffff000802504080 [ 124.867176] x26: fffffc00210f8000 x25: 0000000000000000 x24: ffff80000a9e8750 [ 124.874299] x23: 0000ffff8da20000 x22: ffff000804f0c190 x21: 0000000000010000 [ 124.881423] x20: ffff80000f6f3cb0 x19: ffff80000f6f3cb0 x18: 0000000000000000 [ 124.888545] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 124.895668] x14: 0000000000000000 x13: 0008000000000000 x12: 0008000000000080 [ 124.902791] x11: 0008000000000000 x10: 00f80008c3e00f43 x9 : ffff800008404e60 [ 124.909914] x8 : 0846000000000000 x7 : 0000000000000000 x6 : ffff80000a8a4000 [ 124.917038] x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000000001000 [ 124.924161] x2 : 0000000000010000 x1 : fffffc00210f8000 x0 : 0000000000000000 [ 124.931284] Call trace: [ 124.933718] __tlb_remove_page_size+0x88/0xe4 [ 124.938062] __unmap_hugepage_range+0x260/0x504 [ 124.942580] __unmap_hugepage_range_final+0x24/0x40 [ 124.947445] unmap_single_vma+0x100/0x11c [ 124.951443] unmap_vmas+0x7c/0xf4 [ 124.954746] unmap_region+0xa4/0xf0 [ 124.958222] __do_munmap+0x1b8/0x50c [ 124.961785] __vm_munmap+0x74/0x120 [ 124.965261] __arm64_sys_munmap+0x40/0x54 [ 124.969257] invoke_syscall+0x50/0x120 [ 124.972995] el0_svc_common.constprop.0+0x4c/0x100 [ 124.977774] do_el0_svc+0x34/0xa0 [ 124.981077] el0_svc+0x30/0xd0 [ 124.984120] el0t_64_sync_handler+0xa4/0x130 [ 124.988377] el0t_64_sync+0x1a4/0x1a8 [ 124.992028] Code: b4000140 f9001660 29410402 17fffff4 (d4210000) [ 124.998109] ---[ end trace a74a76b89c9f2d88 ]--- [ 125.002900] ------------[ cut here ]------------ I'm running with 64k hugetlb on 4k aarch64. Reproducer: #define _GNU_SOURCE #include <string.h> #include <unistd.h> #include <sys/mman.h> #include <linux/memfd.h> void main(void) { const size_t size = 64*1024; unsigned long cur; char *area; int fd; fd = memfd_create("test", MFD_HUGETLB | MFD_HUGE_64KB); ftruncate(fd, size); area = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0); memset(area, 0, size); munmap(area, size); } I assume __unmap_hugepage_range() does a a) tlb_remove_huge_tlb_entry() -> for sz != PMD_SIZE and sz != PUD_SIZE, this calls __tlb_remove_tlb_entry() -> __tlb_remove_tlb_entry() is a NOP on aarch64. __tlb_adjust_range() isn't called. b) tlb_remove_page_size() -> __tlb_remove_page_size() runs into VM_BUG_ON(!tlb->end); Not sure if this is just "ok" and we don't have to adjust the range or if there is some tlb range adjustment missing. -- Thanks, David / dhildenb