This patch shrinks the 64-bit TLB handler and fixes a vmalloc bug at the
same time.

By combining swapper_pg_dir and module_pg_dir, several checks in the TLB
handler, particularly in build_get_pgd_vmalloc64, become unnecessary. The
two can be combined because the effective virtual addresses returned by
vmalloc lie at the bottom of the range and those returned by module_alloc
lie at the top.

In the normal case of 4KB page size:

  VMALLOC_START, VMALLOC_END   0xc0000000 00000000 - 0xc0000100 00000000
  MODULE_START,  MODULE_END    0xffffffff c0000000 - 0xffffffff +xxxxxxx

Change it to:

  VMALLOC_START, VMALLOC_END   0xc0000000 00000000 - 0xc00000ff 00000000
  MODULE_START,  MODULE_END    0xffffffff c0000000 - 0xffffffff +xxxxxxx

Only the least significant 40 bits are used to traverse the page table, so
with this change the mapping stays one-to-one without extra checks. "+" is
in the range [c,d,e,f], so there are even big holes between the two ranges.

With this patch, the TLB refill handler contains only about 28
instructions, instead of the original 38.

This patch also fixes a bug in vmalloc, which occurs when the returned
address is not covered by the first pgd entry. For example, suppose we do
two vmallocs, where the first returned address is 0xc0000000 00000000 and
the second is 0xc0000000 40000000:

  vmalloc -> __vmalloc_node -> __vmalloc_area_node -> map_vm_area ->
  pgd_offset_k

pgd_offset_k doesn't use the address to index the pgd; it just returns the
first entry:

  #define pgd_offset_k(address) \
	((address) >= MODULE_START ? module_pg_dir : pgd_offset(&init_mm, 0UL))

This is wrong: the two addresses are then mapped to the same pte. The bug
has not shown up in practice because even in the 4KB page case one pgd
entry covers 1GB, and the system does not appear to vmalloc that much area.
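The index arithmetic behind the description above can be checked with a small user-space sketch. It is not kernel code. It assumes a 4KB-page layout in which the 40 walked bits split into a 12-bit page offset, 9 pte bits, 9 pmd bits and 10 pgd bits, which matches the "one pgd covers 1GB" statement. The PGDIR_SHIFT and PTRS_PER_PGD values follow from that assumption, and the *_sketch function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4KB-page layout: 12 offset + 9 pte + 9 pmd + 10 pgd = 40 bits. */
#define PGDIR_SHIFT   30
#define PTRS_PER_PGD  1024

#define VMALLOC_START 0xc000000000000000ULL
#define VMALLOC_END   0xc00000ff00000000ULL /* value proposed by the patch */
#define MODULE_START  0xffffffffc0000000ULL

/* Index into a single shared pgd: only bits 30..39 of the address matter. */
unsigned pgd_index_sketch(uint64_t addr)
{
    return (unsigned)((addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/* Mimics the old pgd_offset_k() for non-module addresses: the address is
 * ignored and the first slot is always returned. */
unsigned old_pgd_index_sketch(uint64_t addr)
{
    (void)addr;
    return 0;
}

/* Self-check of the properties described in the text. */
void check_layout_sketch(void)
{
    uint64_t first  = 0xc000000000000000ULL; /* first vmalloc address  */
    uint64_t second = 0xc000000040000000ULL; /* second one, 1GB higher */

    /* Old behaviour: both addresses land in the same pgd slot (the bug). */
    assert(old_pgd_index_sketch(first) == old_pgd_index_sketch(second));

    /* Indexing by address separates them. */
    assert(pgd_index_sketch(first) != pgd_index_sketch(second));

    /* The highest vmalloc slot stays below the module slot, so one pgd
     * can serve both ranges without an extra range check. */
    assert(pgd_index_sketch(VMALLOC_END - 1) < pgd_index_sketch(MODULE_START));
}
```

Calling check_layout_sketch() from a test program should pass silently under these assumptions. With the old VMALLOC_END of 0xc0000100 00000000, the last vmalloc address would share a slot with the module range, which is why the upper bound is lowered.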