Changes v3 => v4:
1. Shorten CC list on 4/8, so it is not dropped by the mailing list.

Changes v2 => v3:
1. Fix issues reported by kernel test robot <lkp@xxxxxxxxx>.

Changes v1 => v2:
1. Add WARN to set_vm_flush_reset_perms() on huge pages. (Rick Edgecombe)
2. Simplify select_bpf_prog_pack_size. (Rick Edgecombe)

As of 5.18-rc6, x86_64 uses bpf_prog_pack on 4kB pages. This set contains
a few followups:
  1/8 - 3/8 fill the unused part of bpf_prog_pack with illegal instructions.
  4/8 - 5/8 enable bpf_prog_pack on 2MB pages.

The primary goal of bpf_prog_pack is to reduce the iTLB miss rate and to
reduce fragmentation of the direct memory mapping. This leads to
non-trivial performance improvements.

For our web service production benchmark, bpf_prog_pack on 4kB pages
gives 0.5% to 0.7% more throughput than not using bpf_prog_pack.
bpf_prog_pack on 2MB pages gives 0.6% to 0.9% more throughput than not
using bpf_prog_pack. Note that 0.5% is a huge improvement for our fleet. I
believe this is also significant for other companies with many thousands
of servers.

bpf_prog_pack on 2MB pages may use slightly more memory on systems
without many BPF programs. However, such memory waste (<2MB) is within
the noise for modern x86_64 systems.
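To illustrate what 1/8 - 3/8 are after, here is a minimal user-space sketch of
filling the unused part of a pack with a trapping byte. It is not the code
from this series: the helper names are hypothetical, the buffer is ordinary
writable memory, and using 0xCC (x86 INT3) as the filler is an assumption
made for the example. In the kernel the pack is executable text, so the
series goes through text_poke_set and bpf_arch_text_invalidate rather than a
plain memset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 0xCC (x86 INT3) stands in for the "illegal instruction". */
#define FILL_BYTE 0xCC

/* Hypothetical helper: fill a freshly allocated pack with the filler. */
static void pack_init(uint8_t *pack, size_t size)
{
	memset(pack, FILL_BYTE, size);
}

/* Hypothetical helper: re-fill a region after its program is freed. */
static void pack_invalidate(uint8_t *pack, size_t size, size_t off, size_t len)
{
	/* Reject ranges that fall outside the pack. */
	assert(off <= size && len <= size - off);
	memset(pack + off, FILL_BYTE, len);
}

/* Count the bytes in the pack that do not hold the filler. */
static size_t pack_count_nonfill(const uint8_t *pack, size_t size)
{
	size_t n = 0;

	for (size_t i = 0; i < size; i++)
		if (pack[i] != FILL_BYTE)
			n++;
	return n;
}
```

With this scheme, a stray jump into a part of the pack that holds no program
hits the filler instead of leftover bytes.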
Song Liu (8):
  bpf: fill new bpf_prog_pack with illegal instructions
  x86/alternative: introduce text_poke_set
  bpf: introduce bpf_arch_text_invalidate for bpf_prog_pack
  module: introduce module_alloc_huge
  bpf: use module_alloc_huge for bpf_prog_pack
  vmalloc: WARN for set_vm_flush_reset_perms() on huge pages
  vmalloc: introduce huge_vmalloc_supported
  bpf: simplify select_bpf_prog_pack_size

 arch/x86/include/asm/text-patching.h |  1 +
 arch/x86/kernel/alternative.c        | 67 +++++++++++++++++++++++-----
 arch/x86/kernel/module.c             | 21 +++++++++
 arch/x86/net/bpf_jit_comp.c          |  5 +++
 include/linux/bpf.h                  |  1 +
 include/linux/moduleloader.h         |  5 +++
 include/linux/vmalloc.h              |  7 +++
 kernel/bpf/core.c                    | 43 ++++++++++--------
 kernel/module.c                      |  8 ++++
 mm/vmalloc.c                         |  5 +++
 10 files changed, 134 insertions(+), 29 deletions(-)

--
2.30.2