Most BPF programs are small, but they consume a page each. For systems
with busy traffic and many BPF programs, this can add significant
pressure on the instruction TLB. High iTLB pressure usually slows down
the whole system, causing visible performance degradation for production
workloads.

bpf_prog_pack, a customized allocator that packs multiple BPF programs
into preallocated memory chunks, was proposed [1] to address it. This
series extends that support to powerpc.

The first patch introduces the patch_instructions() function to enable
patching more than one instruction at a time. This change showed around
5X improvement in the time taken to run the test_bpf test cases.
Patches 2 & 3 add the arch-specific functions needed to support this
feature. Patch 4 enables the support for powerpc and ensures cleanup is
handled gracefully. Tested the changes successfully.

[1] https://lore.kernel.org/bpf/20220204185742.271030-1-song@xxxxxxxxxx/
[2] https://lore.kernel.org/all/20221110184303.393179-1-hbathini@xxxxxxxxxxxxx/

Changes in v2:
* Introduced patch_instructions() to help with patching bpf programs.

Hari Bathini (4):
  powerpc/code-patching: introduce patch_instructions()
  powerpc/bpf: implement bpf_arch_text_copy
  powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack
  powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free]

 arch/powerpc/include/asm/code-patching.h |   1 +
 arch/powerpc/lib/code-patching.c         | 151 ++++++++++++++++-------
 arch/powerpc/net/bpf_jit.h               |   7 +-
 arch/powerpc/net/bpf_jit_comp.c          | 142 ++++++++++++++++-----
 arch/powerpc/net/bpf_jit_comp32.c        |   4 +-
 arch/powerpc/net/bpf_jit_comp64.c        |   6 +-
 6 files changed, 226 insertions(+), 85 deletions(-)

-- 
2.39.2