On Thu, May 2, 2024 at 10:37 AM Puranjay Mohan <puranjay@xxxxxxxxxx> wrote: > > > Hi Everyone, > > While working on inlining bpf_get_smp_processor_id() in the ARM64 and > RISCV JITs, I realized that these archs allow such optimizations because > they keep some information like the per-cpu offset or the pointer to the > task_struct in special system registers. > > So, I went through the list of all BPF helpers and made a list of > helpers that we can inline in these JITs to make their usage much more > optimized: > > I. ARM64 and RISC-V specific optimzations if inlined: > > A) Because pointer to tast_struct is available in a register: > 1. bpf_get_current_pid_tgid() > 2. bpf_get_current_task() These two are used really frequently, so it might make sense to optimize them (and also bpf_get_current_task_btf(), of course), if others agree with me. > 3. bpf_set_retval() > 4. bpf_get_retval() > 5. bpf_task_pt_regs() I'm leaning towards saying that probably not, unless we have a really good reason to. Inlining is not free in terms of code maintenance and complexity, so I wouldn't go and inline everything possible. But maybe others have another opinion. > 6. bpf_get_attach_cookie() definitely no, there are multiple implementations depending on specific program type > > B) Because per_cpu offset is available in a register: > 1. bpf_this_cpu_ptr() maybe, but I don't think we inline at BPF instruction level, so inlining in BPF JIT seems premature > 2. bpf_get_numa_node_id() I'm not sure how actively this is used, so I'd say no to this one as well. > > These can be inlined in the verifier too using the newly > introduced per-cpu instruction. yep, I'd start with doing BPF assembly inlining for bpf_this_cpu_ptr/bpf_per_cpu_ptr, tbh > > II. These are very basic writes, can be inlined in the verifier or the JIT: > 1. bpf_msg_apply_bytes() > 2. bpf_msg_cork_bytes() > 3. bpf_set_hash_invalid() I'd say this is also going overboard with inlining. > > I will first try to inline all these in the ARM64 JIT and see the > performance improvement. I am not sure what would be the best way to > benchmark all of this inlining. > > Andrii, can you suggest something for the benchmarking? > > Looking forward to your thoughts on this. > > Thanks, > Puranjay