Re: On inlining more helpers in the JITs or the verifier

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 2, 2024 at 2:22 PM Andrii Nakryiko
<andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Thu, May 2, 2024 at 10:37 AM Puranjay Mohan <puranjay@xxxxxxxxxx> wrote:
> >
> >
> > Hi Everyone,
> >
> > While working on inlining bpf_get_smp_processor_id() in the ARM64 and
> > RISCV JITs, I realized that these archs allow such optimizations because
> > they keep some information like the per-cpu offset or the pointer to the
> > task_struct in special system registers.
> >
> > So, I went through the list of all BPF helpers and made a list of
> > helpers that we can inline in these JITs to make their usage much more
> > optimized:
> >
> > I. ARM64 and RISC-V specific optimzations if inlined:
> >
> >     A) Because pointer to tast_struct is available in a register:
> >         1. bpf_get_current_pid_tgid()
> >         2. bpf_get_current_task()
>
> These two are used really frequently, so it might make sense to
> optimize them (and also bpf_get_current_task_btf(), of course), if
> others agree with me.
>
> >         3. bpf_set_retval()
> >         4. bpf_get_retval()
> >         5. bpf_task_pt_regs()
>
> I'm leaning towards saying that probably not, unless we have a really
> good reason to. Inlining is not free in terms of code maintenance and
> complexity, so I wouldn't go and inline everything possible. But maybe
> others have another opinion.
>
>
> >         6. bpf_get_attach_cookie()
>
> definitely no, there are multiple implementations depending on
> specific program type
>
> >
> >     B) Because per_cpu offset is available in a register:
> >         1. bpf_this_cpu_ptr()
>
> maybe, but I don't think we inline at BPF instruction level, so
> inlining in BPF JIT seems premature
>
>
> >         2. bpf_get_numa_node_id()
>
> I'm not sure how actively this is used, so I'd say no to this one as well.
>
> >
> >         These can be inlined in the verifier too using the newly
> >         introduced per-cpu instruction.
>
> yep, I'd start with doing BPF assembly inlining for
> bpf_this_cpu_ptr/bpf_per_cpu_ptr, tbh
>
> >
> > II. These are very basic writes, can be inlined in the verifier or the JIT:
> >     1. bpf_msg_apply_bytes()
> >     2. bpf_msg_cork_bytes()
> >     3. bpf_set_hash_invalid()
>
> I'd say this is also going overboard with inlining.

+1

simplicity of logic is not a reason to inline it.
I would only inline bpf_get_current_task[_btf]() and do it
in the verifier. JITs should inline only if perf delta
is really significant.
I hope bpf_get_smp_processor_id() will be the only such example.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux