Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls

Eduard Zingerman <eddyz87@xxxxxxxxx> · Wed, 10 Jul 2024 09:15:00 -0700

On Wed, 2024-07-10 at 08:36 -0700, Andrii Nakryiko wrote:

[...]

> > I can rewrite it like below:
> > 
> >                 if (stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> >                     ldx->code != (BPF_LDX | BPF_MEM | BPF_DW))
> 
> I'd add stx->dst_reg != BPF_REG_10 and ldx->src_reg != BPF_REG_10
> checks here and preserve original comments with instruction assembly
> form.
> 
> (think about this as establishing that we are looking at the correct
> shapes of instructions)
> 
> >                         break;
> >                 off = off != 0 ?: stx->off; // or use full 'if' block from the old version
> >                 if (stx->dst_reg != BPF_REG_10 ||
> >                     ldx->src_reg != BPF_REG_10 ||
> >                     /* check spill/fill for the same reg and offset */
> >                     stx->src_reg != ldx->dst_reg ||
> >                     stx->off != ldx->off ||
> >                     stx->off != off ||
> >                     (off % BPF_REG_SIZE) != 0 ||
> >                     /* this should be a previously unseen register */
> >                     (BIT(stx->src_reg) & reg_mask))
> 
> and here we are checking all the correctness and additional imposed
> semantical invariants knowing full well that we are dealing with
> correct STX/LDX shapes
> 
> >                         break;
> > 
> > But I'm not sure this adds any actual clarity.
> 
> I think it makes sense.

Ok, will change.

[...]

> Ok, I see, this wasn't obvious that you want this behavior. I actually
> find it surprising (and so at least let's call this possibility out).
> 
> But, tbh, I'd make this stricter. I'd dictate that within a subprogram
> all no_csr patterns *should* end with the same stack offset and I
> would check for that (so what I mentioned before that instead of min
> or max we need assignment and check for equality if we already
> assigned it once).

This makes sense, but does not cover a theoretical corner case:
- suppose there are two nocsr functions A and B on the kernel side,
  but only was A marked as nocsr during program compilation;
- the spill/fill "bracket" was generated for a call to function B by
  the compiler (because this is just a valid codegen).

So I wanted to keep the nocsr check a little bit more permissive.

> Also, instead of doing that extra nocsr offset check in
> do_misc_fixups(), why don't we just reset all no_csr patterns within a
> subprogram *if we find a single violation*. Compiler shouldn't ever
> emit such code, right? So let's be strict and fallback to not
> recognizing nocsr.
> 
> And then we won't need that extra check in do_misc_fixups() because we
> eagerly unset no_csr flag and will never hit that piece of logic in
> patching.

I can do that, but the detector pass would have to be two pass:
- on the first pass, find the nocsr_stack_off, add candidate insn marks;
- on the second pass, remove marks from insns with wrong stack access offset.

Otherwise I can't discern true patterns from false positives in
situation described above.

> I'd go even further and say that on any nocsr invariant violation, we
> just go and reset all nocsr pattern flags across entire BPF program
> (all subprogs including). In check_nocsr_stack_contract() I mean. Just
> have a loop over all instructions and set that flag to false.
> 
> Why not? What realistic application would need to violate nocsr in
> some subprogs but not in others?
> 
> KISS or it's too simplistic for some reason?

I can invalidate nocsr per-program.
If so, I would use an "allow_nocsr" flag in program aux to avoid
additional passes over instructions.

[...]

> > > > bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> > > > Here I need to replace many instructions with a single instruction.
> > > > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.
> > > 
> > > ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I
> > > thought that it does range for range replacement of instructions.
> > > Never mind then (it's a bit sad that we don't have a proper flexible
> > > and powerful patching primitive, though, but oh well).
> > 
> > That is true.
> > I'll think about it once other issues with this patch-set are resolved.
> 
> yeah, it's completely unrelated, but having a bpf_patch_insns that
> takes input instruction range to be replaced and replacing it with
> another set of instructions would cover all existing use cases and
> some more. We wouldn't need verifier_remove_insns(), because that's
> just replacing existing instructions with empty new set of
> instructions. We would now have "insert instructions" primitive if we
> specify empty input range of instructions. If we add a flag whether to
> adjust jump offsets and make it configurable, we'd need the thing that
> Alexei needed for may_goto without any extra hacks.
> 
> Furthermore, I find it wrong and silly that we keep having manual
> delta+insn adjustments in every single piece of patching logic. I
> think this should be done by bpf_patch_insns(). We need to have a
> small "insn_patcher" struct that we use during instruction patching,
> and a small instruction iterator on top that would keep returning next
> instruction to process (and its index, probably). This will allow us
> to optimize patching and instead of doing O(N) we can have something
> faster and smarter (if we hide direct insn array accesses during
> patching).
> 
> But this is all a completely separate story from all of this.

I actually suggested something along these lines very long time ago:
https://lore.kernel.org/bpf/5c297e5009bcd4f0becf59ccd97cfd82bb3a49ec.camel@xxxxxxxxx/