Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls

On Wed, 2024-07-10 at 10:50 -0700, Andrii Nakryiko wrote:

[...]

> > > Ok, I see, this wasn't obvious that you want this behavior. I actually
> > > find it surprising (and so at least let's call this possibility out).
> > > 
> > > But, tbh, I'd make this stricter. I'd dictate that within a subprogram
> > > all no_csr patterns *should* end with the same stack offset and I
> > > would check for that (so what I mentioned before that instead of min
> > > or max we need assignment and check for equality if we already
> > > assigned it once).
> > 
> > This makes sense, but does not cover a theoretical corner case:
> > - suppose there are two nocsr functions A and B on the kernel side,
> >   but only A was marked as nocsr during program compilation;
> > - the spill/fill "bracket" was generated for a call to function B by
> >   the compiler (because this is just a valid codegen).
> 
> In this case the call to A shouldn't be changing nocsr_offset at all,
> though? You should find no spill/fill and thus even though A is
> *allowed* to use no_csr, it doesn't, so it should have no effect on
> nocsr offsets, no?

Consider the following code:

    *(u64 *)(r10 - 24) = r1;
    call A                     // kernel and clang know that A is nocsr
    r1 = *(u64 *)(r10 - 24);   // kernel assumes .nocsr_stack_offset -24
    ...
    *(u64 *)(r10 - 16) = r1;
    call B                     // only kernel knows that B is nocsr
    r1 = *(u64 *)(r10 - 16);   // with stricter logic this would disable
                               // nocsr for the whole sub-program

> But your example actually made me think about another (not theoretical
> at all) case. What if we have calls to A and B, the kernel is slightly
> old and knows that B is nocsr, but A is not. But the BPF program was
> compiled with the latest helper/kfunc definitions marking A as no_csr
> eligible (as well as B). (btw, THAT'S THE WORD for allow_csr --
> ELIGIBLE. csr_eligible FTW! but I digress...)
> 
> With the case above we'll disable nocsr for both A and B, right? That
> sucks, but not sure if we can do anything about that. (We can probably
> assume no_csr pattern and thus allow spill/fill and not disable nocsr
> in general, but not remove spill/fills... a bit more complication
> upfront for longer term extensibility.. not sure, maybe performance
> regression is a fine price, hmm)
> 
> So I take it back about unmarking csr in the *entire BPF program*,
> let's just limit it to the subprog scope. But I still think we should
> do it eagerly, rather than double checking in do_misc_followups().
> WDYT?

With the current algo this situation would indeed disable nocsr.
The problem is that the check for spilled stack slot usage is too simplistic.
However, this case could be covered if the check is performed using e.g. the
process I described earlier:
- for spill, remember the defining instruction in the stack slot structure;
- for fill, "merge" the defining instruction index;
- for other stack access mark defining instruction as escaping.
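The three steps above can be sketched as a small userspace toy (all names
and structures here are made up for illustration; the real verifier state
looks quite different):

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 8

/* toy per-stack-slot tracking state */
struct slot_state {
	int def_insn;	/* index of the insn that spilled into this slot, -1 if none */
	bool escaped;	/* some non-fill access saw the spilled value */
};

static struct slot_state slots[MAX_SLOTS];

static void slots_init(void)
{
	for (int i = 0; i < MAX_SLOTS; i++) {
		slots[i].def_insn = -1;
		slots[i].escaped = false;
	}
}

/* spill: remember the defining instruction in the stack slot structure */
static void track_spill(int slot, int insn_idx)
{
	slots[slot].def_insn = insn_idx;
	slots[slot].escaped = false;
}

/* fill: "merge" back the defining instruction index, pairing the fill
 * with its spill */
static int track_fill(int slot)
{
	return slots[slot].def_insn;
}

/* any other stack access: mark the defining instruction as escaping,
 * which would disqualify the spill/fill pair as a nocsr pattern */
static void track_other_access(int slot)
{
	if (slot >= 0 && slot < MAX_SLOTS && slots[slot].def_insn != -1)
		slots[slot].escaped = true;
}
```

With this, a fill can be matched to its defining spill, and a spill that
is read by anything other than the paired fill is flagged as escaping.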

> > So I wanted to keep the nocsr check a little bit more permissive.
> > 
> > > Also, instead of doing that extra nocsr offset check in
> > > do_misc_fixups(), why don't we just reset all no_csr patterns within a
> > > subprogram *if we find a single violation*. Compiler shouldn't ever
> > > emit such code, right? So let's be strict and fallback to not
> > > recognizing nocsr.
> > > 
> > > And then we won't need that extra check in do_misc_fixups() because we
> > > eagerly unset no_csr flag and will never hit that piece of logic in
> > > patching.
> > 
> > I can do that, but the detector pass would have to be two-pass:
> > - on the first pass, find the nocsr_stack_off, add candidate insn marks;
> > - on the second pass, remove marks from insns with wrong stack access offset.
> 
> It's not really a second pass, it's part of normal validation.
> check_nocsr_stack_contract() will detect this and will do, yes, pass
> over all instructions of a subprogram to unmark them.

- on the first pass the true .nocsr_stack_off is not yet known,
  so .nocsr_pattern is set optimistically;
- on the second pass .nocsr_stack_off is already known,
  so .nocsr_pattern can be removed from spills/fills outside the range;
- check_nocsr_stack_contract() does not need to scan full sub-program
  on each violation, it can set a flag disabling nocsr in subprogram info.
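As a sketch, the flag-based variant of check_nocsr_stack_contract() could
look roughly like this (field and function shapes are hypothetical, not
the actual verifier structures):

```c
#include <assert.h>
#include <stdbool.h>

/* hypothetical subset of per-subprogram info */
struct subprog_info {
	bool keep_nocsr;	/* cleared on the first contract violation */
	int nocsr_stack_off;	/* lowest stack offset nocsr patterns may use */
};

/* instead of scanning all instructions of the subprogram on each
 * violation, just clear a flag; patching would later rewrite a pattern
 * only if both the per-insn mark and this flag are set */
static void check_nocsr_stack_contract(struct subprog_info *subprog, int off)
{
	if (off < subprog->nocsr_stack_off)
		subprog->keep_nocsr = false;
}
```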

> > Otherwise I can't discern true patterns from false positives in
> > situation described above.
> 
> See above, I might be missing what your "theoretical" example changes.
> 
> > 
> > > I'd go even further and say that on any nocsr invariant violation, we
> > > just go and reset all nocsr pattern flags across entire BPF program
> > > (all subprogs including). In check_nocsr_stack_contract() I mean. Just
> > > have a loop over all instructions and set that flag to false.
> > > 
> > > Why not? What realistic application would need to violate nocsr in
> > > some subprogs but not in others?
> > > 
> > > KISS or it's too simplistic for some reason?
> > 
> > I can invalidate nocsr per-program.
> > If so, I would use an "allow_nocsr" flag in program aux to avoid
> > additional passes over instructions.
> 
> yep, see above (but let's keep it per-subprog for now)

Ok

> > > > > > bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> > > > > > Here I need to replace many instructions with a single instruction.
> > > > > > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.
> > > > > 
> > > > > ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I
> > > > > thought that it does range for range replacement of instructions.
> > > > > Never mind then (it's a bit sad that we don't have a proper flexible
> > > > > and powerful patching primitive, though, but oh well).
> > > > 
> > > > That is true.
> > > > I'll think about it once other issues with this patch-set are resolved.
> > > 
> > > yeah, it's completely unrelated, but having a bpf_patch_insns that
> > > takes input instruction range to be replaced and replacing it with
> > > another set of instructions would cover all existing use cases and
> > > some more. We wouldn't need verifier_remove_insns(), because that's
> > > just replacing existing instructions with empty new set of
> > > instructions. We would now have "insert instructions" primitive if we
> > > specify empty input range of instructions. If we add a flag whether to
> > > adjust jump offsets and make it configurable, we'd need the thing that
> > > Alexei needed for may_goto without any extra hacks.
> > > 
> > > Furthermore, I find it wrong and silly that we keep having manual
> > > delta+insn adjustments in every single piece of patching logic. I
> > > think this should be done by bpf_patch_insns(). We need to have a
> > > small "insn_patcher" struct that we use during instruction patching,
> > > and a small instruction iterator on top that would keep returning next
> > > instruction to process (and its index, probably). This will allow us
> > > to optimize patching and instead of doing O(N) we can have something
> > > faster and smarter (if we hide direct insn array accesses during
> > > patching).
> > > 
> > > But this is all a completely separate story from all of this.
> > 
> > I actually suggested something along these lines a very long time ago:
> > https://lore.kernel.org/bpf/5c297e5009bcd4f0becf59ccd97cfd82bb3a49ec.camel@xxxxxxxxx/
> 
> We had a few discussions around patching improvements (you can see I
> mentioned that in that thread as well). We all like to talk about
> improving this, but not really doing anything about this. :) Tech debt
> at its best.

If you are not completely opposed to that idea, I can follow up with a
complete version as an RFC.
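For reference, the range-replace primitive discussed above could behave
like this userspace toy (heavily simplified: plain arrays, no jump-offset
or aux-data adjustment; the real kernel API would differ):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct insn { int code; };	/* stand-in for struct bpf_insn */

/* replace insns [off, off + cnt) in *arr (len entries) with patch
 * (patch_len entries); cnt == 0 inserts, patch_len == 0 removes.
 * Returns the new length; *arr is reallocated. */
static int patch_insns(struct insn **arr, int len, int off, int cnt,
		       const struct insn *patch, int patch_len)
{
	int new_len = len - cnt + patch_len;
	struct insn *out = malloc(sizeof(*out) * (new_len ? new_len : 1));

	memcpy(out, *arr, sizeof(*out) * off);
	memcpy(out + off, patch, sizeof(*out) * patch_len);
	memcpy(out + off + patch_len, *arr + off + cnt,
	       sizeof(*out) * (len - off - cnt));
	free(*arr);
	*arr = out;
	return new_len;
}
```

One primitive then covers replace, remove (empty patch), and insert
(empty input range), as suggested in the discussion above.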




