On Wed, Jul 10, 2024 at 9:15 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Wed, 2024-07-10 at 08:36 -0700, Andrii Nakryiko wrote: > > [...] > > > > I can rewrite it like below: > > > > > > if (stx->code != (BPF_STX | BPF_MEM | BPF_DW) || > > > ldx->code != (BPF_LDX | BPF_MEM | BPF_DW)) > > > > I'd add stx->dst_reg != BPF_REG_10 and ldx->src_reg != BPF_REG_10 > > checks here and preserve original comments with instruction assembly > > form. > > > > (think about this as establishing that we are looking at the correct > > shapes of instructions) > > > > > break; > > > off = off != 0 ?: stx->off; // or use full 'if' block from the old version > > > if (stx->dst_reg != BPF_REG_10 || > > > ldx->src_reg != BPF_REG_10 || > > > /* check spill/fill for the same reg and offset */ > > > stx->src_reg != ldx->dst_reg || > > > stx->off != ldx->off || > > > stx->off != off || > > > (off % BPF_REG_SIZE) != 0 || > > > /* this should be a previously unseen register */ > > > (BIT(stx->src_reg) & reg_mask)) > > > > and here we are checking all the correctness and additional imposed > > semantical invariants knowing full well that we are dealing with > > correct STX/LDX shapes > > > > > break; > > > > > > But I'm not sure this adds any actual clarity. > > > > I think it makes sense. > > Ok, will change. > > [...] > > > Ok, I see, this wasn't obvious that you want this behavior. I actually > > find it surprising (and so at least let's call this possibility out). > > > > But, tbh, I'd make this stricter. I'd dictate that within a subprogram > > all no_csr patterns *should* end with the same stack offset and I > > would check for that (so what I mentioned before that instead of min > > or max we need assignment and check for equality if we already > > assigned it once). > > This makes sense, but does not cover a theoretical corner case: > - suppose there are two nocsr functions A and B on the kernel side, > but only was A marked as nocsr during program compilation; > - the spill/fill "bracket" was generated for a call to function B by > the compiler (because this is just a valid codegen). In this case the call to A shouldn't be changing nocsr_offset at all, though? You should find no spill/fill and thus even though A is *allowed* to use no_csr, it doesn't, so it should have no effect on nocsr offsets, no? But your example actually made me think about another (not theoretical at all) case. What if we have calls to A and B, the kernel is slightly old and knows that B is nocsr, but A is not. But the BPF program was compiled with the latest helper/kfunc definitions marking A as no_csr eligible (as well as B). (btw, THAT'S THE WORD for allow_csr -- ELIGIBLE. csr_eligible FTW! but I digress...) With the case above we'll disable nocsr for both A and B, right? That sucks, but not sure if we can do anything about that. (We can probably assume no_csr pattern and thus allow spill/fill and not disable nocsr in general, but not remove spill/fills... a bit more complication upfront for longer term extensibility.. not sure, maybe performance regression is a fine price, hmm) So I take it back about unmarking csr in the *entire BPF program*, let's just limit it to the subprog scope. But I still think we should do it eagerly, rather than double checking in do_misc_followups(). WDYT? > > So I wanted to keep the nocsr check a little bit more permissive. > > > Also, instead of doing that extra nocsr offset check in > > do_misc_fixups(), why don't we just reset all no_csr patterns within a > > subprogram *if we find a single violation*. Compiler shouldn't ever > > emit such code, right? So let's be strict and fallback to not > > recognizing nocsr. > > > > And then we won't need that extra check in do_misc_fixups() because we > > eagerly unset no_csr flag and will never hit that piece of logic in > > patching. > > I can do that, but the detector pass would have to be two pass: > - on the first pass, find the nocsr_stack_off, add candidate insn marks; > - on the second pass, remove marks from insns with wrong stack access offset. It's not really a second pass, it's part of normal validation. check_nocsr_stack_contract() will detect this and will do, yes, pass over all instructions of a subprogram to unmark them. > > Otherwise I can't discern true patterns from false positives in > situation described above. See above, I might be missing what your "theoretical" example changes. > > > I'd go even further and say that on any nocsr invariant violation, we > > just go and reset all nocsr pattern flags across entire BPF program > > (all subprogs including). In check_nocsr_stack_contract() I mean. Just > > have a loop over all instructions and set that flag to false. > > > > Why not? What realistic application would need to violate nocsr in > > some subprogs but not in others? > > > > KISS or it's too simplistic for some reason? > > I can invalidate nocsr per-program. > If so, I would use an "allow_nocsr" flag in program aux to avoid > additional passes over instructions. yep, see above (but let's keep it per-subprog for now) > > [...] > > > > > > bpf_patch_insn_data() assumes that one instruction has to be replaced with many. > > > > > Here I need to replace many instructions with a single instruction. > > > > > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set. > > > > > > > > ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I > > > > thought that it does range for range replacement of instructions. > > > > Never mind then (it's a bit sad that we don't have a proper flexible > > > > and powerful patching primitive, though, but oh well). > > > > > > That is true. > > > I'll think about it once other issues with this patch-set are resolved. > > > > yeah, it's completely unrelated, but having a bpf_patch_insns that > > takes input instruction range to be replaced and replacing it with > > another set of instructions would cover all existing use cases and > > some more. We wouldn't need verifier_remove_insns(), because that's > > just replacing existing instructions with empty new set of > > instructions. We would now have "insert instructions" primitive if we > > specify empty input range of instructions. If we add a flag whether to > > adjust jump offsets and make it configurable, we'd need the thing that > > Alexei needed for may_goto without any extra hacks. > > > > Furthermore, I find it wrong and silly that we keep having manual > > delta+insn adjustments in every single piece of patching logic. I > > think this should be done by bpf_patch_insns(). We need to have a > > small "insn_patcher" struct that we use during instruction patching, > > and a small instruction iterator on top that would keep returning next > > instruction to process (and its index, probably). This will allow us > > to optimize patching and instead of doing O(N) we can have something > > faster and smarter (if we hide direct insn array accesses during > > patching). > > > > But this is all a completely separate story from all of this. > > I actually suggested something along these lines very long time ago: > https://lore.kernel.org/bpf/5c297e5009bcd4f0becf59ccd97cfd82bb3a49ec.camel@xxxxxxxxx/ We had a few discussions around patching improvements (you can see I mentioned that in that thread as well). We all like to talk about improving this, but not really doing anything about this. :) Tech debt at its best.