On Tue, 2024-07-09 at 23:01 -0700, Andrii Nakryiko wrote:

[...]

> > Right, it should be '<', my bad, will update the comment.
>
> See, I do read comments ;)

Yes, sorry about that remark.

[...]

> > > > +/* See comment for mark_nocsr_pattern_for_call() */
> > > > +static void check_nocsr_stack_contract(struct bpf_verifier_env *env,
> > > > +				       struct bpf_func_state *state,
> > > > +				       int insn_idx, int off)
> > > > +{
> > > > +	struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
> > > > +	struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
> > > > +
> > > > +	if (subprog->nocsr_stack_off <= off || aux->nocsr_pattern)
> > > > +		return;
> > >
> > > can a helper call instruction go through this check? E.g., if we do
> > > bpf_probe_read_kernel() into a stack slot, where do we check that that
> > > slot is not overlapping with the nocsr spill/fill region?
> >
> > In check_helper_call() we do check_mem_access(), which eventually calls
> > one of the check_stack_{read,write}_{fixed,varying}_off().
> > The .access_size should be set for bpf_probe_read_kernel()
> > because its argument's base type is ARG_PTR_TO_MEM.
> > I will add a test case to double-check this.
>
> Ok. Also as I was reading this I didn't yet realize that
> aux->nocsr_pattern is not set for the call instruction itself, so I was
> worried that we might accidentally skip the check. But we don't set
> nocsr_pattern for the call, so we should be good (but the test never
> hurts)

Well yes, there is a flag check, but a call to bpf_probe_read_kernel()
touching a candidate nocsr stack address should disable nocsr rewrites
for the current subprogram.

[...]

> > > > +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> > >
> > > you use u8 for get_helper_reg_mask() and u32 here, why not keep them consistent?
> >
> > Ok
> >
> > > similar to the naming nit above, I think we should be a bit more
> > > explicit with what "mask" actually means. Is this also a clobber mask?
> >
> > I mean, there is a comment right above the function.
> > This function returns a mask of caller-saved registers (csr).
> > I'll make the name more explicit.
> >
> > >
> > > > +{
> > > > +	const struct bpf_func_proto *fn;
> > > > +
> > > > +	if (bpf_helper_call(insn) &&
> > > > +	    (verifier_inlines_helper_call(env, insn->imm) ||
> > > > +	     bpf_jit_inlines_helper_call(insn->imm)) &&
> > > > +	    get_helper_proto(env, insn->imm, &fn) == 0 &&
> > > > +	    fn->allow_nocsr)
> > > > +		return ~get_helper_reg_mask(fn);
> > >
> > > hm... I'm a bit confused why we do a negation here? aren't we working
> > > with a clobbering mask... I'll keep reading for now.
> >
> > Please read the comment before the function.
>
> Believe it or not, I read it like 10 times and that didn't help me much.
>
> +/* If 'insn' is a call that follows no_caller_saved_registers contract
> + * and called function is inlined by current jit or verifier,
> + * return a mask with 1s corresponding to registers that are scratched
> + * by this call (depends on return type and number of return parameters).
> + * Otherwise return ALL_CALLER_SAVED_REGS mask.
>
> "registers that are scratched by this call" would be what
> get_helper_reg_mask() returns for the function (at least that's my
> reading of the above), and yet you return the inverted mask. It doesn't
> really help that we return ALL_CALLER_SAVED_REGS as "nope, it's not a
> nocsr call".

I see, I messed up the comment...

> Maybe it would be a bit easier to follow if call_csr_mask() returned
> bool and the mask as an out parameter in case it's a nocsr call. So
> instead of special casing ALL_CALLER_SAVED_REGS there would be a nice
> "not a nocsr, never mind" and early exit.

An out parameter seems like overkill. I'll change the names a little bit
to make it easier to follow. If for v3 you still think an out parameter
is better, I'll switch then.
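To make the relationship between the two masks concrete, here is a rough
sketch of what I have in mind for v3 (illustrative only; exact names,
types and loop bounds may still change):

	/* set of R0-R5 registers actually read or written by the helper */
	static u32 get_helper_clobber_mask(const struct bpf_func_proto *fn)
	{
		u32 mask = BIT(BPF_REG_0);	/* return value is always scratched */
		int i;

		/* one register per fixed argument: R1, R2, ... */
		for (i = 0; i < ARRAY_SIZE(fn->arg_type); ++i)
			if (fn->arg_type[i] != ARG_DONTCARE)
				mask |= BIT(BPF_REG_1 + i);
		return mask;
	}

	/* set of R0-R5 registers an inlined nocsr call leaves intact */
	static u32 get_call_preserved_mask(struct bpf_verifier_env *env,
					   struct bpf_insn *insn)
	{
		const struct bpf_func_proto *fn;

		if (!bpf_helper_call(insn) ||
		    !(verifier_inlines_helper_call(env, insn->imm) ||
		      bpf_jit_inlines_helper_call(insn->imm)) ||
		    get_helper_proto(env, insn->imm, &fn) != 0 ||
		    !fn->allow_nocsr)
			return 0;	/* not a nocsr call, nothing is preserved */
		return ALL_CALLER_SAVED_REGS & ~get_helper_clobber_mask(fn);
	}

With that split there is no inversion at the call sites and the "not a
nocsr call" case is an empty preserved set rather than a special-cased
ALL_CALLER_SAVED_REGS.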
> Anyways, perhaps I'm just being very dense here, I just found this
> particular piece extremely hard to follow intuitively.
>
> > >
> > > > +
> > > > +	return ALL_CALLER_SAVED_REGS;
> > > > +}

[...]

> > > > +static void mark_nocsr_pattern_for_call(struct bpf_verifier_env *env, int t)
> > > > +{
> > > > +	struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
> > > > +	struct bpf_subprog_info *subprog;
> > > > +	u32 csr_mask = call_csr_mask(env, &insns[t]);
> > > > +	u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;
> > >
> > > tbh, I'm lost with all these bitmasks and their inversions...
> > > call_csr_mask()'s result is basically always used inverted, so why not
> > > return the inverted mask in the first place?
> >
> > The mask is initialized as a set of all registers preserved by this call.
>
> ok, maybe it's just a mix of "no csr" and "csr" that confuses me. How
> about we rename call_csr_mask() to get_helper_preserved_mask() or
> something like that to "match" get_helper_clobber_mask()?

Yes, these two names are good, thank you. I'd still call it
get_call_preserved_mask(), as kfuncs might be added at some point.

> > Those that are not in the mask need a spill/fill pair.
> > I'll toss things around to make this more clear
> > (naming, comments, maybe move the '| ~ALL_CALLER_SAVED_REGS' into call_csr_mask()).
> >
> > >
> > > > +	int s, i;
> > > > +	s16 off;
> > > > +
> > > > +	if (csr_mask == ALL_CALLER_SAVED_REGS)
> > > > +		return;
> > > > +
> > > > +	for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
> > > > +		if (t - i < 0 || t + i >= env->prog->len)
> > > > +			break;
> > > > +		stx = &insns[t - i];
> > > > +		ldx = &insns[t + i];
> > > > +		if (off == 0) {
> > > > +			off = stx->off;
> > > > +			if (off % BPF_REG_SIZE != 0)
> > > > +				break;
> > >
> > > kind of ugly that we assume stx before we actually checked that it's
> > > STX?... maybe split the humongous if below into instruction checking
> > > (with code and src_reg) and then off checking separately?
> >
> > Don't see anything ugly about this, tbh.
>
> you are using stx->off and do a `% BPF_REG_SIZE` check on it while that
> stx->off field might be meaningless for the instruction (because we
> are not yet sure it's an STX instruction), that's a bit ugly, IMO

I can rewrite it like below:

	if (stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
	    ldx->code != (BPF_LDX | BPF_MEM | BPF_DW))
		break;
	off = off ?: stx->off; /* or use the full 'if' block from the old version */
	if (stx->dst_reg != BPF_REG_10 ||
	    ldx->src_reg != BPF_REG_10 ||
	    /* check spill/fill for the same reg and offset */
	    stx->src_reg != ldx->dst_reg ||
	    stx->off != ldx->off ||
	    stx->off != off ||
	    (off % BPF_REG_SIZE) != 0 ||
	    /* this should be a previously unseen register */
	    (BIT(stx->src_reg) & reg_mask))
		break;

But I'm not sure this adds any actual clarity.

> > Can split the 'if' statement, if you think it's hard to read.
>
> it's not about readability for me
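For readers joining the thread: spelled out, the shape the loop above
matches looks like this (borrowing the example from further down; the
i/off annotations show how the loop walks outward from the call):

	*(u64 *)(r10 - 8)  = r1;  /* insns[t - 2]: i == 2, off == -8  */
	*(u64 *)(r10 - 16) = r2;  /* insns[t - 1]: i == 1, off == -16 */
	call %[to_be_inlined]     /* insns[t], the nocsr call         */
	r2 = *(u64 *)(r10 - 16);  /* insns[t + 1], mirrors insns[t - 1] */
	r1 = *(u64 *)(r10 - 8);   /* insns[t + 2], mirrors insns[t - 2] */

The scan stops at the first stx/ldx pair that doesn't match, so
nocsr_spills_num ends up as i - 1 (2 in this example), and reg_mask
grows by one bit per matched pair so each register may appear at most
once.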
> > >
> > > > +		}
> > > > +		if (/* *(u64 *)(r10 - off) = r[0-5]? */
> > > > +		    stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> > > > +		    stx->dst_reg != BPF_REG_10 ||
> > > > +		    /* r[0-5] = *(u64 *)(r10 - off)? */
> > > > +		    ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
> > > > +		    ldx->src_reg != BPF_REG_10 ||
> > > > +		    /* check spill/fill for the same reg and offset */
> > > > +		    stx->src_reg != ldx->dst_reg ||
> > > > +		    stx->off != ldx->off ||
> > > > +		    stx->off != off ||
> > > > +		    /* this should be a previously unseen register */
> > > > +		    BIT(stx->src_reg) & reg_mask)
> > >
> > > () around the & operation?
> >
> > No need, & has higher priority than ||.
> > You can check the AST using
> > https://tree-sitter.github.io/tree-sitter/playground .
>
> Oh, I have no doubt you checked that it works correctly. It's just not
> a really good habit to rely on C's obscure operator ordering rules
> beyond A && B || C (and even then it is arguable). I think the rule of
> thumb to not mix bitwise and logical operators without explicit
> parentheses makes sense.
>
> But never mind, I already feel weird complaining about !strcmp(),
> every real kernel engineer should memorize operator precedence by
> heart.

I assumed you were implying an incorrect evaluation order. Will add
parentheses.

> > > > +			break;
> > > > +		reg_mask |= BIT(stx->src_reg);
> > > > +		env->insn_aux_data[t - i].nocsr_pattern = 1;
> > > > +		env->insn_aux_data[t + i].nocsr_pattern = 1;
> > > > +	}
> > > > +	if (i == 1)
> > > > +		return;
> > > > +	env->insn_aux_data[t].nocsr_spills_num = i - 1;
> > > > +	s = find_containing_subprog(env, t);
> > > > +	/* can't happen */

[...]

> > > > +	if (WARN_ON_ONCE(s < 0))
> > > > +		return;
> > > > +	subprog = &env->subprog_info[s];
> > > > +	subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
> > >
> > > should this be max()? offsets are negative, right? so if nocsr uses -8
> > > and -16 as in the example, the entire [-16, 0) region is the nocsr region
> >
> > This should be min exactly because stack offsets are negative.
> > For the example above 'off' is initialized as -16 and then
> > is incremented by +8, giving a final value of -8.
> > And I need to select the minimal value used between several patterns.
>
> so let's say I have two nocsr calls in the same subprog
>
> *(u64 *)(r10 - 8) = r1;
> *(u64 *)(r10 - 16) = r2;
> call %[to_be_inlined]
> r2 = *(u64 *)(r10 - 16);
> r1 = *(u64 *)(r10 - 8);
>
> and then a bit later
>
> *(u64 *)(r10 - 16) = r1;
> *(u64 *)(r10 - 24) = r2;
> call %[to_be_inlined]
> r2 = *(u64 *)(r10 - 24);
> r1 = *(u64 *)(r10 - 16);
>
> For the first chunk the nocsr range is [-16, 0). For the second it's
> [-24, -8), right?
> Should `*(u64 *)(r10 - 8) = 123` somewhere in that subprog (not for
> nocsr calls) invalidate this whole nocsr thing? With min you'll
> basically have [-24, -8) as the nocsr-reserved region, but shouldn't it
> be the whole [-24, 0)?
>
> Clang shouldn't generate such inconsistent offsets, right? But it's
> legal, no? And if not, then all the calls should use the same upper
> stack boundary and we shouldn't be doing min/max, but rather checking
> that they are all consistent.
>
> Or what am I missing again?

You aren't missing anything. With the current algorithm the first call
will not be rewritten because of the offset check in do_misc_fixups().
The second call would be rewritten, and this is safe to do because the
verifier knows that the range [-24, -8) is only used by nocsr patterns.
What you suggest is slightly more pessimistic compared to the current
algorithm.

[...]
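To connect this back to check_nocsr_stack_contract() quoted earlier: the
part of its body elided from the quote is what performs the
invalidation. Roughly, the intent is (a paraphrased sketch, not the
literal patch; the exact sentinel value is an implementation detail):

	if (subprog->nocsr_stack_off <= off || aux->nocsr_pattern)
		return;
	/* a non-pattern access below the nocsr watermark: disable nocsr
	 * rewrites for the whole subprogram by making the watermark
	 * lower than any offset the program can use
	 */
	subprog->nocsr_stack_off = S16_MIN;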
> > > > +
> > > > +	/* apply the rewrite:
> > > > +	 * *(u64 *)(r10 - X) = rY   ; num-times
> > > > +	 * call()                   ->   call()
> > > > +	 * rY = *(u64 *)(r10 - X)   ; num-times
> > > > +	 */
> > > > +	err = verifier_remove_insns(env, i + delta - spills_num, spills_num);
> > > > +	if (err)
> > > > +		return err;
> > > > +	err = verifier_remove_insns(env, i + delta - spills_num + 1, spills_num);
> > > > +	if (err)
> > > > +		return err;
> > >
> > > why not a single bpf_patch_insn_data()?
> >
> > bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> > Here I need to replace many instructions with a single instruction.
> > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.
>
> ah, bpf_patch_insn_data() just does bpf_patch_insn_single(), somehow I
> thought that it does range-for-range replacement of instructions.
> Never mind then (it's a bit sad that we don't have a proper flexible
> and powerful patching primitive, though, but oh well).

That is true. I'll think about it once the other issues with this
patch-set are resolved.

[...]

> > > > +
> > > > +		i += spills_num - 1;
> > > > +		/* ^ ^ do a second visit of this instruction,
> > > > +		 * | '-- so that verifier can inline it
> > > > +		 * '--------------- jump over deleted fills
> > > > +		 */
> > > > +		delta -= 2 * spills_num;
> > > > +		insn = env->prog->insnsi + i + delta;
> > > > +		goto next_insn;
> > >
> > > why not adjust the state and just fall through? what does goto
> > > next_insn do that we can't (and "next instruction" is misleading, so
> > > I'd rather fix up and move forward)
> >
> > I don't like this. The fall-through makes the control flow more convoluted.
> > To understand what would happen next:
> > - with goto next_insn we just start over;
> > - with fall-through we need to think about the position of this
> >   particular 'if' statement within the loop.
>
> Alright, it's fine. Though I don't see anything wrong with knowing
> that we patch up the nocsr pattern before we do inlining of calls. And
> yes, order is important, so what? I always assumed that the order of
> operations in do_misc_fixups() is not arbitrary.

On second thought, maybe the fall-through is not that bad.

[...]
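For anyone following along, here is the index bookkeeping spelled out
for spills_num == 2, assuming (as in today's do_misc_fixups() loop) that
the next_insn label ends the iteration by advancing i and insn by one.
Let c == i + delta be the call's index before the rewrite:

	/* remove 2 spills at [c - 2, c - 1]: call moves to c - 2,
	 * fills move to c - 1 and c
	 */
	/* remove 2 fills at [c - 1, c]: call stays at c - 2 */
	i += spills_num - 1;      /* i' = i + 1                        */
	delta -= 2 * spills_num;  /* delta' = delta - 4                */
	goto next_insn;           /* i'' = i + 2, so insnsi + i'' + delta'
	                           * == insnsi + c - 2: the call again,
	                           * ready to be inlined on this visit
	                           */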