Eduard Zingerman <eddyz87@xxxxxxxxx> writes:

> GCC and LLVM define a no_caller_saved_registers function attribute.
> This attribute means that function scratches only some of
> the caller saved registers defined by ABI.
> For BPF the set of such registers could be defined as follows:
> - R0 is scratched only if function is non-void;
> - R1-R5 are scratched only if corresponding parameter type is defined
>   in the function prototype.
>
> This commit introduces flag bpf_func_prot->nocsr.
> If this flag is set for some helper function, verifier assumes that
> it follows no_caller_saved_registers calling convention.
>
> The contract between kernel and clang allows to simultaneously use
> such functions and maintain backwards compatibility with old
> kernels that don't understand no_caller_saved_registers calls
> (nocsr for short):
>
> - clang generates a simple pattern for nocsr calls, e.g.:
>
>     r1 = 1;
>     r2 = 2;
>     *(u64 *)(r10 - 8)  = r1;
>     *(u64 *)(r10 - 16) = r2;
>     call %[to_be_inlined_by_jit]

The call can be inlined by the verifier (in do_misc_fixups()) or by
the JIT if bpf_jit_inlines_helper_call() is true for a helper.

>     r2 = *(u64 *)(r10 - 16);
>     r1 = *(u64 *)(r10 - 8);
>     r0 = r1;
>     r0 += r2;
>     exit;
>
> - kernel removes unnecessary spills and fills, if called function is
>   inlined by current JIT (with assumption that patch inserted by JIT
>   honors nocsr contract, e.g. does not scratch r3-r5 for the example
>   above), e.g. the code above would be transformed to:
>
>     r1 = 1;
>     r2 = 2;
>     call %[to_be_inlined_by_jit]

Same as above.

>     r0 = r1;
>     r0 += r2;
>     exit;
>
> [.....]
>
> +/* True if do_misc_fixups() replaces calls to helper number 'imm',
> + * replacement patch is presumed to follow no_caller_saved_registers contract
> + * (see match_and_mark_nocsr_pattern() below).
> + */
> +static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> +{
> +	return false;
> +}
> +
> +/* If 'insn' is a call that follows no_caller_saved_registers contract
> + * and called function is inlined by current jit, return a mask with

We should update the comment to: inlined by the verifier or by the
current JIT.

> + * 1s corresponding to registers that are scratched by this call
> + * (depends on return type and number of return parameters).
> + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> + */
> +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> +{
> +	const struct bpf_func_proto *fn;
> +
> +	if (bpf_helper_call(insn) &&
> +	    verifier_inlines_helper_call(env, insn->imm) &&

This should also check bpf_jit_inlines_helper_call(insn->imm), as the
JIT can inline helper calls separately from the verifier:

	if (bpf_helper_call(insn) &&
	    (verifier_inlines_helper_call(env, insn->imm) ||
	     bpf_jit_inlines_helper_call(insn->imm)) &&

This is currently done by the arm64 and RISC-V JITs, and they don't
scratch any register except R0 (the helpers inlined by these JITs are
non-void).

> +	    get_helper_proto(env, insn->imm, &fn) == 0 &&
> +	    fn->nocsr)
> +		return ~get_helper_reg_mask(fn);
> +
> +	return ALL_CALLER_SAVED_REGS;
> +}
> +
> +/* GCC and LLVM define a no_caller_saved_registers function attribute.
> + * This attribute means that function scratches only some of
> + * the caller saved registers defined by ABI.
> + * For BPF the set of such registers could be defined as follows:
> + * - R0 is scratched only if function is non-void;
> + * - R1-R5 are scratched only if corresponding parameter type is defined
> + *   in the function prototype.
> + *
> + * The contract between kernel and clang allows to simultaneously use
> + * such functions and maintain backwards compatibility with old
> + * kernels that don't understand no_caller_saved_registers calls
> + * (nocsr for short):
> + *
> + * - for nocsr calls clang allocates registers as-if relevant r0-r5
> + *   registers are not scratched by the call;
> + *
> + * - as a post-processing step, clang visits each nocsr call and adds
> + *   spill/fill for every live r0-r5;
> + *
> + * - stack offsets used for the spill/fill are allocated as minimal
> + *   stack offsets in whole function and are not used for any other
> + *   purposes;
> + *
> + * - when kernel loads a program, it looks for such patterns
> + *   (nocsr function surrounded by spills/fills) and checks if
> + *   spill/fill stack offsets are used exclusively in nocsr patterns;
> + *
> + * - if so, and if current JIT inlines the call to the nocsr function

We should update the comment to: if the verifier or the current JIT.

> + *   (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
> + *
> + * - when old kernel loads a program, presence of spill/fill pairs
> + *   keeps BPF program valid, albeit slightly less efficient.
> + *
> + * For example:
> + *
> + *   r1 = 1;
> + *   r2 = 2;
> + *   *(u64 *)(r10 - 8)  = r1;             r1 = 1;
> + *   *(u64 *)(r10 - 16) = r2;             r2 = 2;
> + *   call %[to_be_inlined_by_jit]    -->  call %[to_be_inlined_by_jit]

We should update this to say: to_be_inlined_by_verifier_or_jit.

> + *   r2 = *(u64 *)(r10 - 16);             r0 = r1;
> + *   r1 = *(u64 *)(r10 - 8);              r0 += r2;
> + *   r0 = r1;                             exit;
> + *   r0 += r2;
> + *   exit;
> + *
> + * The purpose of match_and_mark_nocsr_pattern is to:
> + * - look for such patterns;
> + * - mark spill and fill instructions in env->insn_aux_data[*].nocsr_pattern;
> + * - update env->subprog_info[*]->nocsr_stack_off to find an offset
> + *   at which nocsr spill/fill stack slots start.
> + *
> + * The .nocsr_pattern and .nocsr_stack_off are used by
> + * check_nocsr_stack_contract() to check if every stack access to
> + * nocsr spill/fill stack slot originates from spill/fill
> + * instructions, members of nocsr patterns.
> + *
> + * If such condition holds true for a subprogram, nocsr patterns could
> + * be rewritten by remove_nocsr_spills_fills().
> + * Otherwise nocsr patterns are not changed in the subprogram
> + * (code, presumably, generated by an older clang version).
> + *
> + * For example, it is *not* safe to remove spill/fill below:
> + *
> + *   r1 = 1;
> + *   *(u64 *)(r10 - 8) = r1;              r1 = 1;
> + *   call %[to_be_inlined_by_jit]    -->  call %[to_be_inlined_by_jit]

Same as above.

Thanks for working on this feature. Most of my comments except for
bpf_jit_inlines_helper_call() are nitpicks regarding the comments or
the commit message, but as the verifier is already complicated, I feel
it is useful to have extremely accurate comments/commit messages for
new contributors.

Thanks,
Puranjay