On Tue, Dec 15, 2020 at 02:28:39PM +0100, Eelco Chaudron wrote: > > > On 9 Dec 2020, at 13:07, Eelco Chaudron wrote: > > > On 9 Dec 2020, at 12:10, Maciej Fijalkowski wrote: > > <SNIP> > > > > > > > + > > > > > > + ctx_reg = (si->src_reg == si->dst_reg) ? scratch_reg - 1 : > > > > > > si->src_reg; > > > > > > + while (dst_reg == ctx_reg || scratch_reg == ctx_reg) > > > > > > + ctx_reg--; > > > > > > + > > > > > > + /* Save scratch registers */ > > > > > > + if (ctx_reg != si->src_reg) { > > > > > > + *insn++ = BPF_STX_MEM(BPF_DW, si->src_reg, ctx_reg, > > > > > > + offsetof(struct xdp_buff, > > > > > > + tmp_reg[1])); > > > > > > + > > > > > > + *insn++ = BPF_MOV64_REG(ctx_reg, si->src_reg); > > > > > > + } > > > > > > + > > > > > > + *insn++ = BPF_STX_MEM(BPF_DW, ctx_reg, scratch_reg, > > > > > > + offsetof(struct xdp_buff, tmp_reg[0])); > > > > > > > > > > Why don't you push regs to stack, use it and then pop it > > > > > back? That way > > > > > I > > > > > suppose you could avoid polluting xdp_buff with tmp_reg[2]. > > > > > > > > There is no “real” stack in eBPF, only a read-only frame > > > > pointer, and as we > > > > are replacing a single instruction, we have no info on what we > > > > can use as > > > > scratch space. > > > > > > Uhm, what? You use R10 for stack operations. Verifier tracks the > > > stack > > > depth used by programs and then it is passed down to JIT so that > > > native > > > asm will create a properly sized stack frame. > > > > > > From the top of my head I would let know xdp_convert_ctx_access of a > > > current stack depth and use it for R10 stores, so your scratch space > > > would > > > be R10 + (stack depth + 8), R10 + (stack_depth + 16). > > > > Other instances do exactly the same, i.e. put some scratch registers in > > the underlying data structure, so I reused this approach. From the > > current information in the callback, I was not able to determine the > > current stack_depth. With "real" stack above, I meant having a pop/push > > like instruction. > > > > I do not know the verifier code well enough, but are you suggesting I > > can get the current stack_depth from the verifier in the > > xdp_convert_ctx_access() callback? If so any pointers? > > Maciej any feedback on the above, i.e. getting the stack_depth in > xdp_convert_ctx_access()? Sorry. I'll try to get my head around it. If i recall correctly stack depth is tracked per subprogram whereas convert_ctx_accesses is iterating through *all* insns (so a prog that is not chunked onto subprogs), but maybe we could dig up the subprog based on insn idx. But at first, you mentioned that you took the approach from other instances, can you point me to them? I'd also like to hear from Daniel/Alexei/John and others their thoughts. > > > > Problem with that would be the fact that convert_ctx_accesses() > > > happens to > > > be called after the check_max_stack_depth(), so probably stack_depth > > > of a > > > prog that has frame_length accesses would have to be adjusted > > > earlier. > > > > Ack, need to learn more on the verifier part… > > <SNIP> >