Re: [PATCH bpf-next 1/3] bpf: Propagate scalar ranges through register assignments.

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Wed, 7 Oct 2020 18:45:53 -0700

On Wed, Oct 07, 2020 at 04:55:24PM -0700, John Fastabend wrote:
> John Fastabend wrote:
> > Alexei Starovoitov wrote:
> > > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > 
> > > The llvm register allocator may use two different registers representing the
> > > same virtual register. In such case the following pattern can be observed:
> > > 1047: (bf) r9 = r6
> > > 1048: (a5) if r6 < 0x1000 goto pc+1
> > > 1050: ...
> > > 1051: (a5) if r9 < 0x2 goto pc+66
> > > 1052: ...
> > > 1053: (bf) r2 = r9 /* r2 needs to have upper and lower bounds */
> > > 
> > > In order to track this information without backtracking allocate ID
> > > for scalars in a similar way as it's done for find_good_pkt_pointers().
> > > 
> > > When the verifier encounters r9 = r6 assignment it will assign the same ID
> > > to both registers. Later if either register range is narrowed via conditional
> > > jump propagate the register state into the other register.
> > > 
> > > Clear register ID in adjust_reg_min_max_vals() for any alu instruction.
> > 
> > Do we also need to clear the register ID on reg0 for CALL ops into a
> > helper?

Thank you for asking all those questions. Much appreciate it!

> > 
> > Looks like check_helper_call might mark reg0 as a scalar, but I don't
> > see where it would clear the reg->id? Did I miss it. Either way maybe
> > a comment here would help make it obvious how CALLs are handled?
> > 
> > Thanks,
> > John
> 
> OK sorry for the noise found it right after hitting send. Any call to
> mark_reg_unknown will zero the id.

Right. The verifier uses mark_reg_unknown() in lots of places,
so I figured it doesn't make sense to list them all.

> 
> /* Mark a register as having a completely unknown (scalar) value. */
> static void __mark_reg_unknown(const struct bpf_verifier_env *env,
> 			       struct bpf_reg_state *reg)
> {
> 	/*
> 	 * Clear type, id, off, and union(map_ptr, range) and
> 	 * padding between 'type' and union
> 	 */
> 	memset(reg, 0, offsetof(struct bpf_reg_state, var_off));

Excatly and the comment mentions 'id' too.

> 
> And check_helper_call() does,
> 
> 	/* update return register (already marked as written above) */
> 	if (fn->ret_type == RET_INTEGER) {
> 		/* sets type to SCALAR_VALUE */
> 		mark_reg_unknown(env, regs, BPF_REG_0);
> 
> so looks good to me. In the check_func_call() case the if is_global
> branch will mark_reg_unknown(). The other case only seems to do a
> clear_caller_saved_regs though. Is that enough?

clear_caller_saved_regs() -> mark_reg_not_init() -> __mark_reg_unknown().

I couldn't think of any other case where scalar's ID has to be cleared.
Any kind of assignment and r0 return do it as well.

We can clear id in r6 - r10 when we call a helper, but that's a bit
paranoid, since the registers are still valid and still equal.
Like:
r6 = r7
call foo
// after the call
if r6 > 5 goto
if r7 < 2 goto
// here both r6 and r7 will have bounds

I think it's good for the verifier to support that.

The other case with calls:

r1 = r2
call foo
  // and now inside the callee
  if r1 > 5 goto
  if r2 < 2 goto
  // here both r1 and r2 will have bounds

This case will also work.

Both cases are artificial and the verifier doesn't have to be that
smart, but it doesn't hurt and I don't think it's worth to restrict.

I'll add two synthetic tests for these cases.

Any other case you can think of ?
I think some time in the past you've mentioned that you hit
exactly this greedy register alloc issue in your cilium programs.
Is it the case or am I misremembering?