Re: [PATCH bpf-next 1/3] bpf: Propagate scalar ranges through register assignments.

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Thu, 8 Oct 2020 08:53:22 -0700

On Thu, Oct 08, 2020 at 08:18:46AM -0700, John Fastabend wrote:
> > 
> > I couldn't think of any other case where scalar's ID has to be cleared.
> > Any kind of assignment and r0 return do it as well.
> 
> How about a zero extending move?
> 
>  r1 = r2 <- r1.id = r2.id
>  w1 = w1
> 
> that will narrow the bounds on r1 but r2 should not be narrowed? So
> we need to zero the r1.id I believe. But, I don't see where we
> would set r1.id = 0 in this case.

Excellent catch! Indeed. id should be cleared for 32-bit move.
Will fix.

> > 
> > Any other case you can think of ?
> 
> Still churning on the above zero extending move. Also I thought
> it was a bit odd that this wouldn't work,
> 
>  r1 = r2
>  r0 = r1
>  if r0 < 2 goto ...
> 
> then r0.id != r2.id because a new id is generated on the second
> mov there. I don't actually care that much because I can't recall
> seeing this pattern.

Right. Since it's easy to support this case I'll add it as well.
Though I also never seen llvm generate the code like this and I don't
think it will based on my understanding of regalloc.

> > I think some time in the past you've mentioned that you hit
> > exactly this greedy register alloc issue in your cilium programs.
> > Is it the case or am I misremembering?
> 
> Yes, I hit this a lot actually for whatever reason. Something
> about the code I write maybe. It also tends to be inside a loop
> so messing with volatiles doesn't help. End result is I get
> a handful of small asm blocks to force compiler into generating
> code the verifier doesn't trip up on. I was going to add I think
> the cover letter understates how much this should help.

Yeah. We also see such patterns only inside the loops with large
loop bodies, and especially in unrolled loops.
My understanding is that this is normal behavior of the greedy register
allocator that introduces register copy for the split ranges.
Yonghong sent me that link that explains algorithm in details:
http://llvm.org/devmtg/2018-04/slides/Yatsina-LLVM%20Greedy%20Register%20Allocator.pdf
The slide 137 and following slides explain exactly this scenario.

In other words there is no way to tell llvm 'not to do this',
so we have to improve the verifier smartness in such case.

I'll add these details to commit log.

> I still need to try some of Yonghong's latest patches maybe I'll
> push this patch on my stack as well and see how much asm I can
> delete.

The 2 out of 3 patches already landed. Please pull the latest llvm master.