Re: [PATCH v5 bpf-next 00/23] BPF register bounds logic and testing improvements

Paul Chaignon <paul.chaignon@xxxxxxxxx> · Wed, 1 Nov 2023 13:37:37 +0100

On Mon, Oct 30, 2023 at 10:19:01PM -0700, Andrii Nakryiko wrote:
> On Mon, Oct 30, 2023 at 10:55 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Fri, Oct 27, 2023 at 11:13:23AM -0700, Andrii Nakryiko wrote:
> > >
> > > Note, this is not unique to <range> vs <range> logic. Just recently ([0])
> > > a related issue was reported for existing verifier logic. This patch set does
> > > fix that issues as well, as pointed out on the mailing list.
> > >
> > >   [0] https://lore.kernel.org/bpf/CAEf4Bzbgf-WQSCz8D4Omh3zFdS4oWS6XELnE7VeoUWgKf3cpig@xxxxxxxxxxxxxx/
> >
> > Quick comment regarding shift out of bound issue.
> > I think this patch set makes Hao Sun's repro not working, but I don't think
> > the range vs range improvement fixes the underlying issue.
> 
> Correct, yes, I think adjust_reg_min_max_vals() might still need some fixing.
> 
> > Currently we do:
> > if (umax_val >= insn_bitness)
> >   mark_reg_unknown
> > else
> >   here were use src_reg->u32_max_value or src_reg->umax_value
> > I suspect the insn_bitness check is buggy and it's still possible to hit UBSAN splat with
> > out of bounds shift. Just need to try harder.
> > if w8 < 0xffffffff goto +2;
> > if r8 != r6 goto +1;
> > w0 >>= w8;
> > won't be enough anymore.
> 
> Agreed, but I felt that fixing adjust_reg_min_max_vals() is out of
> scope for this already large patch set. If someone can take a deeper
> look into reg bounds for arithmetic operations, it would be great.
> 
> On the other hand, one of those academic papers claimed to verify
> soundness of verifier's reg bounds, so I wonder why they missed this?

AFAICS, it should have been able to detect this bug. Equation (3) from
[1, page 10] encodes the soundness condition for conditional jumps and
the implementation definitely covers BPF_JEQ/JNE and the logic in
check_cond_jmp_op. So either there's a bug in the implementation or I'm
missing something about how it works. Let me cc two of the paper's
authors :)

Hari, Srinivas: Hao Sun recently discovered a bug in the range analysis
logic of the verifier, when comparing two unknown scalars with
non-overlapping ranges. See [2] for Eduard Zingerman's explanation. It
seems to have existed for a while. Any idea why Agni didn't uncover it?

1 - https://harishankarv.github.io/assets/files/agni-cav23.pdf
2 - https://lore.kernel.org/bpf/8731196c9a847ff35073a2034662d3306cea805f.camel@xxxxxxxxx/

> cc Paul, maybe he can clarify (and also, Paul, please try to run all
> that formal verification machinery against this patch set, thanks!)

I tried it yesterday but am running into what looks like a bug in the
LLVM IR to SMT conversion. Probably not something I can fix myself
quickly so I'll need help from Hari & co.

That said, even without your patchset, I'm running into another issue
where the formal verification takes several times longer (up to weeks
/o\) since v6.4.