Re: [PATCH 3/4] bpf, docs: Use consistent names for the same field

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Fri, 21 Oct 2022 12:01:24 -0700

On Fri, Oct 21, 2022 at 10:56 AM Dave Thaler <dthaler@xxxxxxxxxxxxx> wrote:
>
> > On Wed, Oct 19, 2022 at 4:35 PM Stanislav Fomichev <sdf@xxxxxxxxxx>
> > wrote:
> > > On Wed, Oct 19, 2022 at 2:06 PM Dave Thaler <dthaler@xxxxxxxxxxxxx>
> > wrote:
> > > >
> > > > sdf@xxxxxxxxxx wrote:
> > > > > >   ``BPF_ADD | BPF_X | BPF_ALU`` means::
> > > > >
> > > > > > -  dst_reg = (u32) dst_reg + (u32) src_reg;
> > > > > > +  dst = (u32) (dst + src)
> > > > >
> > > > > IIUC, by going from (u32) + (u32) to (u32)(), we want to signal
> > > > > that the value will just wrap around?
> > > >
> > > > Right.  In particular the old line could be confusing if one
> > > > misinterpreted it as saying that the addition could overflow into a
> > > > higher bit.  The new line is intended to be unambiguous that the upper 32
> > bits are 0.
> > > >
> > > > > But isn't it more confusing now because it's unclear what the sign
> > > > > of the dst/src is (s32 vs u32)?
> > > >
> > > > As stated the upper 32 bits have to be 0, just as any other u32 assignment.
> > >
> > > Do we mention somewhere above/below that the operands are unsigned?
> > > IOW, what prevents me from reading this new format as follows?
> > >
> > > dst = (u32) ((s32)dst + (s32)src)
> >
> > The doc mentions it, but I completely agree with you.
> > The original line was better.
> > Dave, please undo this part.
>
> Nothing prevents you from reading the new format as
>     dst = (u32) ((s32)dst + (s32)src)
> because that implementation wouldn't be wrong.
>
> Below is why, please point out any logic errors if you see any.
>
> Mathematically, all of the following have identical results:
>     dst = (u32) ((s32)dst + (s32)src)
>     dst = (u32) ((u32)dst + (u32)src)
>     dst = (u32) ((s32)dst + (u32)src)
>     dst = (u32) ((u32)dst + (s32)src)
>
> u32 and s32, once you allow overflow/underflow to wrap within 32 bits, are
> mathematical rings (see https://en.wikipedia.org/wiki/Ring_(mathematics) )
> meaning they're a circular space where X, X + 2^32, and X - 2^32 are equal.
> So (s32)src == (u32)src when the most significant bit is clear, and
> (s32)src == (u32)src - 2^32 when the most significant bit is set.
>
> So the sign of the addition operands does not matter here.
> What matters is whether you do addition where the result can be
> more than 32 bits or not, which is what the new line makes unambiguous
> and the old line did not.
>
> Specifically, nothing prevented mis-interpreting the old line as
>
> u64 temp = (u32)dst;
> temp += (u32)src;
> dst = temp;

Well dst_reg = (u32) dst_reg + (u32) src_reg
implies C semantics, so it cannot be misinterpreted that way.

> which would give the wrong answer since the upper 32-bits might be non-zero.
>
> u64 temp = (s32)dst;
> temp += (s32)src;
> dst = (u32)temp;
>
> Would however give the correct answer, same as
>
> u64 temp = (u32)dst;
> temp += (u32)src;
> dst = (u32)temp;
>
> As such, I maintain the old line was bad and the new line is still good.

dst_reg = (u32) (dst_reg + src_reg)
implies that the operation is performed in 64-bit and then
the result is truncated to 32-bit which is not correct.

If we had traditional carry, sign, overflow flags in bpf ISA
the bit-ness of operation would be significant.
Thankfully we don't, so it's not a big deal.

but let's do full verbose to avoid describing C semantics:
dst = (u32) ((u32)dst + (u32)src)