Re: [PATCH RFC v1 1/3] bpf: move from sha1 to blake2s in tag calculation

"Jason A. Donenfeld" <Jason@xxxxxxxxx> · Fri, 14 Jan 2022 15:12:37 +0100

Hi Alexei,

On Thu, Jan 13, 2022 at 11:45 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
> On Thu, Jan 13, 2022 at 4:27 AM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
> >
> > Hi Alexei,
> >
> > On 1/13/22, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
> > > Nack.
> > > It's part of api. We cannot change it.
> >
> > This is an RFC patchset, so there's no chance that it'll actually be
> > applied as-is, and hence there's no need for the strong hammer nack.
> > The point of "request for comments" is comments. Specifically here,
> > I'm searching for information on the ins and outs of *why* it might be
> > hard to change. How does userspace use this? Why must this 64-bit
> > number be unchanged? Why did you do things this way originally? Etc.
> > If you could provide a bit of background, we might be able to shake
> > out a solution somewhere in there.
>
> There is no problem with the code and nothing to be fixed.

Yes yes, my mama says I'm the specialest snowflake of a boy too. That
makes two of us ice crystals, falling from the winter heavens,
blessing vim with our beautiful shapes and frosty code.

Anyway, back to reality, as Geert points out, we're hoping to be able
to remove lib/sha1.c from vmlinux (see 3/3 of this series) for
codesize, and this bpf usage here is one of two remaining usages of
it. So I was hoping that by sending this RFC, it might elicit a bit
more information about the ecosystem around the usage of the function,
so that we can start trying to think of creative solutions to sunset
it.

I started trying to figure out what's up there and wound up with some
more questions. My primary one is why you're okay with such a weak
"checksum" -- the thing is only 64 bits, and as you told Andy Lutomirski
in 2016 when he tried to stop you from using SHA-1, "Andy, please read
the code. \ we could have used jhash there just as well. \ Collisions
are fine."

Looking at https://github.com/iovisor/bcc/blob/e17c4f7324d8fc5cc24ba8ee1db451666cd7ced3/src/cc/bpf_module.cc#L571
I see:

  err = bpf_prog_compute_tag(insns, prog_len, &tag1);
  if (err)
    return err;
  err = bpf_prog_get_tag(prog_fd, &tag2);
  if (err)
    return err;
  if (tag1 != tag2) {
    fprintf(stderr, "prog tag mismatch %llx %llx\n", tag1, tag2);

So it's clearly a check for something. A collision there might prove pesky:

  char buf[128];
  ::snprintf(buf, sizeof(buf), BCC_PROG_TAG_DIR "/bpf_prog_%llx", tag1);
  err = mkdir(buf, 0777);
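
To make the collision concern concrete, here is a minimal sketch of the same
path construction. It is not the bcc code: make_tag_dir and its base-directory
parameter are hypothetical names introduced for illustration, and the base path
stands in for BCC_PROG_TAG_DIR, whose value I am not assuming.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: create <base>/bpf_prog_<tag>, mirroring the
 * snprintf + mkdir pattern quoted above.
 * Returns 0 on success, or an errno value on failure.
 */
static int make_tag_dir(const char *base, unsigned long long tag)
{
    char buf[128];
    int n = snprintf(buf, sizeof(buf), "%s/bpf_prog_%llx", base, tag);

    /* Refuse truncated paths instead of creating the wrong directory. */
    if (n < 0 || (size_t)n >= sizeof(buf))
        return ENAMETOOLONG;

    /* Two programs with the same 64-bit tag map to the same path. */
    if (mkdir(buf, 0777) != 0)
        return errno;

    return 0;
}
```

Calling make_tag_dir twice with the same tag returns EEXIST the second time,
which is what a second, different program with a colliding tag would run into.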

Maybe you don't really see a security problem here, because these
programs are root loadable anyway? But I imagine things will
ultimately get more complicated later on down the road when bpf
becomes more modular and less privileged and more namespaced -- the
usual evolution of these sorts of features.

So I'm wondering - why not just do this in a more robust way entirely,
and always export a sufficiently sized blake2s hash? That way we'll
never have these sorts of shenanigans to care about. If that's not a
sensible thing to do, it's likely that I _still_ don't quite grok the
purpose of the program tag, in which case I'd be all ears for an
explanation.

Jason

[ PS: As an aside, I noticed some things in the userspace tag
calculation code at
https://github.com/iovisor/bcc/blob/aa7200b9b2a7a2ce2e8a6f0dc1f456f3f93af1da/src/cc/libbpf.c#L536
- you probably shouldn't use AF_ALG for things that are software based
and can be done in userspace faster. And the unconditional
__builtin_bswap64 there means that the code will fail on big endian
systems. I know you mostly only care about x86 and all, but <endian.h>
might make this easy to fix. ]
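
For what it's worth, the <endian.h> suggestion could look roughly like the
sketch below. It assumes the tag is meant to be the first eight digest bytes
read as a big-endian number. tag_from_digest is a hypothetical name, not
something in bcc, and be64toh comes from glibc/musl <endian.h> rather than
ISO C.

```c
#define _DEFAULT_SOURCE   /* exposes be64toh() in glibc's <endian.h> */
#include <endian.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: interpret the first 8 bytes of a digest as a
 * big-endian 64-bit tag, independent of host byte order.
 */
static uint64_t tag_from_digest(const unsigned char digest[8])
{
    uint64_t raw;

    /* memcpy avoids unaligned or aliasing-violating loads. */
    memcpy(&raw, digest, sizeof(raw));

    /* Byte swap on little-endian hosts, no-op on big-endian hosts. */
    return be64toh(raw);
}
```

Unlike an unconditional __builtin_bswap64, this yields the same value on both
little-endian and big-endian machines.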