Re: [PATCH bpf 1/3] libbpf: use elf_getshdrnum() instead of e_shnum

Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> · Wed, 12 Oct 2022 09:50:33 +0800

On Tue, Oct 11, 2022 at 09:06:03AM -0700, Andrii Nakryiko wrote:
> On Tue, Oct 11, 2022 at 7:53 AM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote:
> >
> > On Tue, Oct 11, 2022 at 11:55:21AM +0800, Shung-Hsi Yu wrote:
> > > On Mon, Oct 10, 2022 at 05:44:20PM -0700, Andrii Nakryiko wrote:
> > > > On Fri, Oct 7, 2022 at 10:48 AM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote:
> > > > >
> > > > > This commit replace e_shnum with the elf_getshdrnum() helper to fix two
> > > > > oss-fuzz-reported heap-buffer overflow in __bpf_object__open. Both
> > > > > reports are incorrectly marked as fixed and while still being
> > > > > reproducible in the latest libbpf.
> > > > >
> > > > >   # clusterfuzz-testcase-minimized-bpf-object-fuzzer-5747922482888704
> > > > >   libbpf: loading object 'fuzz-object' from buffer
> > > > >   libbpf: sec_cnt is 0
> > > > >   libbpf: elf: section(1) .data, size 0, link 538976288, flags 2020202020202020, type=2
> > > > >   libbpf: elf: section(2) .data, size 32, link 538976288, flags 202020202020ff20, type=1
> > > > >   =================================================================
> > > > >   ==13==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000000c0 at pc 0x0000005a7b46 bp 0x7ffd12214af0 sp 0x7ffd12214ae8
> > > > >   WRITE of size 4 at 0x6020000000c0 thread T0
> > > > >   SCARINESS: 46 (4-byte-write-heap-buffer-overflow-far-from-bounds)
> > > > >       #0 0x5a7b45 in bpf_object__elf_collect /src/libbpf/src/libbpf.c:3414:24
> > > > >       #1 0x5733c0 in bpf_object_open /src/libbpf/src/libbpf.c:7223:16
> > > > >       #2 0x5739fd in bpf_object__open_mem /src/libbpf/src/libbpf.c:7263:20
> > > > >       ...
> > > > >
> > > > > The issue lie in libbpf's direct use of e_shnum field in ELF header as
> > > > > the section header count. Where as libelf, on the other hand,
> > > > > implemented an extra logic that, when e_shnum is zero and e_shoff is not
> > > > > zero, will use sh_size member of the initial section header as the real
> > > > > section header count (part of ELF spec to accommodate situation where
> > > > > section header counter is larger than SHN_LORESERVE).
> > > > >
> > > > > The above inconsistency lead to libbpf writing into a zero-entry calloc
> > > > > area. So intead of using e_shnum directly, use the elf_getshdrnum()
> > > > > helper provided by libelf to retrieve the section header counter into
> > > > > sec_cnt.
> > > > >
> > > > > Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40868
> > > > > Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40957
> > > > > Fixes: 0d6988e16a12 ("libbpf: Fix section counting logic")
> > > > > Fixes: 25bbbd7a444b ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps")
> > > > > Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx>
> > > > > ---
> > > > >
> > > > > To be honest I'm not sure if any of the BPF toolchain will produce such
> > > > > ELF binary. Tools like readelf simply refuse to dump section header
> > > > > table when e_shnum==0 && e_shoff !=0 case is encountered.
> > > > >
> > > > > While we can use same approach as readelf, opting for a coherent view
> > > > > with libelf for now since that should be less confusing.
> > > > >
> > > > > ---
> > > > >  tools/lib/bpf/libbpf.c | 10 ++++++++--
> > > > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > > > > index 184ce1684dcd..a64e13c654f3 100644
> > > > > --- a/tools/lib/bpf/libbpf.c
> > > > > +++ b/tools/lib/bpf/libbpf.c
> > > > > @@ -597,7 +597,7 @@ struct elf_state {
> > > > >         size_t shstrndx; /* section index for section name strings */
> > > > >         size_t strtabidx;
> > > > >         struct elf_sec_desc *secs;
> > > > > -       int sec_cnt;
> > > > > +       size_t sec_cnt;
> > > > >         int btf_maps_shndx;
> > > > >         __u32 btf_maps_sec_btf_id;
> > > > >         int text_shndx;
> > > > > @@ -1369,6 +1369,13 @@ static int bpf_object__elf_init(struct bpf_object *obj)
> > > > >                 goto errout;
> > > > >         }
> > > > >
> > > > > +       if (elf_getshdrnum(obj->efile.elf, &obj->efile.sec_cnt)) {
> > > >
> > > > It bothers me that sec_cnt is initialized in bpf_object__elf_init, but
> > > > secs are allocated a bit later in bpf_object__elf_collect(). What if
> > > > we move elf_getshdrnum() call and sec_cnt initialization into
> > > > bpf_object__elf_collect()?
> > >
> > > Ack.
> > >
> > > My rational for placing it there was that it's closer to other elf_*()
> > > helper calls, but having it close to the allocation where it's used seems
> > > like a better option.
> > >
> > > Will change accordingly and send a v2 based on top of bpf-next.
> > >
> > > > > +               pr_warn("elf: failed to get the number of sections for %s: %s\n",
> > > > > +                       obj->path, elf_errmsg(-1));
> > > > > +               err = -LIBBPF_ERRNO__FORMAT;
> > > > > +               goto errout;
> > > > > +       }
> > > > > +
> > > > >         /* Elf is corrupted/truncated, avoid calling elf_strptr. */
> > > > >         if (!elf_rawdata(elf_getscn(elf, obj->efile.shstrndx), NULL)) {
> > > > >                 pr_warn("elf: failed to get section names strings from %s: %s\n",
> > > > > @@ -3315,7 +3322,6 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
> > > > >          * section. e_shnum does include sec #0, so e_shnum is the necessary
> > > > >          * size of an array to keep all the sections.
> > > > >          */
> > > > > -       obj->efile.sec_cnt = obj->efile.ehdr->e_shnum;
> > > > >         obj->efile.secs = calloc(obj->efile.sec_cnt, sizeof(*obj->efile.secs));
> >
> > Looking again I realized we're still allocation one more section than
> > necessary, even after 0d6988e16a12 ("libbpf: Fix section counting logic").
> 
> Yes, that's by design so to preserve ELF's 1-based indexing and not
> have to constantly adjust section index by -1 to do a lookup. Please
> keep it as is.

Understood, I'll leave it as is. Thanks!

> > elf_nextscn() skips sec #0, so (sec_cnt - 1) * sizeof(secs) should suffice.
> >
> >   /* In elfutils/libelf/elf_nextscn.c */
> >   Elf_Scn *elf_nextscn (Elf *elf, Elf_Scn *scn)
> >   {
> >     ...
> >
> >     if (scn == NULL)
> >       {
> >         /* If no section handle is given return the first (not 0th) section.
> >          Set scn to the 0th section and perform nextscn.  */
> >         if (elf->class == ELFCLASS32
> >            || (offsetof (Elf, state.elf32.scns)
> >                == offsetof (Elf, state.elf64.scns)))
> >         list = &elf->state.elf32.scns;
> >         else
> >         list = &elf->state.elf64.scns;
> >
> >         scn = &list->data[0];
> >       }
> >     ...
> >   }
> >
> > What do you think? If it make sense then I'll place the sec_cnt - 1 change
> > before the current patch unless otherwise suggested.
> >
> > > > >         if (!obj->efile.secs)
> > > > >                 return -ENOMEM;
> > > > > --
> > > > > 2.37.3
> > > > >