On Wed, Aug 7, 2024 at 8:12 AM Jann Horn <jannh@xxxxxxxxxx> wrote: > > +Matthew and fsdevel list for pagecache question > > On Tue, Jul 30, 2024 at 10:39 PM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote: > > Harden build ID parsing logic, adding explicit READ_ONCE() where it's > > important to have a consistent value read and validated just once. > > > > Fixes tag below points to the code that moved this code into > > lib/buildid.c, and then subsequently was used in perf subsystem, making > > this code exposed to perf_event_open() users in v5.12+. > > One thing that still seems dodgy to me with this patch applied is the > call from build_id_parse() to find_get_page(), followed by reading the > page contents. My understanding of the page cache (which might be > incorrect) is that find_get_page() can return a page whose contents > have not been initialized yet, and you're supposed to check for > PageUptodate() or something like that before reading from it. > > Maybe Matthew can check if I understood that right? > > > Also, it might be a good idea to liberally spray READ_ONCE() around > all the remaining unannotated shared memory accesses in > build_id_parse(), get_build_id_32(), get_build_id_64() and > parse_build_id_buf(). Andi was against that, so I kept READ_ONCE() only where strictly necessary, AFAICT. > > > Cc: stable@xxxxxxxxxxxxxxx > > Cc: Jann Horn <jannh@xxxxxxxxxx> > > Suggested-by: Andi Kleen <ak@xxxxxxxxxxxxxxx> > > Fixes: bd7525dacd7e ("bpf: Move stack_map_get_build_id into lib") > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > --- > > lib/buildid.c | 51 +++++++++++++++++++++++++++------------------------ > > 1 file changed, 27 insertions(+), 24 deletions(-) > > > > diff --git a/lib/buildid.c b/lib/buildid.c > > index e02b5507418b..d21d86f6c19a 100644 > > --- a/lib/buildid.c > > +++ b/lib/buildid.c > > @@ -18,28 +18,29 @@ static int parse_build_id_buf(unsigned char *build_id, > > const void *note_start, > > Elf32_Word note_size) > > { > > + const char note_name[] = "GNU"; > > + const size_t note_name_sz = sizeof(note_name); > > Elf32_Word note_offs = 0, new_offs; > > + u32 name_sz, desc_sz; > > + const char *data; > > > > while (note_offs + sizeof(Elf32_Nhdr) < note_size) { > > Elf32_Nhdr *nhdr = (Elf32_Nhdr *)(note_start + note_offs); > > > > + name_sz = READ_ONCE(nhdr->n_namesz); > > + desc_sz = READ_ONCE(nhdr->n_descsz); > > if (nhdr->n_type == BUILD_ID && > > - nhdr->n_namesz == sizeof("GNU") && > > - !strcmp((char *)(nhdr + 1), "GNU") && > > - nhdr->n_descsz > 0 && > > - nhdr->n_descsz <= BUILD_ID_SIZE_MAX) { > > - memcpy(build_id, > > - note_start + note_offs + > > - ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr), > > - nhdr->n_descsz); > > - memset(build_id + nhdr->n_descsz, 0, > > - BUILD_ID_SIZE_MAX - nhdr->n_descsz); > > + name_sz == note_name_sz && > > + strcmp((char *)(nhdr + 1), note_name) == 0 && > > + desc_sz > 0 && desc_sz <= BUILD_ID_SIZE_MAX) { > > + data = note_start + note_offs + ALIGN(note_name_sz, 4); > > I don't think we have any guarantee here that this addition won't > result in an OOB pointer? > > > + memcpy(build_id, data, desc_sz); > > I think this can access OOB data (because "data" can already be OOB > and because "desc_sz" hasn't been checked against the amount of > remaining space in the page). Andi already pointed this out and I fixed it locally, thanks. > > > + memset(build_id + desc_sz, 0, BUILD_ID_SIZE_MAX - desc_sz); > > if (size) > > - *size = nhdr->n_descsz; > > + *size = desc_sz; > > return 0; > > } > > - new_offs = note_offs + sizeof(Elf32_Nhdr) + > > - ALIGN(nhdr->n_namesz, 4) + ALIGN(nhdr->n_descsz, 4); > > + new_offs = note_offs + sizeof(Elf32_Nhdr) + ALIGN(name_sz, 4) + ALIGN(desc_sz, 4); > > if (new_offs <= note_offs) /* overflow */ > > break; > > You check whether "new_offs" has wrapped here, but then on the next > loop iteration, you check for "note_offs + sizeof(Elf32_Nhdr) < > note_size". So if new_offs is 0xffffffff at this point, then I think > the overflow check here will be passed, the loop condition will be > true on 32-bit kernels (on 64-bit kernels it won't be because the > addition happens on 64-bit numbers thanks to sizeof()), and "nhdr" > will point in front of the note? Correct, and so I moved this new_offs calculation and overflow check to the beginning of the loop, which I think should capture this issue. For the while() condition itself I have: if (check_add_overflow(note_offs, note_size, ¬e_end)) return -EINVAL; while (note_offs < note_end - sizeof(Elf32_Nhdr) - note_name_sz) { ... } I'll try to post an updated version soon-ish, have been waiting for mm folks feedback before posting a new version. > > > note_offs = new_offs;