On Wed, Aug 14, 2024 at 1:21 AM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
> On Tue, Aug 13, 2024 at 1:59 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> >
> > On Tue, Aug 13, 2024 at 2:29 AM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
> > > Harden build ID parsing logic, adding explicit READ_ONCE() where it's
> > > important to have a consistent value read and validated just once.
> > >
> > > Also, as pointed out by Andi Kleen, we need to make sure that entire ELF
> > > note is within a page bounds, so move the overflow check up and add an
> > > extra note_size boundaries validation.
> > >
> > > Fixes tag below points to the code that moved this code into
> > > lib/buildid.c, and then subsequently was used in perf subsystem, making
> > > this code exposed to perf_event_open() users in v5.12+.
> >
> > Sorry, I missed some things in previous review rounds:
> >
> > [...]
> > > @@ -18,31 +18,37 @@ static int parse_build_id_buf(unsigned char *build_id,
> > [...]
> > >                 if (nhdr->n_type == BUILD_ID &&
> > > -                   nhdr->n_namesz == sizeof("GNU") &&
> > > -                   !strcmp((char *)(nhdr + 1), "GNU") &&
> > > -                   nhdr->n_descsz > 0 &&
> > > -                   nhdr->n_descsz <= BUILD_ID_SIZE_MAX) {
> > > -                       memcpy(build_id,
> > > -                              note_start + note_offs +
> > > -                              ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr),
> > > -                              nhdr->n_descsz);
> > > -                       memset(build_id + nhdr->n_descsz, 0,
> > > -                              BUILD_ID_SIZE_MAX - nhdr->n_descsz);
> > > +                   name_sz == note_name_sz &&
> > > +                   strcmp((char *)(nhdr + 1), note_name) == 0 &&
> >
> > Please change this to something like "memcmp((char *)(nhdr + 1),
> > note_name, note_name_sz) == 0" to ensure that we can't run off the end
> > of the page if there are no null bytes in the rest of the page.
>
> I did switch this to strncmp() at some earlier point, but then
> realized that there is no point because note_name is controlled by us
> and will ensure there is a zero at byte (note_name_sz - 1). So I don't
> think memcmp() buys us anything.

There are two reasons why using strcmp() here makes me uneasy.

First: We're still operating on shared memory that can concurrently
change.

Let's say strcmp() is implemented like this (this is the generic C
implementation in the kernel, which I think is the implementation
that's used for x86-64):

int strcmp(const char *cs, const char *ct)
{
        unsigned char c1, c2;

        while (1) {
                c1 = *cs++;
                c2 = *ct++;
                if (c1 != c2)
                        return c1 < c2 ? -1 : 1;
                if (!c1)
                        break;
        }
        return 0;
}

No READ_ONCE() or anything like that - it's not designed for being
used on concurrently changing memory.

And let's say you call it like strcmp(<shared memory>, "GNU"), and
we're now in the fourth iteration. If the compiler decides to re-fetch
the value of "c1" from memory for each of the two conditions, then it
could be that the "if (c1 != c2)" sees c1='\0' and c2='\0', so the
condition evaluates as false; but then at the "if (!c1)", the value in
memory has changed, and we see c1='A'. So now in the next round, we'll
be accessing out-of-bounds memory behind the 4-byte string constant
"GNU".

So I don't think strcmp() on memory that can concurrently change is
allowed.

(It actually seems like the generic memcmp() is also implemented
without READ_ONCE(), maybe we should change that...)

Second: You are assuming that if one side of the strcmp() is at most
four bytes long (including null terminator), then strcmp() also won't
access more than 4 bytes of the other string, even if that string does
not have a null terminator at index 3. I don't think that's part of
the normal strcmp() API contract.
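
For illustration, here is a minimal userspace sketch of the kind of bounded comparison I have in mind. The helper name note_name_matches(), its shared_avail parameter, and the READ_BYTE_ONCE() macro are hypothetical stand-ins and not kernel APIs; kernel code would use READ_ONCE() itself. The sketch assumes a volatile load is enough to make the compiler read each shared byte exactly once. The loop bound comes only from the trusted side, so it never reads more than name_sz bytes of the shared buffer, whatever that buffer contains:

```c
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's READ_ONCE() on one byte.
 * Assumption: a volatile access prevents the compiler from re-fetching.
 */
#define READ_BYTE_ONCE(p) (*(const volatile unsigned char *)(p))

/*
 * Hypothetical helper: compare exactly name_sz bytes of a buffer that may
 * change concurrently against a trusted, NUL-terminated constant.
 *
 * shared_avail is the number of bytes that are valid to read at 'shared'
 * (for example, the bytes left before the end of the page).
 *
 * Returns 1 on match, 0 otherwise.
 */
static int note_name_matches(const unsigned char *shared, size_t shared_avail,
			     const char *name, size_t name_sz)
{
	unsigned char diff = 0;
	size_t i;

	/* Refuse to read past the region the caller says is valid. */
	if (name_sz > shared_avail)
		return 0;

	/*
	 * Each shared byte is loaded once and used once. The loop bound
	 * depends only on name_sz, never on the contents of 'shared'.
	 */
	for (i = 0; i < name_sz; i++)
		diff |= READ_BYTE_ONCE(shared + i) ^ (unsigned char)name[i];

	return diff == 0;
}
```

A caller would pass sizeof("GNU") as name_sz so that the terminating NUL is part of the comparison, e.g. note_name_matches(p, bytes_left, "GNU", sizeof("GNU")). A concurrent writer can still make the result stale, but it cannot push the reads out of bounds.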