On Wed, Aug 14, 2024 at 9:14 AM Jann Horn <jannh@xxxxxxxxxx> wrote:
>
> On Wed, Aug 14, 2024 at 1:21 AM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> > On Tue, Aug 13, 2024 at 1:59 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Aug 13, 2024 at 2:29 AM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
> > > > Harden build ID parsing logic, adding explicit READ_ONCE() where it's
> > > > important to have a consistent value read and validated just once.
> > > >
> > > > Also, as pointed out by Andi Kleen, we need to make sure that entire ELF
> > > > note is within a page bounds, so move the overflow check up and add an
> > > > extra note_size boundaries validation.
> > > >
> > > > Fixes tag below points to the code that moved this code into
> > > > lib/buildid.c, and then subsequently was used in perf subsystem, making
> > > > this code exposed to perf_event_open() users in v5.12+.
> > >
> > > Sorry, I missed some things in previous review rounds:
> > >
> > > [...]
> > > > @@ -18,31 +18,37 @@ static int parse_build_id_buf(unsigned char *build_id,
> > > [...]
> > > >                 if (nhdr->n_type == BUILD_ID &&
> > > > -                   nhdr->n_namesz == sizeof("GNU") &&
> > > > -                   !strcmp((char *)(nhdr + 1), "GNU") &&
> > > > -                   nhdr->n_descsz > 0 &&
> > > > -                   nhdr->n_descsz <= BUILD_ID_SIZE_MAX) {
> > > > -                       memcpy(build_id,
> > > > -                              note_start + note_offs +
> > > > -                              ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr),
> > > > -                              nhdr->n_descsz);
> > > > -                       memset(build_id + nhdr->n_descsz, 0,
> > > > -                              BUILD_ID_SIZE_MAX - nhdr->n_descsz);
> > > > +                   name_sz == note_name_sz &&
> > > > +                   strcmp((char *)(nhdr + 1), note_name) == 0 &&
> > >
> > > Please change this to something like "memcmp((char *)(nhdr + 1),
> > > note_name, note_name_sz) == 0" to ensure that we can't run off the end
> > > of the page if there are no null bytes in the rest of the page.
> >
> > I did switch this to strncmp() at some earlier point, but then
> > realized that there is no point because note_name is controlled by us
> > and will ensure there is a zero at byte (note_name_sz - 1). So I don't
> > think memcmp() buys us anything.
>
> There are two reasons why using strcmp() here makes me uneasy.
>
>
> First: We're still operating on shared memory that can concurrently change.
>
> Let's say strcmp is implemented like this, this is the generic C
> implementation in the kernel (which I think is the implementation
> that's used for x86-64):
>
> int strcmp(const char *cs, const char *ct)
> {
>         unsigned char c1, c2;
>
>         while (1) {
>                 c1 = *cs++;
>                 c2 = *ct++;
>                 if (c1 != c2)
>                         return c1 < c2 ? -1 : 1;
>                 if (!c1)
>                         break;
>         }
>         return 0;
> }
>
> No READ_ONCE() or anything like that - it's not designed for being
> used on concurrently changing memory.
>
> And let's say you call it like strcmp(<shared memory>, "GNU"), and
> we're now in the fourth iteration. If the compiler decides to re-fetch
> the value of "c1" from memory for each of the two conditions, then it
> could be that the "if (c1 != c2)" sees c1='\0' and c2='\0', so the
> condition evaluates as false; but then at the "if (!c1)", the value in
> memory changed, and we see c1='A'. So now in the next round, we'll be
> accessing out-of-bounds memory behind the 4-byte string constant
> "GNU".
>
> So I don't think strcmp() on memory that can concurrently change is allowed.
>
> (It actually seems like the generic memcmp() is also implemented
> without READ_ONCE(), maybe we should change that...)
>
>
> Second: You are assuming that if one side of the strcmp() is at most
> four bytes long (including null terminator), then strcmp() also won't
> access more than 4 bytes of the other string, even if that string does
> not have a null terminator at index 4. I don't think that's part of
> the normal strcmp() API contract.

Ok, I'm convinced, all fair points.
I'll switch to memcmp(), there is no downside to that anyways.
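
For illustration only, here is a minimal userspace sketch of the kind of
bounded comparison discussed above. It is not the code from the patch. The
helper name note_name_matches() and the READ_BYTE_ONCE() macro are
hypothetical; the macro is just a volatile-cast stand-in for the kernel's
READ_ONCE(). The sketch assumes the caller has already checked that name_sz
bytes of the note name lie within the mapped page.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for READ_ONCE() on a single byte. */
#define READ_BYTE_ONCE(p) (*(const volatile unsigned char *)(p))

/*
 * Hypothetical helper: compare exactly name_sz bytes of a note name that
 * lives in memory which may change concurrently against a trusted constant.
 * Returns 1 on match, 0 on mismatch.
 */
static int note_name_matches(const void *shared, const char *name,
			     size_t name_sz)
{
	const unsigned char *s = shared;
	size_t i;

	/* The loop bound depends only on name_sz, never on the shared bytes. */
	for (i = 0; i < name_sz; i++) {
		/* Each shared byte is loaded once; the decision uses that copy. */
		unsigned char c = READ_BYTE_ONCE(s + i);

		if (c != (unsigned char)name[i])
			return 0;
	}
	return 1;
}
```

Called as note_name_matches(nhdr + 1, "GNU", sizeof("GNU")), this reads at
most four bytes of the shared buffer, so a missing or disappearing null
terminator cannot push the comparison past the end of the note name.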