On Thu, Nov 05, 2020 at 08:19:47PM -0800, Alexei Starovoitov wrote:
> > Subject: [PATCH] Update perf ring buffer to prevent corruption from
> >  bpf_perf_output_event()

$Subject is broken, it lacks subsystem prefix.

> >
> > The bpf_perf_output_event() helper takes a sample size parameter of u64, but
> > the underlying perf ring buffer uses a u16 internally. This 64KB maximum size
> > has to also accommodate a variable sized header. Failure to observe this
> > restriction can result in corruption of the perf ring buffer as samples
> > overlap.
> >
> > Track the sample size and return -E2BIG if too big to fit into the u16
> > size parameter.
> >
> > Signed-off-by: Kevin Sheldrake <kevin.sheldrake@xxxxxxxxxxxxx>
> > ---
> >  include/linux/perf_event.h |  2 +-
> >  kernel/events/core.c       | 40 ++++++++++++++++++++++++++--------------
> >  2 files changed, 27 insertions(+), 15 deletions(-)
> >
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 0c19d27..b9802e5 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -1060,7 +1060,7 @@ extern void perf_output_sample(struct perf_output_handle *handle,
> >  				struct perf_event_header *header,
> >  				struct perf_sample_data *data,
> >  				struct perf_event *event);
> > -extern void perf_prepare_sample(struct perf_event_header *header,
> > +extern int perf_prepare_sample(struct perf_event_header *header,
> >  				struct perf_sample_data *data,
> >  				struct perf_event *event,
> >  				struct pt_regs *regs);
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index da467e1..c6c4a3c 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -7016,15 +7016,17 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
> >  	return callchain ?: &__empty_callchain;
> >  }
> >
> > -void perf_prepare_sample(struct perf_event_header *header,
> > +int perf_prepare_sample(struct perf_event_header *header,
> >  			 struct perf_sample_data *data,
> >  			 struct perf_event *event,
> >  			 struct pt_regs *regs)

please re-align things.

> >  {
> >  	u64 sample_type = event->attr.sample_type;
> > +	u32 header_size = header->size;
> > +
> >
> >  	header->type = PERF_RECORD_SAMPLE;
> > -	header->size = sizeof(*header) + event->header_size;
> > +	header_size = sizeof(*header) + event->header_size;
> >
> >  	header->misc = 0;
> >  	header->misc |= perf_misc_flags(regs);
> > @@ -7042,7 +7044,7 @@ void perf_prepare_sample(struct perf_event_header *header,
> >
> >  		size += data->callchain->nr;
> >
> > -		header->size += size * sizeof(u64);
> > +		header_size += size * sizeof(u64);
> >  	}
> >
> >  	if (sample_type & PERF_SAMPLE_RAW) {
> > @@ -7067,7 +7069,7 @@ void perf_prepare_sample(struct perf_event_header *header,
> >  			size = sizeof(u64);
> >  		}
> >
> > -		header->size += size;
> > +		header_size += size;
> >  	}

AFAICT perf_raw_frag::size is a u32, so the above addition can already
fully overflow. Best is probably to make header_size u64 and delay that
until the final tally below.

> >
> >  	if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
> > @@ -7162,14 +7164,20 @@ void perf_prepare_sample(struct perf_event_header *header,
> >  		 * Make sure this doesn't happen by using up to U16_MAX bytes
> >  		 * per sample in total (rounded down to 8 byte boundary).
> >  		 */
> > -		size = min_t(size_t, U16_MAX - header->size,
> > +		size = min_t(size_t, U16_MAX - header_size,
> >  			     event->attr.aux_sample_size);
> >  		size = rounddown(size, 8);
> >  		size = perf_prepare_sample_aux(event, data, size);
> >
> > -		WARN_ON_ONCE(size + header->size > U16_MAX);
> > -		header->size += size;
> > +		WARN_ON_ONCE(size + header_size > U16_MAX);
> > +		header_size += size;
> >  	}
> > +
> > +	if (header_size > U16_MAX)
> > +		return -E2BIG;
> > +
> > +	header->size = header_size;
> > +
> >  	/*
> >  	 * If you're adding more sample types here, you likely need to do
> >  	 * something about the overflowing header::size, like repurpose the
> > @@ -7179,6 +7187,8 @@ void perf_prepare_sample(struct perf_event_header *header,
> >  	 * do here next.
> >  	 */
> >  	WARN_ON_ONCE(header->size & 7);
> > +
> > +	return 0;
> >  }
> >
> >  static __always_inline int
> > @@ -7196,7 +7206,9 @@ __perf_event_output(struct perf_event *event,
> >  	/* protect the callchain buffers */
> >  	rcu_read_lock();
> >
> > -	perf_prepare_sample(&header, data, event, regs);
> > +	err = perf_prepare_sample(&header, data, event, regs);
> > +	if (err)
> > +		goto exit;

This is wrong I think. The thing is that when output_begin() below
returns an error, there either is no buffer (in which case we can't do
anything much at all) or it will have incremented rb->lost.

This OTOH will completely fail to report the loss.

The error case here is to immediately try and emit a RECORD_LOST event,
but then please also consider these patches:

  https://lkml.kernel.org/r/20201030151345.540479897@xxxxxxxxxxxxx

(which I'll be pushing into tip/perf/urgent soonish)

> >
> >  	err = output_begin(&handle, event, header.size);
> >  	if (err)
> > --
> > 2.7.4
> >
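
To illustrate the u32 overflow comment above, here is a minimal userspace
sketch, not kernel code. It assumes a simplified sample made of a fixed
part plus one raw payload whose size is a u32; struct sample_parts,
tally_sample_size() and tally_sample_size_u32() are hypothetical names
used only for this illustration. The u32 tally can wrap to a small value
that would pass a later U16_MAX check, while the u64 tally with a single
final check catches it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the pieces that make up one sample. */
struct sample_parts {
	uint32_t fixed_size;	/* models sizeof(*header) + event->header_size */
	uint32_t raw_size;	/* models a u32 raw payload size */
};

/*
 * Tally in a u64 and range-check once at the end.
 * Returns 0 and stores the size in *out, or -E2BIG if it does not fit
 * in a u16.
 */
int tally_sample_size(const struct sample_parts *p, uint16_t *out)
{
	uint64_t total = p->fixed_size;

	total += p->raw_size;	/* two u32 values cannot wrap a u64 */

	if (total > UINT16_MAX)
		return -E2BIG;

	*out = (uint16_t)total;
	return 0;
}

/* The same tally in a u32, to show how the sum can wrap around. */
uint32_t tally_sample_size_u32(const struct sample_parts *p)
{
	uint32_t total = p->fixed_size;

	total += p->raw_size;	/* wraps modulo 2^32 */
	return total;
}
```

With fixed_size = 64 and raw_size = UINT32_MAX - 31, the u32 tally wraps
to 32, which looks like a valid size, whereas tally_sample_size() returns
-E2BIG.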