On Mon, Jul 11, 2022 at 9:19 PM Jon Doron <arilou@xxxxxxxxx> wrote:
>
> On Tue, Jul 12, 2022, 07:01 Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>>
>> On Sun, Jul 10, 2022 at 10:07 AM Jon Doron <arilou@xxxxxxxxx> wrote:
>> >
>> > On Sun, Jul 10, 2022, 18:16 Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>> >>
>> >> On Sat, Jul 9, 2022 at 10:43 PM Jon Doron <arilou@xxxxxxxxx> wrote:
>> >> >
>> >> > I was referring to the following:
>> >> > https://github.com/libbpf/libbpf-rs/blob/master/libbpf-rs/src/perf_buffer.rs
>> >>
>> >> How does your patch help libbpf-rs?
>> >>
>> >> Please don't top post.
>> >
>> > You will be able to implement a custom perf buffer consumer, as
>> > libbpf-rs already has good bindings via libbpf-sys, which is built
>> > from the C headers.
>> >
>> > Sorry for the top posting, I'm not home and replying from my phone.
>>
>> I can see us exposing per-CPU buffers for (very) advanced users,
>> something like:
>>
>> int perf_buffer__buffer(struct perf_buffer *pb, int buf_idx,
>>                         void **buf, size_t *buf_sz);
>
> Not sure I'm fully following what this API does; do you get a pointer
> to a message in the ring buffer? If so, how do you consume it without
> setting a new tail?
>
> Or do you get a full copy of the current ring buffer (which would mean
> an alloc and a copy that might hurt performance)? In that case you no
> longer need a set-tail or drain function.

No, it returns a pointer to the mmap()'ed per-CPU buffer memory,
including its header page, which contains the head/tail positions. As I
said, it's for advanced users: you need to know the layout and how to
consume the data.
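To make that concrete, here is a rough sketch (not the patch under
discussion) of what such a hand-rolled consumer could look like,
assuming the proposed perf_buffer__buffer() accessor above. The header
page is the kernel's struct perf_event_mmap_page from
<linux/perf_event.h>; drain_one_cpu() is just an illustrative name:

#include <stdint.h>
#include <stdio.h>
#include <linux/perf_event.h>
#include <bpf/libbpf.h>

/* Hypothetical helper: drain one per-CPU buffer by hand. Assumes the
 * proposed perf_buffer__buffer() accessor exists and returns the raw
 * mmap()'ed region (control page followed by the data pages). */
static int drain_one_cpu(struct perf_buffer *pb, int buf_idx)
{
	struct perf_event_mmap_page *hdr;
	void *buf;
	size_t buf_sz;
	uint64_t head, tail;
	uint8_t *base;

	if (perf_buffer__buffer(pb, buf_idx, &buf, &buf_sz))
		return -1;

	hdr = buf;                  /* first page is the control page */
	base = (uint8_t *)buf + hdr->data_offset;

	/* pairs with the kernel's store-release of data_head */
	head = __atomic_load_n(&hdr->data_head, __ATOMIC_ACQUIRE);
	tail = hdr->data_tail;      /* only user space writes the tail */

	while (tail != head) {
		struct perf_event_header *eh =
			(void *)(base + (tail & (hdr->data_size - 1)));

		/* NB: a record can wrap past the end of the data area;
		 * a real consumer has to stitch it back together from
		 * the two pieces (libbpf does this internally). */
		if (eh->type == PERF_RECORD_SAMPLE)
			printf("sample, %u bytes\n", eh->size);

		tail += eh->size;
	}

	/* publish the new tail so the kernel can reuse that space */
	__atomic_store_n(&hdr->data_tail, tail, __ATOMIC_RELEASE);
	return 0;
}

The acquire load of data_head paired with the release store of
data_tail is the perf ring's producer/consumer protocol; without it the
kernel may overwrite records that are still being read.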
> Also, regardless of whether this patchset is approved, it would
> probably be nice to have something like:
>
> int perf_buffer__state(struct perf_buffer *pb, int buf_idx,
>                        size_t *free_space, size_t *used_space);
>
> Cheers,
> --Jon.
>
>> Then in combination with perf_buffer__buffer_fd() you can implement
>> your own polling and processing. So you just use libbpf logic to set
>> up the buffers, but then don't call perf_buffer__poll() at all and
>> read records and update the tail on your own.
>>
>> But this combination of perf_buffer__raw_ring_buf() and
>> perf_buffer__set_ring_buf_tail() seems like a bad API, sorry.
>>
>> >> > Thanks,
>> >> > -- Jon.
>> >> >
>> >> > On Sun, Jul 10, 2022, 08:23 Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>> >> >>
>> >> >> On Fri, Jul 8, 2022 at 7:54 PM Jon Doron <arilou@xxxxxxxxx> wrote:
>> >> >> >
>> >> >> > On 08/07/2022, Andrii Nakryiko wrote:
>> >> >> > >On Thu, Jul 7, 2022 at 11:04 PM Jon Doron <arilou@xxxxxxxxx> wrote:
>> >> >> > >>
>> >> >> > >> From: Jon Doron <jond@xxxxxx>
>> >> >> > >>
>> >> >> > >> Add support for writing a custom event reader by exposing the
>> >> >> > >> ring buffer state and allowing to set its tail.
>> >> >> > >>
>> >> >> > >> A few simple examples where this kind of API is needed:
>> >> >> > >> 1. perf_event_read_simple allocates using malloc; perhaps you
>> >> >> > >> want to handle the wrap-around in some other way.
>> >> >> > >> 2. Since the perf buf is per-CPU, the order of the events is
>> >> >> > >> not guaranteed, for example:
>> >> >> > >> Given 3 events with timestamps t0 < t1 < t2, spread over more
>> >> >> > >> than one CPU, we can end up with the following state in the
>> >> >> > >> ring bufs:
>> >> >> > >> CPU[0] => [t0, t2]
>> >> >> > >> CPU[1] => [t1]
>> >> >> > >> When you consume the events from CPU[0], you can know that t1
>> >> >> > >> is missing (assuming there are no drops and your event data
>> >> >> > >> contains a sequential index).
>> >> >> > >> So one can simply do the following: for CPU[0], store the
>> >> >> > >> addresses of t0 and t2 in an array (without moving the tail,
>> >> >> > >> so the data is not overwritten), then move on to CPU[1] and
>> >> >> > >> store the address of t1 in the same array.
>> >> >> > >> You end up with something like void *arr[] = {&t0, &t1, &t2},
>> >> >> > >> which you can now consume in order, moving the tails as you
>> >> >> > >> process.
>> >> >> > >> 3. Assuming there are multiple CPUs and we want to start
>> >> >> > >> draining the messages from them, we can "pick" which one to
>> >> >> > >> start with according to the remaining free space in each ring
>> >> >> > >> buffer.
>> >> >> > >
>> >> >> > >All of the above use cases are sufficiently advanced that you,
>> >> >> > >as such an advanced user, should be able to write your own
>> >> >> > >perfbuf consumer code. There isn't a lot of code to set
>> >> >> > >everything up, and then you get full control over all the
>> >> >> > >details.
>> >> >> > >
>> >> >> > >I don't see this API as generally useful; it feels way too
>> >> >> > >low-level and special for inclusion in libbpf.
>> >> >> >
>> >> >> > Hi Andrii,
>> >> >> >
>> >> >> > I understand, but I was still hoping you would be willing to
>> >> >> > expose this API.
>> >> >> > libbpf has very simple and nice bindings to Rust and other
>> >> >> > languages, and implementing one of these use cases on top of
>> >> >> > those bindings is much simpler than going through raw libc or
>> >> >> > syscall APIs, while keeping all the simplicity you get for free
>> >> >> > from libbpf.
>> >> >> >
>> >> >> > Hope you will be willing to reconsider :)
>> >> >>
>> >> >> The discussion would have been different if you had mentioned that
>> >> >> motivation in the commit logs.
>> >> >> Please provide links to the "Rust and other languages" code that
>> >> >> uses this API.
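To round out the picture, a hedged sketch of the "implement your own
polling" approach mentioned above: use libbpf only for setup, never
call perf_buffer__poll(), and wait on the per-CPU descriptors directly.
perf_buffer__buffer_fd() and perf_buffer__buffer_cnt() are existing
libbpf accessors; drain_one_cpu() refers to the earlier sketch, and
poll_custom() is an illustrative name:

#include <sys/epoll.h>
#include <unistd.h>
#include <bpf/libbpf.h>

static int drain_one_cpu(struct perf_buffer *pb, int buf_idx);

/* Hypothetical custom poll loop: libbpf sets the buffers up, but we
 * decide when and in what order each per-CPU buffer is drained. */
static int poll_custom(struct perf_buffer *pb)
{
	struct epoll_event ev, events[16];
	size_t i, n = perf_buffer__buffer_cnt(pb);
	int epfd = epoll_create1(0);

	if (epfd < 0)
		return -1;

	for (i = 0; i < n; i++) {
		ev.events = EPOLLIN;
		ev.data.u32 = i;  /* remember which per-CPU buffer this is */
		if (epoll_ctl(epfd, EPOLL_CTL_ADD,
			      perf_buffer__buffer_fd(pb, i), &ev))
			goto err;
	}

	for (;;) {
		int j, ready = epoll_wait(epfd, events, 16, -1);

		if (ready < 0)
			goto err;
		for (j = 0; j < ready; j++)
			drain_one_cpu(pb, events[j].data.u32);
	}
err:
	close(epfd);
	return -1;
}

Owning the event loop like this is what would make tricks such as use
case 2 above possible: the consumer decides when each per-CPU tail
advances, so records can be held in place and merged in order across
CPUs before the space is released.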