On Mon, Sep 19, 2022 at 03:19:36PM -0500, David Vernet wrote:
> > > +static void __bpf_user_ringbuf_sample_release(struct bpf_ringbuf *rb, size_t size, u64 flags)
> > > +{
> > > +	u64 consumer_pos;
> > > +	u32 rounded_size = round_up(size + BPF_RINGBUF_HDR_SZ, 8);
> > > +
> > > +	/* Using smp_load_acquire() is unnecessary here, as the busy-bit
> > > +	 * prevents another task from writing to consumer_pos after it was read
> > > +	 * by this task with smp_load_acquire() in __bpf_user_ringbuf_peek().
> > > +	 */
> > > +	consumer_pos = rb->consumer_pos;
> > > +	/* Synchronizes with smp_load_acquire() in user-space producer. */
> > > +	smp_store_release(&rb->consumer_pos, consumer_pos + rounded_size);
> > > +
> > > +	/* Prevent the clearing of the busy-bit from being reordered before the
> > > +	 * storing of the updated rb->consumer_pos value.
> > > +	 */
> > > +	smp_mb__before_atomic();
> > > +	atomic_set(&rb->busy, 0);
> > > +
> > > +	if (flags & BPF_RB_FORCE_WAKEUP)
> > > +		irq_work_queue(&rb->work);
> >
> > I think this part is new, you decided to define that FORCE_WAKEUP
> > sends a wakeup after every single consumed sample? I have no strong
> > opinion on this, tbh, just wonder if it wasn't enough to do it once
> > after drain?
>
> I didn't have a strong reason for doing this other than that I think it
> more closely matches the behavior for BPF_MAP_TYPE_RINGBUF (which invokes
> irq_work_queue() after every call to bpf_ringbuf_commit() if
> BPF_RB_FORCE_WAKEUP is passed). Let's just match that behavior unless we
> have a good reason not to? I think that will be more intuitive for users.

Hmm, something else to consider is that if we move the busy-bit setting
into bpf_user_ringbuf_drain() per your suggestion below, the critical
section is now the whole sample drain loop. That's of course _not_ the
case for BPF_MAP_TYPE_RINGBUF, which just holds the spinlock while
reserving the sample. It seems excessive to invoke irq_work_queue() while
the busy bit is held, so I think we should instead define
BPF_RB_FORCE_WAKEUP to mean that a single wakeup is always sent after the
drain loop completes, even if no sample was drained. Let me know if you
disagree, but for now I'll work on spinning up a v6 that only issues the
forced wakeup event once after drain (rough sketch of what I have in mind
in the P.S. below).

> > > +}
> > > +
> > > +BPF_CALL_4(bpf_user_ringbuf_drain, struct bpf_map *, map,
> > > +	   void *, callback_fn, void *, callback_ctx, u64, flags)
> > > +{
> > > +	struct bpf_ringbuf *rb;
> > > +	long samples, discarded_samples = 0, ret = 0;
> > > +	bpf_callback_t callback = (bpf_callback_t)callback_fn;
> > > +	u64 wakeup_flags = BPF_RB_NO_WAKEUP | BPF_RB_FORCE_WAKEUP;
> > > +
> > > +	if (unlikely(flags & ~wakeup_flags))
> > > +		return -EINVAL;
> > > +
> > > +	rb = container_of(map, struct bpf_ringbuf_map, map)->rb;
> > > +	for (samples = 0; samples < BPF_MAX_USER_RINGBUF_SAMPLES && ret == 0; samples++) {
> > > +		int err;
> > > +		u32 size;
> > > +		void *sample;
> > > +		struct bpf_dynptr_kern dynptr;
> > > +
> > > +		err = __bpf_user_ringbuf_peek(rb, &sample, &size);
> >
> > So I also just realized that ringbuf_peek will keep setting/resetting
> > the busy flag, and in practically all cases that's completely useless
> > work, as we don't intend to have competing consumers, right? So maybe
> > move the busy bit handling into drain itself and document that peek
> > expects busy to be taken care of?
> >
> > This should be noticeably faster when there are multiple records
> > consumed in one drain.
>
> Great idea, I'll do this in v6.

Thanks,
David
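
P.S. For concreteness, here's a rough sketch of the v6 drain shape I have
in mind. This is illustrative only, not the actual v6 diff: the busy bit
is taken once for the entire drain (so __bpf_user_ringbuf_peek() and
__bpf_user_ringbuf_sample_release() would be adjusted to assume it's
already held), and a single wakeup is issued after the busy bit is
dropped. The loop body follows the structure of the v5 snippet above, and
the discarded_samples accounting / return value are hand-waved here.

BPF_CALL_4(bpf_user_ringbuf_drain, struct bpf_map *, map,
	   void *, callback_fn, void *, callback_ctx, u64, flags)
{
	struct bpf_ringbuf *rb;
	long samples, discarded_samples = 0, ret = 0;
	bpf_callback_t callback = (bpf_callback_t)callback_fn;
	u64 wakeup_flags = BPF_RB_NO_WAKEUP | BPF_RB_FORCE_WAKEUP;

	if (unlikely(flags & ~wakeup_flags))
		return -EINVAL;

	rb = container_of(map, struct bpf_ringbuf_map, map)->rb;

	/* Take the busy bit once for the whole drain, instead of once
	 * per sample in peek/release. atomic_xchg() implies a full
	 * barrier, so no extra ordering is needed on acquisition.
	 */
	if (atomic_xchg(&rb->busy, 1))
		return -EBUSY;

	for (samples = 0; samples < BPF_MAX_USER_RINGBUF_SAMPLES && ret == 0; samples++) {
		int err;
		u32 size;
		void *sample;
		struct bpf_dynptr_kern dynptr;

		/* Peek no longer touches rb->busy; drain owns it. */
		err = __bpf_user_ringbuf_peek(rb, &sample, &size);
		if (err) {
			if (err != -ENODATA)
				ret = err;
			break;
		}

		bpf_dynptr_init(&dynptr, sample, BPF_DYNPTR_TYPE_LOCAL, 0, size);
		ret = callback((uintptr_t)&dynptr, (uintptr_t)callback_ctx, 0, 0, 0);
		__bpf_user_ringbuf_sample_release(rb, size, flags);
	}

	/* Prevent the clearing of the busy bit from being reordered
	 * before the last consumer_pos update in sample_release.
	 */
	smp_mb__before_atomic();
	atomic_set(&rb->busy, 0);

	/* A single wakeup after draining: unconditional with
	 * BPF_RB_FORCE_WAKEUP, otherwise only if something was consumed.
	 */
	if (flags & BPF_RB_FORCE_WAKEUP)
		irq_work_queue(&rb->work);
	else if (!(flags & BPF_RB_NO_WAKEUP) && samples - discarded_samples > 0)
		irq_work_queue(&rb->work);

	return ret ? ret : samples - discarded_samples;
}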
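
P.P.S. For anyone following along: the smp_store_release() on
rb->consumer_pos above pairs with an acquire load on the user-space
producer side, which reads consumer_pos to compute free space before
reserving. Conceptually it's something like the below (a hypothetical
user-space snippet with made-up names like consumer_pos/producer_pos/
ring_size, using a libbpf-style smp_load_acquire() helper; this is not
the actual libbpf reserve path):

	/* Both positions live in the mmap()ed ring buffer pages. */
	__u64 cons_pos, prod_pos, avail;
	/* Total sample footprint, rounded up to 8 bytes. */
	__u64 total_size = (sample_size + BPF_RINGBUF_HDR_SZ + 7) & ~7UL;

	/* Pairs with smp_store_release(&rb->consumer_pos, ...) in
	 * __bpf_user_ringbuf_sample_release(): once we observe the
	 * updated consumer_pos, the kernel is guaranteed to be done
	 * with all samples that precede it.
	 */
	cons_pos = smp_load_acquire(consumer_pos);
	prod_pos = *producer_pos; /* only this producer writes it */

	avail = ring_size - (prod_pos - cons_pos);
	if (avail < total_size)
		return -ENOSPC; /* ring is full, try again later */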