On Tue, 5 May 2020 at 16:28, Maxim Mikityanskiy <maximmi@xxxxxxxxxxxx> wrote:
>
[...]
> >
> > -       if (((d->addr + d->len) & q->chunk_mask) != (d->addr & q->chunk_mask) ||
> > -           d->options) {
> > +static inline bool xskq_cons_is_valid_desc(struct xsk_queue *q,
> > +                                           struct xdp_desc *d,
> > +                                           struct xdp_umem *umem)
> > +{
> > +       if (!xp_validate_desc(umem->pool, d)) {
>
> I did some performance debugging and came to conclusion that this
> function call is the culprit of the TX speed degradation that I
> experience. I still don't know if it's the only reason or not, but I
> clearly see a degradation when xskq_cons_is_valid_desc is not fully
> inlined, but calls a function. E.g., I've put the code that handles the
> aligned mode into a separate function in a different file, and it caused
> the similar speed decrease.
>

Thanks for looking into this and tracking it down! I'll make sure the
xp_validate_desc() call is inlined in the next revision.

A note to myself: I need to check the performance of an LLVM build
with LTO enabled.


Cheers,
Björn