Hello, Eric. How have you been?

On Fri, Feb 16, 2024 at 09:23:00AM +0100, Eric Dumazet wrote:
...
> TSQ matters for high BDP, and is very time sensitive.
>
> Things like slow TX completions (firing from napi poll, BH context)
> can hurt TSQ.
>
> If we add on top of these slow TX completions, an additional work
> queue overhead, I really am not sure...

Just to be sure, the workqueue here executes in the same softirq
context as tasklets. This isn't the usual workqueue which has to go
through the scheduler. The only difference is that the workqueue does a
bit more work (e.g. to manage the currently-executing hashtable) than a
tasklet. It's unlikely to show a noticeable latency penalty in any
practical case, although the extra overhead would likely be visible in
targeted microbenchmarks where all that happens is scheduling and
running noop work items.

> I would recommend tests with pfifo_fast qdisc (not FQ which has a
> special override for TSQ limits)

David, do you think this is something we can do?

> Eventually we could add in TCP a measure of the time lost because of
> TSQ, regardless of the kick implementation (tasklet or workqueue).
> Measuring the delay between when a tcp socket got tcp_wfree approval
> to deliver more packets, and the time it finally delivered these
> packets could be implemented with a bpftrace program.

I don't have enough context here, but it sounds like you're worried
about adding latency in that path. This conversion is unlikely to make
a noticeable difference there. The interface and semantics are
workqueue, but the work items are executed exactly the same way, from
the same softirqs, as tasklets. I've appended a sketch of what the
conversion boils down to, and a rough cut of the bpftrace measurement
you're describing.

Would testing with pfifo_fast be sufficient to dispel your concern?
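To make the "same softirq context" point concrete, here's roughly what
the two scheduling paths look like side by side. This is a hypothetical
toy module, not the actual TSQ conversion; DECLARE_WORK, system_bh_wq
and the BH workqueue behavior are the real interfaces, everything else
(names, callbacks) is made up for illustration:

  #include <linux/module.h>
  #include <linux/workqueue.h>
  #include <linux/interrupt.h>

  static void demo_tasklet_fn(struct tasklet_struct *t)
  {
          /* runs from the tasklet softirq */
          pr_info("tasklet ran in %s context\n",
                  in_softirq() ? "softirq" : "task");
  }
  static DECLARE_TASKLET(demo_tasklet, demo_tasklet_fn);

  static void demo_work_fn(struct work_struct *work)
  {
          /* runs from the same BH context, not from a kworker task */
          pr_info("bh work ran in %s context\n",
                  in_softirq() ? "softirq" : "task");
  }
  static DECLARE_WORK(demo_work, demo_work_fn);

  static int __init demo_init(void)
  {
          tasklet_schedule(&demo_tasklet);        /* old way */
          queue_work(system_bh_wq, &demo_work);   /* new way */
          return 0;
  }

  static void __exit demo_exit(void)
  {
          tasklet_kill(&demo_tasklet);
          flush_work(&demo_work);
  }

  module_init(demo_init);
  module_exit(demo_exit);
  MODULE_LICENSE("GPL");

Both callbacks print "softirq"; queue_work() on a BH workqueue never
bounces through the scheduler, which is the whole point of keeping the
execution semantics unchanged.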
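As for the measurement, something along these lines could be a starting
point. It's a rough, untested sketch with assumptions baked in: I'm
guessing tcp_wfree() and tcp_tsq_write() as the probe points (the
latter being what the TSQ kick ends up calling on current kernels,
assuming it isn't inlined away), relying on BTF for the struct casts,
and tcp_wfree() fires for every skb rather than only when throttling is
released, so the numbers would only approximate the delay you describe:

  kprobe:tcp_wfree
  {
          // arg0 is the skb being freed; stamp its socket
          @tsq_wait[(uint64)((struct sk_buff *)arg0)->sk] = nsecs;
  }

  kprobe:tcp_tsq_write
  /@tsq_wait[(uint64)arg0]/
  {
          // arg0 is the struct sock * the deferred write runs on
          @usecs = hist((nsecs - @tsq_wait[(uint64)arg0]) / 1000);
          delete(@tsq_wait[(uint64)arg0]);
  }

  END
  {
          clear(@tsq_wait);
  }

If the conversion added latency, it should show up as a shift in that
histogram between the tasklet and BH workqueue kernels.

Thanks.

--
tejun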