On 11/22/24 10:25 AM, Pavel Begunkov wrote: > On 11/22/24 17:11, Jens Axboe wrote: >> On 11/22/24 10:07 AM, Pavel Begunkov wrote: >>> On 11/22/24 16:12, Jens Axboe wrote: >>> ... >>>> static inline void io_req_local_work_add(struct io_kiocb *req, >>>> struct io_ring_ctx *ctx, >>>> - unsigned flags) >>>> + unsigned tw_flags) >>>> { >>>> - unsigned nr_wait, nr_tw, nr_tw_prev; >>>> - struct llist_node *head; >>>> + unsigned nr_tw, nr_tw_prev, nr_wait; >>>> + unsigned long flags; >>>> /* See comment above IO_CQ_WAKE_INIT */ >>>> BUILD_BUG_ON(IO_CQ_WAKE_FORCE <= IORING_MAX_CQ_ENTRIES); >>>> /* >>>> - * We don't know how many reuqests is there in the link and whether >>>> - * they can even be queued lazily, fall back to non-lazy. >>>> + * We don't know how many requests are in the link and whether they can >>>> + * even be queued lazily, fall back to non-lazy. >>>> */ >>>> if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) >>>> - flags &= ~IOU_F_TWQ_LAZY_WAKE; >>>> + tw_flags &= ~IOU_F_TWQ_LAZY_WAKE; >>>> - guard(rcu)(); >>> >>> protects against ctx->task deallocation, see a comment in >>> io_ring_exit_work() -> synchronize_rcu() >> >> Yeah that's just an editing mistake. >> >>>> + spin_lock_irqsave(&ctx->work_lock, flags); >>>> + wq_list_add_tail(&req->io_task_work.work_node, &ctx->work_list); >>>> + nr_tw_prev = ctx->work_items++; >>> >>> Is there a good reason why it changes the semantics of >>> what's stored across adds? It was assigning a corrected >>> nr_tw, this one will start heavily spamming with wake_up() >>> in some cases. >> >> Not sure I follow, how so? nr_tw_prev will be the previous count, just >> like before. Except we won't need to dig into the list to find it, we >> have it readily available. nr_tw will be the current code, or force wake >> if needed. As before. > > The problem is what it stores, not how and where. Before req->nr_tw > could've been set to IO_CQ_WAKE_FORCE, in which case following > requests are not going to attempt waking up the task, now work_items > is just a counter. > > Let's say you've got a bunch of non-lazy adds coming close to each > other. The first sets IO_CQ_WAKE_FORCE and wakes the task, and > others just queue themselves in the list. Now, every single one > of them will try to wake_up() as long as ->cq_wait_nr is large > enough. If we really care about the non-lazy path as much, we can just use the same storing scheme as we did in req->nr_tw, except in ->work_items instead. Not a big deal imho. -- Jens Axboe