On 8/20/24 22:36, Jens Axboe wrote:
> On 8/20/24 3:10 PM, David Wei wrote:
>>> +/*
>>> + * Doing min_timeout portion. If we saw any timeouts, events, or have work,
>>> + * wake up. If not, and we have a normal timeout, switch to that and keep
>>> + * sleeping.
>>> + */
>>> +static enum hrtimer_restart io_cqring_min_timer_wakeup(struct hrtimer *timer)
>>> +{
>>> +	struct io_wait_queue *iowq = container_of(timer, struct io_wait_queue, t);
>>> +	struct io_ring_ctx *ctx = iowq->ctx;
>>> +
>>> +	/* no general timeout, or shorter, we are done */
>>> +	if (iowq->timeout == KTIME_MAX ||
>>> +	    ktime_after(iowq->min_timeout, iowq->timeout))
>>> +		goto out_wake;
>>> +	/* work we may need to run, wake function will see if we need to wake */
>>> +	if (io_has_work(ctx))
>>> +		goto out_wake;
>>> +	/* got events since we started waiting, min timeout is done */
>>> +	if (iowq->cq_min_tail != READ_ONCE(ctx->rings->cq.tail))
>>> +		goto out_wake;
>>> +	/* if we have any events and min timeout expired, we're done */
>>> +	if (io_cqring_events(ctx))
>>> +		goto out_wake;
>>
>> How can ctx->rings->cq.tail be modified if the task is sleeping while
>> waiting for completions? What is doing the work?
>
> Good question. If we have a min_timeout of <something> and a batch count
> of <something>, ideally we don't want to wake the task to process when a
> single completion comes in. And this is how we handle DEFER_TASKRUN, but
> for anything else, the task will wake and process items. So it may have
> woken up to process an item and posted a completion before this timeout
> triggers. If that's the case, and min_timeout has expired (which it has
> when this handler is called), then we should wake up and return.
Also, for !DEFER_TASKRUN, the completion can be posted by an io-wq
worker or by another user thread sharing the ring.
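
To make the order of checks easier to follow, here is a rough userspace
sketch of the wake decision in the quoted hunk. It is not the kernel code:
struct wait_sketch, SKETCH_TIME_MAX and min_timer_should_wake() are
hypothetical names, plain integers stand in for ktime_t, has_work stands in
for io_has_work(), and "tail != head" is only an assumed approximation of
io_cqring_events(). The part of the handler that re-arms the timer for the
normal timeout is not in the quoted hunk, so it is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for KTIME_MAX, meaning "no overall timeout was given". */
#define SKETCH_TIME_MAX INT64_MAX

/* Hypothetical, flattened stand-in for io_wait_queue plus the CQ ring state. */
struct wait_sketch {
	int64_t timeout;      /* overall timeout, SKETCH_TIME_MAX if none */
	int64_t min_timeout;  /* min timeout, same units as timeout */
	uint32_t cq_min_tail; /* CQ tail sampled when the wait started */
	uint32_t cq_tail;     /* CQ tail as seen when the min timer fires */
	uint32_t cq_head;     /* CQ head, used to model "any events pending" */
	bool has_work;        /* stand-in for io_has_work(ctx) */
};

/*
 * Returns true where the quoted handler would "goto out_wake", false where
 * it would fall through and keep sleeping on the normal timeout.
 */
static bool min_timer_should_wake(const struct wait_sketch *w)
{
	/* no general timeout, or min_timeout is later than it: we are done */
	if (w->timeout == SKETCH_TIME_MAX || w->min_timeout > w->timeout)
		return true;
	/* pending work: let the wake function decide */
	if (w->has_work)
		return true;
	/* someone posted completions since the wait started */
	if (w->cq_min_tail != w->cq_tail)
		return true;
	/* events were already pending and min timeout has expired */
	if (w->cq_tail != w->cq_head)
		return true;
	return false;
}
```

With timeout = 1000, min_timeout = 100 and cq_min_tail == cq_tail == cq_head,
the sketch keeps sleeping. If an io-wq worker or another thread sharing the
ring then posts one completion, cq_tail moves past cq_min_tail and the third
check reports a wake, which is the case David asked about.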
--
Pavel Begunkov