On 8/20/24 4:14 PM, David Wei wrote: > On 2024-08-20 15:13, Pavel Begunkov wrote: >>>>>> On 8/20/24 2:08 PM, David Wei wrote: >>>>> To rephase the question, why is the original code calling >>>>> schedule_hrtimeout_range_clock() not needing to differentiate behaviour >>>>> between defer taskrun and not? >>>> >>>> Because that part is the same, the task schedules out and goes to sleep. >>>> That has always been the same regardless of how the ring is setup. Only >>>> difference is that DEFER_TASKRUN doesn't add itself to ctx->wait, and >>>> hence cannot be woken by a wake_up(ctx->wait). We have to wake the task >>>> manually. >>>> >>> >>> io_cqring_timer_wakeup() is the timer expired callback which calls >>> wake_up_process() or io_cqring_wake() depending on DEFER_TASKRUN. >>> >>> The original code calling schedule_hrtimeout_range_clock() uses >>> hrtimer_sleeper instead, which has a default timer expired callback set >>> to hrtimer_wakeup(). >>> >>> hrtimer_wakeup() only calls wake_up_process(). >>> >>> My question is: why this asymmetry? Why does the new custom callback >>> require io_cqring_wake()? >> >> That's what I'm saying, it doesn't need and doesn't really want it. >> From the correctness point of view, it's ok since we wake up a >> (unnecessarily) larger set of tasks. >> > > Yeah your explanation that came in while I was writing the email > answered it, thanks Pavel. Hah and now I see what you meant - yeah we can just remove the distinction. I didn't see anything in testing, but I also didn't have multiple tasks waiting on a ring, nor would you. So it doesn't really matter, but I'll clean it up so there's no confusion. -- Jens Axboe