Re: [PATCH 3/5] io_uring: implement our own schedule timeout handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/20/24 4:51 PM, Jens Axboe wrote:
> On 8/20/24 4:19 PM, Jens Axboe wrote:
>> On 8/20/24 4:14 PM, David Wei wrote:
>>> On 2024-08-20 15:13, Pavel Begunkov wrote:
>>>>>>>> On 8/20/24 2:08 PM, David Wei wrote:
>>>>>>> To rephase the question, why is the original code calling
>>>>>>> schedule_hrtimeout_range_clock() not needing to differentiate behaviour
>>>>>>> between defer taskrun and not?
>>>>>>
>>>>>> Because that part is the same, the task schedules out and goes to sleep.
>>>>>> That has always been the same regardless of how the ring is setup. Only
>>>>>> difference is that DEFER_TASKRUN doesn't add itself to ctx->wait, and
>>>>>> hence cannot be woken by a wake_up(ctx->wait). We have to wake the task
>>>>>> manually.
>>>>>>
>>>>>
>>>>> io_cqring_timer_wakeup() is the timer expired callback which calls
>>>>> wake_up_process() or io_cqring_wake() depending on DEFER_TASKRUN.
>>>>>
>>>>> The original code calling schedule_hrtimeout_range_clock() uses
>>>>> hrtimer_sleeper instead, which has a default timer expired callback set
>>>>> to hrtimer_wakeup().
>>>>>
>>>>> hrtimer_wakeup() only calls wake_up_process().
>>>>>
>>>>> My question is: why this asymmetry? Why does the new custom callback
>>>>> require io_cqring_wake()?
>>>>
>>>> That's what I'm saying, it doesn't need and doesn't really want it.
>>>> From the correctness point of view, it's ok since we wake up a
>>>> (unnecessarily) larger set of tasks.
>>>>
>>>
>>> Yeah your explanation that came in while I was writing the email
>>> answered it, thanks Pavel.
>>
>> Hah and now I see what you meant - yeah we can just remove the
>> distinction. I didn't see anything in testing, but I also didn't have
>> multiple tasks waiting on a ring, nor would you. So it doesn't really
>> matter, but I'll clean it up so there's no confusion.
> 
> Actually probably better to just leave it as-is, as we'd otherwise need
> to store a task in io_wait_queue. Which we of course could, and would
> remove a branch in there...

I guess I should actually look at the code first, we have it via
wq->private already. Hence:


diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ddfbe04c61ed..4ba5292137c3 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2353,13 +2353,9 @@ static bool current_pending_io(void)
 static enum hrtimer_restart io_cqring_timer_wakeup(struct hrtimer *timer)
 {
 	struct io_wait_queue *iowq = container_of(timer, struct io_wait_queue, t);
-	struct io_ring_ctx *ctx = iowq->ctx;
 
 	WRITE_ONCE(iowq->hit_timeout, 1);
-	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
-		wake_up_process(ctx->submitter_task);
-	else
-		io_cqring_wake(ctx);
+	wake_up_process(iowq->wq.private);
 	return HRTIMER_NORESTART;
 }
 

-- 
Jens Axboe





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux