Re: [PATCH 2/6] io_uring: replace defer task_work llist with io_wq_work_list

On 11/22/24 10:25 AM, Pavel Begunkov wrote:
> On 11/22/24 17:11, Jens Axboe wrote:
>> On 11/22/24 10:07 AM, Pavel Begunkov wrote:
>>> On 11/22/24 16:12, Jens Axboe wrote:
>>> ...
>>>>    static inline void io_req_local_work_add(struct io_kiocb *req,
>>>>                         struct io_ring_ctx *ctx,
>>>> -                     unsigned flags)
>>>> +                     unsigned tw_flags)
>>>>    {
>>>> -    unsigned nr_wait, nr_tw, nr_tw_prev;
>>>> -    struct llist_node *head;
>>>> +    unsigned nr_tw, nr_tw_prev, nr_wait;
>>>> +    unsigned long flags;
>>>>          /* See comment above IO_CQ_WAKE_INIT */
>>>>        BUILD_BUG_ON(IO_CQ_WAKE_FORCE <= IORING_MAX_CQ_ENTRIES);
>>>>          /*
>>>> -     * We don't know how many reuqests is there in the link and whether
>>>> -     * they can even be queued lazily, fall back to non-lazy.
>>>> +     * We don't know how many requests are in the link and whether they can
>>>> +     * even be queued lazily, fall back to non-lazy.
>>>>         */
>>>>        if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK))
>>>> -        flags &= ~IOU_F_TWQ_LAZY_WAKE;
>>>> +        tw_flags &= ~IOU_F_TWQ_LAZY_WAKE;
>>>>    -    guard(rcu)();
>>>
>>> protects against ctx->task deallocation, see a comment in
>>> io_ring_exit_work() -> synchronize_rcu()
>>
>> Yeah that's just an editing mistake.
>>
>>>> +    spin_lock_irqsave(&ctx->work_lock, flags);
>>>> +    wq_list_add_tail(&req->io_task_work.work_node, &ctx->work_list);
>>>> +    nr_tw_prev = ctx->work_items++;
>>>
>>> Is there a good reason why it changes the semantics of
>>> what's stored across adds? It was assigning a corrected
>>> nr_tw, this one will start heavily spamming with wake_up()
>>> in some cases.
>>
>> Not sure I follow, how so? nr_tw_prev will be the previous count, just
>> like before, except we won't need to dig into the list to find it; we
>> have it readily available. nr_tw will be the current count, or force
>> wake if needed. As before.
> 
> The problem is what it stores, not how and where. Before, req->nr_tw
> could've been set to IO_CQ_WAKE_FORCE, in which case following
> requests are not going to attempt waking up the task; now work_items
> is just a counter.
> 
> Let's say you've got a bunch of non-lazy adds coming close to each
> other. The first sets IO_CQ_WAKE_FORCE and wakes the task, and
> others just queue themselves in the list. Now, every single one
> of them will try to wake_up() as long as ->cq_wait_nr is large
> enough.

If we care that much about the non-lazy path, we can just use the same
storing scheme we had in req->nr_tw, except in ->work_items instead. Not
a big deal imho.

-- 
Jens Axboe
