Re: [GIT PULL] io_uring thread worker change

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/26/21 7:48 PM, Linus Torvalds wrote:
> Reading through this once..
> 
> I'll read through it another time tomorrow before merging it, but on
> the whole it looks good. The concept certainly is the right now, but I
> have a few questions about some of the details..

Thanks, I do think it's the right path. As mentioned there's a few minor
rough edges and cases that we've fixed since, and those are soaking as
we speak to verify that they are good. But nothing really major.

> On Fri, Feb 26, 2021 at 2:04 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>>       io-wq: fix races around manager/worker creation and task exit
> 
> Where did that
> 
> +        __set_current_state(TASK_RUNNING);
> 
> in the patch come from? That looks odd.
> 
> Is it because io_wq_manager() does a
> 
>         set_current_state(TASK_INTERRUPTIBLE);
> 
> at one point? Regardless, that code looks very very strange.

Right, it's for both out of necessity for worker creation (since it
blocks), but also so that we'll run the loop again when we return.
Basically:

- Set non-runnable state
- Check condition
- schedule

and the schedule will be a no-op, so we'll do that loop again and
finally then schedule if nothing woke us up. If we don't end up needing
to fork a worker, then the stateremains TASK_INTERRUPTIBLE and we go to
sleep as originally planned.

>>       io_uring: remove the need for relying on an io-wq fallback worker
> 
> This is coded really oddly.
> 
> + do {
> +     head = NULL;
> +     work = READ_ONCE(ctx->exit_task_work);
> + } while (cmpxchg(&ctx->exit_task_work, work, head) != work);
> 
> Whaa?
> 
> If you want to write
> 
>     work = xchg(&ctx->exit_task_work, NULL);
> 
> then just do that. Instead of an insane cmpxchg loop that isn't even
> well-written.
> 
> I say that it isn't even well-written, because when you really do want
> a cmpxchg loop, then realize that cmpxchg() returns the old value, so
> the proper way to write it is
> 
>     new = whatever;
>     expect = READ_ONCE(ctx->exit_task_work);
>     for (;;) {
>           new->next = expect;  // Mumble mumble - this is why you want
> the cmpxchg
>           was = cmpxchg(&ctx->exit_task_work, expect, new);
>           if (was == expect)
>                   break;
>           expect = was;
>     }
> 
> IOW, that READ_ONCE() of the old value should happen only once - and
> isn't worth a READ_ONCE() in the first place. There's nothing "read
> once" about it.
> 
> But as mentioned, if all you want is an atomic "get and replace with
> NULL", then just a plain "xchg()" is what you should do.
> 
> Yes, that cmpxchg loop _works_, but it is really confusing to read
> that way, when clearly you don't actually need or really want a
> cmpxchg.
> 
> (The other cmpxchg loop in the same patch is needed, but does that
> unnecessary "re-read the value that cmpxchg just returned").
> 
> Please explain.

You are totally right, and Eric and Pavel actually commented on that
too. I decided to just leave it for a cleanup patch, because it should
just be an xchg(). I didn't deem it important enough to go back and fix
the original patch, since it isn't broken, it's just not as well written
or as optimal as it could be. It'll get changed to xchg().

-- 
Jens Axboe




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux