Re: [PATCH 09/18] io-wq: fork worker threads from original task

Jens Axboe <axboe@xxxxxxxxx> · Thu, 4 Mar 2021 12:19:00 -0700

On 3/4/21 11:56 AM, Linus Torvalds wrote:
> On Thu, Mar 4, 2021 at 10:19 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> How about this - it moves the signal fiddling into the task
>> itself, and leaves the parent alone. Also allows future cleanups
>> of how we wait for thread creation.
> 
> Ugh, I think this is wrong.
> 
> You shouldn't usekernel_thread() at all, and you shouldn't need to set
> the sigmask in the parent, only to have it copied to the child, and
> then restore it in the parent.
> 
> You shouldn't have to have that silly extra scheduling rendezvous with
> the completion, which forces two schedules (first a schedule to the
> child to do what it wants to do, and then "complete()" there to wake
> up the parent that is waiting for the completion.
> 
> The thing is, our internal thread creation functionality is already
> written explicitly to not need any of this: the creation of a new
> thread is a separate phase, and then you do some setup, and then you
> actually tell the new thread "ok, go go go".
> 
> See the kernel_clone() function kernel/fork.c for the structure of this all.
> 
> You really should just do
> 
>  (a) copy_thread() to create a new child that is inactive and cannot yet run
> 
>  (b) do any setup in that new child (like setting the signal mask in
> it, but also perhaps setting the PF_IO_WORKER flag etc)
> 
>  (c) actually say "go go go": wake_up_new_task(p);
> 
> and you're done. No completions, no "set temporary mask in parent to
> be copied", no nothing like that.
> 
> And for the IO worker threads, you really don't want all the other
> stuff that kernel_clone() does. You don't want the magic VFORK "wait
> for the child to release the VM we gave it". You don't want the clone
> ptrace setup, because you can't ptrace those IO workler threads
> anyway. You might want a tracepoint, but you probably want a
> _different_ tracepoint than the "normal clone" one. You don't want the
> latent entropy addition, because honestly, the thing has no entropy to
> add either.
> 
> So I think you really want to just add a new "create_io_thread()"
> inside kernel/fork.c, which is a very cut-down and specialized version
> of kernel_clone().
> 
> It's actually going to be _less_ code than what you have now, and it's
> going to avoid all the problems with anmy half-way state or "set
> parent state to something that gets copied and then undo the parent
> state after the copy".

Took a quick look at this, and I agree that's _much_ better. In fact, it
boils down to just calling copy_process() and then having the caller do
wake_up_new_task(). So not sure if it's worth adding an
create_io_thread() helper, or just make copy_process() available
instead. This is ignoring the trace point for now...

I'll try and spin this up, should be pretty trivial and indeed remove
even more code and useless wait_for_completion+complete slowdowns...

-- 
Jens Axboe