On 3/4/21 11:56 AM, Linus Torvalds wrote: > On Thu, Mar 4, 2021 at 10:19 AM Jens Axboe <axboe@xxxxxxxxx> wrote: >> >> How about this - it moves the signal fiddling into the task >> itself, and leaves the parent alone. Also allows future cleanups >> of how we wait for thread creation. > > Ugh, I think this is wrong. > > You shouldn't usekernel_thread() at all, and you shouldn't need to set > the sigmask in the parent, only to have it copied to the child, and > then restore it in the parent. > > You shouldn't have to have that silly extra scheduling rendezvous with > the completion, which forces two schedules (first a schedule to the > child to do what it wants to do, and then "complete()" there to wake > up the parent that is waiting for the completion. > > The thing is, our internal thread creation functionality is already > written explicitly to not need any of this: the creation of a new > thread is a separate phase, and then you do some setup, and then you > actually tell the new thread "ok, go go go". > > See the kernel_clone() function kernel/fork.c for the structure of this all. > > You really should just do > > (a) copy_thread() to create a new child that is inactive and cannot yet run > > (b) do any setup in that new child (like setting the signal mask in > it, but also perhaps setting the PF_IO_WORKER flag etc) > > (c) actually say "go go go": wake_up_new_task(p); > > and you're done. No completions, no "set temporary mask in parent to > be copied", no nothing like that. > > And for the IO worker threads, you really don't want all the other > stuff that kernel_clone() does. You don't want the magic VFORK "wait > for the child to release the VM we gave it". You don't want the clone > ptrace setup, because you can't ptrace those IO workler threads > anyway. You might want a tracepoint, but you probably want a > _different_ tracepoint than the "normal clone" one. You don't want the > latent entropy addition, because honestly, the thing has no entropy to > add either. > > So I think you really want to just add a new "create_io_thread()" > inside kernel/fork.c, which is a very cut-down and specialized version > of kernel_clone(). > > It's actually going to be _less_ code than what you have now, and it's > going to avoid all the problems with anmy half-way state or "set > parent state to something that gets copied and then undo the parent > state after the copy". Took a quick look at this, and I agree that's _much_ better. In fact, it boils down to just calling copy_process() and then having the caller do wake_up_new_task(). So not sure if it's worth adding an create_io_thread() helper, or just make copy_process() available instead. This is ignoring the trace point for now... I'll try and spin this up, should be pretty trivial and indeed remove even more code and useless wait_for_completion+complete slowdowns... -- Jens Axboe