On 10/5/21 7:50 AM, Christian Brauner wrote:
> On Mon, Oct 04, 2021 at 02:21:24PM -0500, Mike Christie wrote:
>> The vhost layer has similar requirements as io_uring where its worker
>> threads need to access the userspace thread's memory, want to inherit the
>> parents's cgroups and namespaces, and be checked against the parent's
>> RLIMITs. Right now, the vhost layer uses the kthread API which has
>> kthread_use_mm for mem access, and those threads can use
>> cgroup_attach_task_all for v1 cgroups, but there are no helpers for the
>> other items.
>>
>> This adds a helper to clone a process so we can inherit everything we
>> want in one call. It's a more generic version of create_io_thread which
>> will be used by the vhost layer and io_uring in later patches in this set.
>>
>> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
>> Acked-by: Christian Brauner <christian.brauner@xxxxxxxxxx>
>> ---
>>  include/linux/sched/task.h |  6 ++++-
>>  kernel/fork.c              | 48 ++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 53 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
>> index e165cc67fd3c..ba0499b6627c 100644
>> --- a/include/linux/sched/task.h
>> +++ b/include/linux/sched/task.h
>> @@ -87,7 +87,11 @@ extern void exit_files(struct task_struct *);
>>  extern void exit_itimers(struct signal_struct *);
>>
>>  extern pid_t kernel_clone(struct kernel_clone_args *kargs);
>> -struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node);
>> +struct task_struct *create_io_thread(int (*fn)(void *i), void *arg, int node);
>> +struct task_struct *kernel_worker(int (*fn)(void *), void *arg, int node,
>> +				  unsigned long clone_flags, u32 worker_flags);
>> +__printf(2, 3)
>> +void kernel_worker_start(struct task_struct *tsk, const char namefmt[], ...);
>>  struct task_struct *fork_idle(int);
>>  struct mm_struct *copy_init_mm(void);
>>  extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index 98264cf1d6a6..3f3fcabffa5f 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -2540,6 +2540,54 @@ struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node)
>>  	return copy_process(NULL, 0, node, &args);
>>  }
>>
>> +/**
>> + * kernel_worker - create a copy of a process to be used by the kernel
>> + * @fn: thread stack
>> + * @arg: data to be passed to fn
>> + * @node: numa node to allocate task from
>> + * @clone_flags: CLONE flags
>> + * @worker_flags: KERN_WORKER flags
>> + *
>> + * This returns a created task, or an error pointer. The returned task is
>> + * inactive, and the caller must fire it up through kernel_worker_start(). If
>> + * this is an PF_IO_WORKER all singals but KILL and STOP are blocked.
>> + */
>> +struct task_struct *kernel_worker(int (*fn)(void *), void *arg, int node,
>> +				  unsigned long clone_flags, u32 worker_flags)
>> +{
>> +	struct kernel_clone_args args = {
>> +		.flags		= ((lower_32_bits(clone_flags) | CLONE_VM |
>> +				   CLONE_UNTRACED) & ~CSIGNAL),
>> +		.exit_signal	= (lower_32_bits(clone_flags) & CSIGNAL),
>> +		.stack		= (unsigned long)fn,
>> +		.stack_size	= (unsigned long)arg,
>> +		.worker_flags	= KERN_WORKER_USER | worker_flags,
>> +	};
>> +
>> +	return copy_process(NULL, 0, node, &args);
>> +}
>> +EXPORT_SYMBOL_GPL(kernel_worker);
>> +
>> +/**
>> + * kernel_worker_start - Start a task created with kernel_worker
>> + * @tsk: task to wake up
>> + * @namefmt: printf-style format string for the thread name
>> + * @arg: arguments for @namefmt
>> + */
>> +void kernel_worker_start(struct task_struct *tsk, const char namefmt[], ...)
>> +{
>> +	char name[TASK_COMM_LEN];
>> +	va_list args;
>
> You could think about reporting an error from this function if
> KERN_WORK_USER isn't set or only call the below when KERN_WORK_USER is
> set. Both options are fine.
>

I'm not sure how to handle this comment. KERN_WORKER_USER is only set on
the kernel_clone_args, and that struct is no longer available by the time
kernel_worker_start() runs.

I didn't add a new PF_KTHREAD_WORK_USER flag to sched.h because I thought
I had gotten a review comment asking not to add another PF flag for this.
I can't find that comment now, so I may have misread an older comment or
made it up.

If it's ok, I could add PF_KTHREAD_WORK_USER and then do a:

	WARN_ON(!(tsk->flags & PF_KTHREAD_WORK_USER));

so future developers get loud feedback right away that they are doing the
wrong thing.
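For illustration only, here is a rough userspace sketch of what that check could look like. Nothing in it is kernel code: struct mock_task, mock_warn_on() and mock_kernel_worker_start() are hypothetical stand-ins, the PF_KTHREAD_WORK_USER value is an arbitrary placeholder, and returning an int (rather than void, as in the patch) is an assumption made so that the "report an error" option is visible too.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical bit value; a real PF_* bit would be picked in sched.h. */
#define PF_KTHREAD_WORK_USER 0x01000000u
#define TASK_COMM_LEN 16

/* Userspace stand-in for the few task_struct fields this sketch needs. */
struct mock_task {
	unsigned int flags;
	char comm[TASK_COMM_LEN];
	int started;
};

/* Stand-in for WARN_ON(): reports the condition and returns it. */
static int mock_warn_on(int cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/*
 * Mock of kernel_worker_start(): warn and bail out if the task was not
 * created through kernel_worker(). Returns 0 on success, -1 otherwise.
 */
static int mock_kernel_worker_start(struct mock_task *tsk,
				    const char *namefmt, ...)
{
	va_list args;

	assert(tsk != NULL);
	if (mock_warn_on(!(tsk->flags & PF_KTHREAD_WORK_USER),
			 "task was not created by kernel_worker()"))
		return -1;

	/* Format the thread name, truncated to TASK_COMM_LEN - 1 chars. */
	va_start(args, namefmt);
	vsnprintf(tsk->comm, sizeof(tsk->comm), namefmt, args);
	va_end(args);

	tsk->started = 1;	/* stands in for waking the new task */
	return 0;
}
```

A task whose flags include the placeholder bit gets named and marked started; any other task triggers the warning and is left untouched, which is the "loud feedback" behaviour described above.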