Hi Paolo, Michal

Paolo:
Will you accept a patch which uses real_parent in kvm_vm_worker_thread()
as suggested by Sean, while I figure out the recommendation from Michal
about making kthread_stop() wait on kernel_wait()?

        cgroup_attach_task_all(current->real_parent, current)

Michal:

On Thu, Jan 20, 2022 at 7:05 AM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> On Wed, Jan 19, 2022 at 08:30:43AM -1000, Tejun Heo <tj@xxxxxxxxxx> wrote:
> > It'd be nicer if we can make kthread_stop() waiting more regular but I
> > couldn't find a good existing place and routing the usual parent
> > signaling might be too complicated. Anyone has better ideas?
>
> The regular way is pictured in Paolo's diagram already, the
> exit_notify/do_signal_parent -> wait4 path.
>
> Actually, I can see that there exists already kernel_wait() and is used
> by a UMH wrapper kthread. kthreadd issues ignore_signals() so (besides
> no well defined point of signalling a kthread) the signal notification
> is moot and only waking up the waiter is relevant. So kthread_stop()
> could wait via kernel_wait() based on pid (extracted from task_struct).
>
> Have I missed an obstacle?
>

I must admit I do not have a good understanding of the kernel_wait() and
kthread_stop() APIs. I tried making some changes in kthread_stop() but
I was not able to use the API successfully.

I tested it by writing a test module: during init it starts a kthread
which prints a message every few seconds, and during module exit it
calls kthread_stop(). This module worked as intended without the
kernel_wait() changes in kthread_stop().

My change basically replaces the wait_for_completion() call with a
kernel_wait() call:
@@ -645,8 +645,9 @@ int kthread_stop(struct task_struct *k)
        set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
        kthread_unpark(k);
        wake_up_process(k);
-       wait_for_completion(&kthread->exited);
-       ret = k->exit_code;
+       kernel_wait(k->pid, &ret);
+//     kernel_wait(task_pid_vnr(k), &ret);
+//     wait_for_completion(&kthread->exited);
+//     ret = k->exit_code;
        put_task_struct(k);

I also tried a few other combinations where I put the kernel_wait() call
after the put_task_struct(k) call. Every time, the kernel crashed during
module exit like this:

[ 285.014612] RIP: 0010:0xffffffffc04ed074
[ 285.018537] RSP: 0018:ffff9ccdc8365ee8 EFLAGS: 00010246
[ 285.023761] RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000000001
[ 285.030896] RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffff9cce3f7d9cc0
[ 285.038028] RBP: ffff9ccdc8365ef8 R08: 0000000000000000 R09: 0000000000015504
[ 285.045160] R10: 000000000000004b R11: ffffffff8dd92880 R12: 0000000000000012
[ 285.052293] R13: ffff9ccdc813db90 R14: ffff9ccdc7e66240 R15: ffffffffc04ed000
[ 285.059425] FS: 0000000000000000(0000) GS:ffff9cce3f7c0000(0000) knlGS:0000000000000000
[ 285.067510] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 285.073258] CR2: ffffffffc04ed074 CR3: 000000c07f20e002 CR4: 0000000000362ef0
[ 285.080390] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 285.087522] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 285.094656] Call Trace:
[ 285.097112] kthread+0x148/0x1b0
[ 285.100343] ? kthread_blkcg+0x30/0x30
[ 285.104096] ret_from_fork+0x3a/0x60
[ 285.107671] Code: Bad RIP value.
[ 285.107671] IP: 0xffffffffc04ecff4:

The crash is not observed if I keep wait_for_completion(&kthread->exited)
along with kernel_wait(), but I guess kernel_wait() should be sufficient
by itself once I figure out the proper way to use it.

Do you have any suggestions on what the right way to use this API might
be?

Thanks
Vipin
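
One thing that may be worth checking for the question above: wait-family
calls normally reap only the caller's own children, and a kthread is
forked by kthreadd rather than by the task that calls kthread_stop().
The userspace sketch below only illustrates that rule with waitpid().
Whether kernel_wait() fails the same way in the experiment above is an
assumption and has not been verified here. wait_for_pid() is a
hypothetical helper written for this illustration, not a kernel or libc
API.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (userspace analogue only, not kernel code):
 * block until `pid` terminates.
 *
 * Returns the exit status (0..255) if the process exited normally,
 * 128 + signal number if it was killed by a signal, or a negative
 * errno value if waitpid() itself failed. In particular it returns
 * -ECHILD when `pid` is not a child of the calling process, and in
 * that case it returns immediately instead of blocking.
 */
static int wait_for_pid(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, 0) < 0)
		return -errno;

	if (WIFEXITED(status))
		return WEXITSTATUS(status);

	return 128 + WTERMSIG(status);
}
```

Called on a forked child that does _exit(7), the helper returns 7. Called
on a pid that is not the caller's child (for example the caller's own
pid), it returns -ECHILD without waiting. If kernel_wait() returned early
in the same way, kthread_stop() would not actually wait for the thread
to finish before the module text is freed, which might fit the
"Bad RIP value" inside the module address range.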