On Fri, Nov 12, 2021 at 11:34 PM Petr Mladek <pmladek@xxxxxxxx> wrote: > > On Mon 2021-11-08 08:41:42, Yafang Shao wrote: > > When I was implementing a new per-cpu kthread cfs_migration, I found the > > comm of it "cfs_migration/%u" is truncated due to the limitation of > > TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 are > > all with the same name "cfs_migration/1", which will confuse the user. This > > issue is not critical, because we can get the corresponding CPU from the > > task's Cpus_allowed. But for kthreads correspoinding to other hardware > > devices, it is not easy to get the detailed device info from task comm, > > for example, > > > > After this change, the full name of these truncated kthreads will be > > displayed via /proc/[pid]/comm: > > > > --- a/fs/proc/array.c > > +++ b/fs/proc/array.c > > @@ -102,6 +103,8 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape) > > > > if (p->flags & PF_WQ_WORKER) > > wq_worker_comm(tcomm, sizeof(tcomm), p); > > Just for record. I though that this patch obsoleted wq_worker_comm() > but it did not. wq_worker_comm() returns different values > depending on the last proceed work item and has to stay. > Right. worker comm is changed dynamically, which is combined by (task_comm+worker_desc) or (task_comm-worker_desc). I planned to remove the whole worker->desc and set it dynamically to the new kthread full_name but I found it may not be a good idea. > > + else if (p->flags & PF_KTHREAD) > > + get_kthread_comm(tcomm, sizeof(tcomm), p); > > else > > __get_task_comm(tcomm, sizeof(tcomm), p); > > > > --- a/kernel/kthread.c > > +++ b/kernel/kthread.c > > @@ -121,6 +135,7 @@ void free_kthread_struct(struct task_struct *k) > > Hmm, there is the following comment: > > /* > * Can be NULL if this kthread was created by kernel_thread() > * or if kmalloc() in kthread() failed. > */ > kthread = to_kthread(k); > > And indeed, set_kthread_struct() is called only by kthread() > and init_idle(). > > For example, call_usermodehelper_exec_sync() calls kernel_thread() > but given @fn does not call set_kthread_struct(). Also init_idle() > continues even when the allocation failed. > Yes, it really can be NULL. > > > #ifdef CONFIG_BLK_CGROUP > > WARN_ON_ONCE(kthread && kthread->blkcg_css); > > #endif > > + kfree(kthread->full_name); > > Hence, we have to make sure that it is not NULL here. I suggest > something like: > Agreed. I will do it. > void free_kthread_struct(struct task_struct *k) > { > struct kthread *kthread; > > /* > * Can be NULL if this kthread was created by kernel_thread() > * or if kmalloc() in kthread() failed. > */ > kthread = to_kthread(k); > if (!kthread) > return; > > #ifdef CONFIG_BLK_CGROUP > WARN_ON_ONCE(kthread->blkcg_css); > #endif > kfree(kthread->full_name); > kfree(kthread); > } > > > Side note: The possible NULL pointer looks dangerous to > me. to_kthread() is dereferenced without any check on > several locations. > > For example, kthread_create_on_cpu() looks safe. It is a kthread > crated by kthread(). It will exists only when the allocation > succeeded. > > kthread_stop() is probably safe only because it used only for > the classic kthreads created by kthread(). But the API > is not safe. > > kthread_use_mm() is probably used only by classic kthreads as > well. But it is less clear to me. > > All this unsafe APIs looks like a ticking bomb to me. But > it is beyond this patchset. > I will analyze it in depth and try to dismantle this ticking bomb. -- Thanks Yafang