"Michael S. Tsirkin" <mst@xxxxxxxxxx> writes: > On Sun, May 21, 2023 at 09:51:24PM -0500, Mike Christie wrote: >> When switching from kthreads to vhost_tasks two bugs were added: >> 1. The vhost worker tasks's now show up as processes so scripts doing ps >> or ps a would not incorrectly detect the vhost task as another process. >> 2. kthreads disabled freeze by setting PF_NOFREEZE, but vhost tasks's >> didn't disable or add support for them. >> >> To fix both bugs, this switches the vhost task to be thread in the >> process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call >> get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that >> SIGKILL/STOP support is required because CLONE_THREAD requires >> CLONE_SIGHAND which requires those 2 signals to be suppported. >> >> This a modified version of patch originally written by Linus which >> handles his review comment to himself to rename ignore_signals to >> block_signals to better represent what it now does. And it includes a >> change to vhost_worker() to support SIGSTOP/KILL and freeze, and it >> drops the wait use per Oleg's review comment that it's no longer needed >> when using CLONE_THREAD. 
>>
>> Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
>> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
>> ---
>>  drivers/vhost/vhost.c      | 17 ++++++++++++++++-
>>  include/linux/sched/task.h |  2 +-
>>  kernel/fork.c              | 12 +++---------
>>  kernel/signal.c            |  1 +
>>  kernel/vhost_task.c        | 16 ++++------------
>>  5 files changed, 25 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index a92af08e7864..bf83e9340e40 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -338,6 +338,7 @@ static int vhost_worker(void *data)
>>  	struct vhost_worker *worker = data;
>>  	struct vhost_work *work, *work_next;
>>  	struct llist_node *node;
>> +	bool dead = false;
>>
>>  	for (;;) {
>>  		/* mb paired w/ kthread_stop */
>> @@ -349,8 +350,22 @@ static int vhost_worker(void *data)
>>  		}
>>
>>  		node = llist_del_all(&worker->work_list);
>> -		if (!node)
>> +		if (!node) {
>>  			schedule();
>> +			/*
>> +			 * When we get a SIGKILL our release function will
>> +			 * be called. That will stop new IOs from being queued
>> +			 * and check for outstanding cmd responses. It will then
>> +			 * call vhost_task_stop to tell us to return and exit.
>> +			 */
>> +			if (!dead && signal_pending(current)) {
>> +				struct ksignal ksig;
>> +
>> +				dead = get_signal(&ksig);
>> +				if (dead)
>> +					clear_thread_flag(TIF_SIGPENDING);
>
> Does get_signal actually return true only on SIGKILL then?

get_signal returns the signal that was dequeued, or 0 if no signal was
dequeued. For these threads, which block all signals except SIGSTOP and
SIGKILL, get_signal should properly never return any signal, as SIGSTOP
and SIGKILL are handled internally to get_signal. However, get_signal
was modified to have a special case for io_uring, and now the vhost
code, so that extra cleanup can be performed before do_exit is called
on the thread. That special case causes get_signal to return with the
value of SIGKILL when the process exits.
The process can exit with do_group_exit, aka exit(2), or any fatal
signal, not just SIGKILL, and get_signal on these threads will return.

Eric
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization