[CCing the regression list] On 06.05.23 00:37, Mike Christie wrote: > On 5/5/23 1:22 PM, Linus Torvalds wrote: >> On Fri, May 5, 2023 at 6:40 AM Nicolas Dichtel >> <nicolas.dichtel@xxxxxxxxx> wrote: >>> >>> Is this an intended behavior? >>> This breaks some of our scripts. Jumping in here, as I found another problem with that patch: it broke s2idle on my laptop when a qemu-kvm VM is running, as freezing user space processes now fails for me: ``` [ 195.442949] PM: suspend entry (s2idle) [ 195.641271] Filesystems sync: 0.198 seconds [ 195.833828] Freezing user space processes [ 215.841084] Freezing user space processes failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0): [ 215.841255] task:vhost-3221 state:R stack:0 pid:3250 ppid:3221 flags:0x00004006 [ 215.841264] Call Trace: [ 215.841266] <TASK> [ 215.841270] ? update_rq_clock+0x39/0x270 [ 215.841283] ? _raw_spin_unlock+0x19/0x40 [ 215.841290] ? __schedule+0x3f/0x1510 [ 215.841296] ? sysvec_apic_timer_interrupt+0xaf/0xd0 [ 215.841306] ? schedule+0x61/0xe0 [ 215.841313] ? vhost_worker+0x87/0xb0 [vhost] [ 215.841329] ? vhost_task_fn+0x1a/0x30 [ 215.841336] ? __pfx_vhost_task_fn+0x10/0x10 [ 215.841341] ? ret_from_fork+0x2c/0x50 [ 215.841352] </TASK> [ 215.841936] OOM killer enabled. [ 215.841938] Restarting tasks ... done. [ 215.844204] random: crng reseeded on system resumption [ 215.957095] PM: suspend exit [ 215.957185] PM: suspend entry (s2idle) [ 215.967646] Filesystems sync: 0.010 seconds [ 215.971326] Freezing user space processes [ 235.974400] Freezing user space processes failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0): [ 235.974574] task:vhost-3221 state:R stack:0 pid:3250 ppid:3221 flags:0x00004806 [ 235.974583] Call Trace: [ 235.974586] <TASK> [ 235.974593] ? __schedule+0x184/0x1510 [ 235.974605] ? sysvec_apic_timer_interrupt+0xaf/0xd0 [ 235.974616] ? schedule+0x61/0xe0 [ 235.974624] ? vhost_worker+0x87/0xb0 [vhost] [ 235.974648] ? vhost_task_fn+0x1a/0x30 [ 235.974656] ? __pfx_vhost_task_fn+0x10/0x10 [ 235.974662] ? ret_from_fork+0x2c/0x50 [ 235.974673] </TASK> [ 235.975190] OOM killer enabled. [ 235.975192] Restarting tasks ... done. [ 235.978131] random: crng reseeded on system resumption [ 236.091219] PM: suspend exit ``` After running into the problem I booted 6.3.1-rc1 again and there s2idle still worked. Didn't do a bisection, just looked at the vhost commits during the latest merge window; 6e890c5d502 ("vhost: use vhost_tasks for worker threads") looked suspicious, so I reverted it on top of latest mainline and then things work again. Through a search on lore I arrived in this thread and found below patch from Mike. Gave it a try on top of latest mainline, but it didn't help. Ciao, Thorsten > [...] > If it's ok to change the behavior of "ps -u root", then we can do this patch: > (Nicolas, I confirmed it fixes the 'ps a' case, but couldn't replicate the 'ps' > case. If you could test the ps only case or give me info on what /usr/bin/example > was doing I can replicate and test here): > > > diff --git a/kernel/fork.c b/kernel/fork.c > index ed4e01daccaa..eb9ffc58e211 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -2269,8 +2269,14 @@ __latent_entropy struct task_struct *copy_process( > /* > * Thread groups must share signals as well, and detached threads > * can only be started up within the thread group. > + * > + * A userworker's parent thread will normally have a signal handler > + * that performs management operations, but the worker will not > + * because the parent will handle the signal then user a worker > + * specific interface to manage the thread and related resources. > */ > - if ((clone_flags & CLONE_THREAD) && !(clone_flags & CLONE_SIGHAND)) > + if ((clone_flags & CLONE_THREAD) && !(clone_flags & CLONE_SIGHAND) && > + !args->user_worker && !args->ignore_signals) > return ERR_PTR(-EINVAL); > > /* > diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c > index b7cbd66f889e..3700c21ea39d 100644 > --- a/kernel/vhost_task.c > +++ b/kernel/vhost_task.c > @@ -75,7 +78,8 @@ struct vhost_task *vhost_task_create(int (*fn)(void *), void *arg, > const char *name) > { > struct kernel_clone_args args = { > - .flags = CLONE_FS | CLONE_UNTRACED | CLONE_VM, > + .flags = CLONE_FS | CLONE_THREAD | CLONE_VM | > + CLONE_UNTRACED, > .exit_signal = 0, > .fn = vhost_task_fn, > .name = name _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization