On Fri, Aug 11, 2023 at 01:51:36PM -0500, Mike Christie wrote:
> On 8/10/23 1:57 PM, Michael S. Tsirkin wrote:
> > On Sat, Jul 22, 2023 at 11:03:29PM -0500, michael.christie@xxxxxxxxxx wrote:
> >> On 7/20/23 8:06 AM, Michael S. Tsirkin wrote:
> >>> On Thu, Feb 02, 2023 at 05:25:17PM -0600, Mike Christie wrote:
> >>>> For vhost workers we use the kthread API, which inherits its values from
> >>>> and checks against the kthreadd thread. This results in the wrong RLIMITs
> >>>> being checked, so while tools like libvirt try to control the number of
> >>>> threads based on the nproc rlimit setting, we can end up creating more
> >>>> threads than the user wanted.
> >>>>
> >>>> This patch has us use the vhost_task helpers, which inherit their
> >>>> values/checks from the thread that owns the device, similar to if we did
> >>>> a clone in userspace. The vhost threads will now be counted in the nproc
> >>>> rlimits. And we get features like cgroups and mm sharing automatically,
> >>>> so we can remove those calls.
> >>>>
> >>>> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
> >>>> Acked-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> >>>
> >>> Hi Mike,
> >>> So this seems to have caused a measurable regression in networking
> >>> performance (about 30%). Take a look here; there's a zip file
> >>> with detailed measurements attached:
> >>>
> >>> https://bugzilla.redhat.com/show_bug.cgi?id=2222603
> >>>
> >>> Could you take a look please?
> >>> You can also ask the reporter questions there, assuming you
> >>> have or can create a (free) account.
> >>
> >> Sorry for the late reply. I just got home from vacation.
> >>
> >> The account creation link seems to be down. I keep getting an
> >> "unable to establish SMTP connection to bz-exim-prod port 25" error.
> >>
> >> Can you give me Quan's email?
> >>
> >> I think I can replicate the problem. I just need some extra info from Quan:
> >>
> >> 1. Just double check that they are using RHEL 9 on the host running the VMs.
> >>
> >> 2. The kernel config.
> >>
> >> 3. Any tuning that was done. Is tuned running in the guest and/or on the
> >> host running the VMs, and what profile is being used in each?
> >>
> >> 4. Number of vCPUs and virtqueues being used.
> >>
> >> 5. Can they dump the contents of:
> >>
> >> /sys/kernel/debug/sched
> >>
> >> and
> >>
> >> sysctl -a
> >>
> >> on the host running the VMs.
> >>
> >> 6. With the 6.4 kernel, can they also run a quick test and tell me what
> >> happens if they set the scheduler to batch:
> >>
> >> ps -T -o comm,pid,tid $QEMU_THREAD
> >>
> >> then for each vhost thread do:
> >>
> >> chrt -b -p 0 $VHOST_THREAD
> >>
> >> Does that end up increasing perf? When I do this I see throughput go up by
> >> around 50% vs 6.3 when the number of sessions was 16 or more (16 was the
> >> number of vCPUs and virtqueues per net device in the VM). Note that I'm not
> >> saying that is a fix. It's just a difference I noticed when running some
> >> other tests.
> >
> > Mike, I'm unsure what to do at this point. Regressions are not nice,
> > but if the kernel is released with the new userspace API we won't
> > be able to revert. So what's the plan?
>
> I'm sort of stumped. I still can't replicate the problem out of the box. 6.3
> and 6.4 perform the same for me. I've tried your setup and settings, and with
> different combos of using things like tuned and irqbalance.
>
> I can sort of force the issue. In 6.4, the vhost thread inherits its settings
> from the parent thread. In 6.3, the vhost thread inherited from kthreadd and
> we would then reset the sched settings.
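(For reference, the SCHED_BATCH test from step 6 of the quoted mail could be
scripted roughly as below. This is only a sketch: $QEMU_PID and the "vhost"
thread-name prefix are assumptions on my part, not details from the thread.)

  # On 6.4 the vhost workers show up as threads of the QEMU process, so
  # walk its thread list and move anything named vhost* to SCHED_BATCH.
  for tid in $(ps -T -o tid,comm --no-headers -p "$QEMU_PID" | awk '$2 ~ /^vhost/ {print $1}'); do
          chrt -b -p 0 "$tid"    # batch policy, priority 0
  done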
> So in 6.4, if I just tune the parent differently, I can cause different
> performance. If we want the 6.3 behavior we can do the patch below.
>
> However, I don't think you guys are hitting this, because you are just running
> qemu from the normal shell and are not doing anything fancy with the sched
> settings.
>
> diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
> index da35e5b7f047..f2c2638d1106 100644
> --- a/kernel/vhost_task.c
> +++ b/kernel/vhost_task.c
> @@ -2,6 +2,7 @@
>  /*
>   * Copyright (C) 2021 Oracle Corporation
>   */
> +#include <uapi/linux/sched/types.h>
>  #include <linux/slab.h>
>  #include <linux/completion.h>
>  #include <linux/sched/task.h>
> @@ -22,9 +23,16 @@ struct vhost_task {
>
>  static int vhost_task_fn(void *data)
>  {
> +	static const struct sched_param param = { .sched_priority = 0 };
>  	struct vhost_task *vtsk = data;
>  	bool dead = false;
>
> +	/*
> +	 * Don't inherit the parent's sched info, so we maintain compat from
> +	 * when we used kthreads and it reset this info.
> +	 */
> +	sched_setscheduler_nocheck(current, SCHED_NORMAL, &param);
> +
>  	for (;;) {
>  		bool did_work;

Yes, that seems unlikely. Still, can you attach this to the bugzilla so it can
be tested? And what will help you debug? Any traces to enable?

Also, wasn't there another issue with a non-standard config? Maybe if we fix
that, it will by chance fix this one too?
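(Again only a sketch, to go with the questions above: one quick way to see
which scheduling attributes a vhost worker actually ended up with on a given
kernel. $VHOST_TID is assumed to be a vhost thread id taken from the ps -T
output in step 6 of Mike's mail.)

  chrt -p "$VHOST_TID"                               # prints policy and rt priority
  grep -E '^(policy|prio)' /proc/"$VHOST_TID"/sched  # same info straight from the scheduler

With the patch above applied, this should report SCHED_OTHER and priority 0 no
matter how the QEMU parent was tuned; on an unpatched 6.4 kernel it reflects
whatever the parent thread was using.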