On Wed, Dec 13, 2023 at 01:45:35PM +0100, Tobias Huschle wrote:
> On Wed, Dec 13, 2023 at 07:00:53AM -0500, Michael S. Tsirkin wrote:
> > On Wed, Dec 13, 2023 at 11:37:23AM +0100, Tobias Huschle wrote:
> > > On Tue, Dec 12, 2023 at 11:15:01AM -0500, Michael S. Tsirkin wrote:
> > > > On Tue, Dec 12, 2023 at 11:00:12AM +0800, Jason Wang wrote:
> > > > > On Tue, Dec 12, 2023 at 12:54 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> [...]
> > > > Apparently schedule is already called?
>
> What about this:
>
> static int vhost_task_fn(void *data)
> {
>     <...>
>     did_work = vtsk->fn(vtsk->data); --> this calls vhost_worker if I'm not mistaken
>     if (!did_work)
>         schedule();
>     <...>
> }
>
> static bool vhost_worker(void *data)
> {
>     struct vhost_worker *worker = data;
>     struct vhost_work *work, *work_next;
>     struct llist_node *node;
>
>     node = llist_del_all(&worker->work_list);
>     if (node) {
>         <...>
>         llist_for_each_entry_safe(work, work_next, node, node) {
>             <...>
>         }
>     }
>
>     return !!node;
> }
>
> The llist_for_each_entry_safe does not actually change the node value, does it?
>
> If it does not change it, !!node would return 1,
> thereby skipping the schedule.
>
> This was changed recently with:
> f9010dbdce91 fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
>
> It returned a hardcoded 0 before. The commit message explicitly mentions this
> change to make vhost_worker return 1 if it did something.
>
> Seems indeed like a nasty little side effect caused by EEVDF not scheduling
> the woken-up kworker right away.

So we are actually making an effort to be nice.
Documentation/kernel-hacking/hacking.rst says:

    If you're doing longer computations: first think userspace. If you
    **really** want to do it in kernel you should regularly check if you need
    to give up the CPU (remember there is cooperative multitasking per CPU).
    Idiom::

        cond_resched(); /* Will sleep */

and this is what vhost.c does.

At this point I'm not sure why it's appropriate to call schedule()
as opposed to cond_resched(). Ideas?

--
MST
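
For reference, a minimal sketch of the two idioms in question. This is not the
actual vhost code; have_work() and process_work() are hypothetical helpers used
only for illustration. The structural difference it shows: schedule() after
set_current_state(TASK_INTERRUPTIBLE) blocks the task until an explicit wakeup,
whereas cond_resched() keeps the task runnable and only gives up the CPU when a
reschedule is pending.

    #include <linux/kthread.h>
    #include <linux/sched.h>

    /* Hypothetical helpers, for illustration only. */
    static bool have_work(void);
    static void process_work(void);

    /*
     * Idiom A: go to sleep until explicitly woken -- roughly the shape of
     * the vhost_task_fn() loop quoted above.
     */
    static int worker_blocking(void *data)
    {
            while (!kthread_should_stop()) {
                    set_current_state(TASK_INTERRUPTIBLE);
                    if (!have_work()) {
                            schedule();     /* sleeps until wake_up_process() */
                            continue;
                    }
                    __set_current_state(TASK_RUNNING);
                    process_work();
            }
            return 0;
    }

    /*
     * Idiom B: stay runnable and only drop the CPU when the scheduler asks
     * for it -- the cond_resched() idiom from hacking.rst.
     */
    static int worker_yielding(void *data)
    {
            while (!kthread_should_stop()) {
                    if (have_work())
                            process_work();
                    cond_resched();         /* no-op unless a resched is pending */
            }
            return 0;
    }

The relevant difference for the question above is that idiom A actually blocks
when there is no work, while idiom B never sleeps on its own and merely offers
the CPU back to the scheduler between work items.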