On Wed, Dec 13, 2023 at 07:00:53AM -0500, Michael S. Tsirkin wrote: > On Wed, Dec 13, 2023 at 11:37:23AM +0100, Tobias Huschle wrote: > > On Tue, Dec 12, 2023 at 11:15:01AM -0500, Michael S. Tsirkin wrote: > > > On Tue, Dec 12, 2023 at 11:00:12AM +0800, Jason Wang wrote: > > > > On Tue, Dec 12, 2023 at 12:54 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: [...] > > Apparently schedule is already called? > What about this: static int vhost_task_fn(void *data) { <...> did_work = vtsk->fn(vtsk->data); --> this calls vhost_worker if I'm not mistaken if (!did_work) schedule(); <...> } static bool vhost_worker(void *data) { struct vhost_worker *worker = data; struct vhost_work *work, *work_next; struct llist_node *node; node = llist_del_all(&worker->work_list); if (node) { <...> llist_for_each_entry_safe(work, work_next, node, node) { <...> } } return !!node; } The llist_for_each_entry_safe does not actually change the node value, doesn't it? If it does not change it, !!node would return 1. Thereby skipping the schedule. This was changed recently with: f9010dbdce91 fork, vhost: Use CLONE_THREAD to fix freezer/ps regression It returned a hardcoded 0 before. The commit message explicitly mentions this change to make vhost_worker return 1 if it did something. Seems indeed like a nasty little side effect caused by EEVDF not scheduling the woken up kworker right away.