Re: [PATCH 0/9] vhost: Support SIGKILL by flushing and exiting

Jason Wang <jasowang@xxxxxxxxxx> · Thu, 11 Apr 2024 16:39:48 +0800

On Sat, Mar 16, 2024 at 8:47 AM Mike Christie
<michael.christie@xxxxxxxxxx> wrote:
>
> The following patches were made over Linus's tree and also apply over
> mst's vhost branch. The patches add the ability for vhost_tasks to
> handle SIGKILL by flushing queued works, stopping new works from being
> queued, and preparing the task for an early exit.
>
> This removes the need for the signal/coredump hacks added in:
>
> Commit f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps regression")
>
> when the vhost_task patches were initially merged and fixes the issue
> in this thread:
>
> https://lore.kernel.org/all/000000000000a41b82060e875721@xxxxxxxxxx/
>
> Long Background:
>
> The original vhost worker code didn't support any signals. If the
> userspace application that owned the worker got a SIGKILL, the app/
> process would exit dropping all references to the device and then the
> file operation's release function would be called. From there we would
> wait on running IO then clean up the device's memory.

A dumb question.

Is this change noticeable to userspace? For example, with this series
a SIGKILL may shut down the datapath ...

Thanks

>
> When we switched to vhost_tasks being a thread in the owner's process we
> added some hacks to the signal/coredump code so we could continue to
> wait on running IO and process it from the vhost_task. The idea was that
> we would eventually remove the hacks. We recently hit this bug:
>
> https://lore.kernel.org/all/000000000000a41b82060e875721@xxxxxxxxxx/
>
> It turns out only vhost-scsi had an issue where it would send a command
> to the block/LIO layer, wait for a response and then process it in the
> vhost_task. So patches 1-5 prepare vhost-scsi to handle when the vhost_task
> is killed while we still have commands outstanding. The next patches then
> prepare and convert the vhost and vhost_task layers to handle SIGKILL
> by flushing running works, marking the vhost_task as dead so there's
> no future uses, then exiting.