On Tue, Apr 30, 2024 at 08:01:11PM -0500, Mike Christie wrote:
> On 4/30/24 7:15 PM, Hillf Danton wrote:
> > On Tue, Apr 30, 2024 at 11:23:04AM -0500, Mike Christie wrote:
> >> On 4/30/24 8:05 AM, Edward Adam Davis wrote:
> >>>  static int vhost_task_fn(void *data)
> >>>  {
> >>>      struct vhost_task *vtsk = data;
> >>> @@ -51,7 +51,7 @@ static int vhost_task_fn(void *data)
> >>>              schedule();
> >>>      }
> >>>
> >>> -    mutex_lock(&vtsk->exit_mutex);
> >>> +    mutex_lock(&exit_mutex);
> >>>      /*
> >>>       * If a vhost_task_stop and SIGKILL race, we can ignore the SIGKILL.
> >>>       * When the vhost layer has called vhost_task_stop it's already stopped
> >>> @@ -62,7 +62,7 @@ static int vhost_task_fn(void *data)
> >>>          vtsk->handle_sigkill(vtsk->data);
> >>>      }
> >>>      complete(&vtsk->exited);
> >>> -    mutex_unlock(&vtsk->exit_mutex);
> >>> +    mutex_unlock(&exit_mutex);
> >>>
> >>
> >> Edward, thanks for the patch. I think though I just needed to swap the
> >> order of the calls above.
> >>
> >> Instead of:
> >>
> >> complete(&vtsk->exited);
> >> mutex_unlock(&vtsk->exit_mutex);
> >>
> >> it should have been:
> >>
> >> mutex_unlock(&vtsk->exit_mutex);
> >> complete(&vtsk->exited);
> >
> > JFYI Edward did it [1]
> >
> > [1] https://lore.kernel.org/lkml/tencent_546DA49414E876EEBECF2C78D26D242EE50A@xxxxxx/
>
> Thanks.
>
> I tested the code with that change and it no longer triggers the UAF.

Weird, but syzkaller said that yes, it does trigger.

Compare
000000000000dcc0ca06174e65d4@xxxxxxxxxx
which tests the order
    mutex_unlock(&vtsk->exit_mutex);
    complete(&vtsk->exited);
that you like and says it triggers

and
00000000000097bda906175219bc@xxxxxxxxxx
which says it does not trigger.

Whatever you do, please send it to syzkaller in the original thread,
and then when you post please include the syzkaller report.

Given this gets confusing, I'm fine with just a fixup patch;
note in the commit log where I should squash it.

> I've fixed up the original patch that had the bug and am going to
> resubmit the patchset like how Michael requested.
>
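
For readers following the ordering argument, here is a minimal userspace sketch of the reasoning behind the proposed swap. It is not kernel code and it runs no threads. It only counts how many steps still touch the task struct after the completion is signalled, on the assumption that the waiter may free the struct as soon as complete() has run. The names exit_step, touches_after_complete, original_order and swapped_order are hypothetical and exist only for illustration. The sketch does not explain the differing syzkaller results mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event names modelling the two calls discussed in the thread. */
enum exit_step {
    STEP_MUTEX_UNLOCK, /* stands in for mutex_unlock(&vtsk->exit_mutex) */
    STEP_COMPLETE      /* stands in for complete(&vtsk->exited) */
};

/*
 * Count the steps that run after the completion has been signalled.
 * Assumption: once STEP_COMPLETE has run, the waiter may free the struct
 * at any moment, so every later step is a potential use-after-free.
 */
static int touches_after_complete(const enum exit_step *steps, size_t n)
{
    int completed = 0;
    int late = 0;

    assert(steps != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (completed)
            late++;            /* this step runs after complete() */
        if (steps[i] == STEP_COMPLETE)
            completed = 1;
    }
    return late;
}

/* Order in the patch that had the bug: complete, then unlock. */
static const enum exit_step original_order[] = {
    STEP_COMPLETE, STEP_MUTEX_UNLOCK
};

/* Order proposed in the thread: unlock, then complete. */
static const enum exit_step swapped_order[] = {
    STEP_MUTEX_UNLOCK, STEP_COMPLETE
};
```

Calling touches_after_complete() on each array from a small main() shows the difference: the original order leaves one access after the completion, while the swapped order leaves none under the stated assumption.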