On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
<steven.sistare@xxxxxxxxxx> wrote:
>
> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
> > On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <steven.sistare@xxxxxxxxxx> wrote:
> >>
> >> Flush to guarantee no workers are running when suspend returns.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx>
> >> ---
> >>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
> >>  1 file changed, 13 insertions(+)
> >>
> >> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> index be2925d0d283..a662b90357c3 100644
> >> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
> >>  	kthread_flush_work(work);
> >>  }
> >>
> >> +static void flush_work_fn(struct kthread_work *work) {}
> >> +
> >> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
> >> +{
> >> +	struct kthread_work work;
> >> +
> >> +	kthread_init_work(&work, flush_work_fn);
> >
> > If the work is already queued, doesn't it break the linked list
> > because of the memset in kthread_init_work?
>
> work is a local variable.  It completes before vdpasim_flush_work returns,
> thus is never already queued on entry to vdpasim_flush_work.
> Am I missing your point?
>

No, sorry, I was the one missing that. Thanks for explaining it :)!

I'm not so used to the kthread queue, but why not call
kthread_flush_work on vdpasim->work directly?

> >> +	kthread_queue_work(vdpasim->worker, &work);
> >> +	kthread_flush_work(&work);
> >> +}
> >> +
> >>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
> >>  {
> >>  	return container_of(vdpa, struct vdpasim, vdpa);
> >> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
> >>  	vdpasim->running = false;
> >>  	mutex_unlock(&vdpasim->mutex);
> >>
> >> +	vdpasim_flush_work(vdpasim);
> >
> > Do we need to protect the case where vdpasim_kick_vq and
> > vdpasim_suspend are called "at the same time"? Correct userland should
> > not be doing it, but a buggy or malicious one could be. Just calling
> > vdpasim_flush_work with the mutex acquired would solve the issue,
> > wouldn't it?
>
> Good catch.  I need to serialize access to vdpasim->running plus the worker queue
> in these two functions.  vdpasim_kick_vq currently takes no locks.  In case it is called
> from non-task contexts, I should define a new spinlock to be acquired in both functions.
>
> - Steve
>
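
To make the kick/suspend race concrete, here is a rough userspace sketch of
the serialization discussed above, with pthreads standing in for the kthread
worker API. Every name in it (struct sim, struct work, queue_work_locked,
sim_kick, sim_suspend, sim_start, sim_stop) is made up for illustration and
is not code from vdpa_sim.c, and a pthread mutex stands in for the proposed
spinlock. The assumption is that the kick path checks "running" under the
same lock that suspend takes to clear it and to queue the on-stack flush
item, so a kick either lands ahead of the flush item or queues nothing.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct kthread_work and struct vdpasim. */
struct sim;
struct work {
	void (*fn)(struct sim *s);
	struct work *next;
	bool queued;	/* currently on the FIFO */
	bool done;	/* set by the worker once fn has returned */
};

struct sim {
	pthread_mutex_t lock;	/* guards running, stop and the FIFO */
	pthread_cond_t cond;	/* signalled on new work and on completion */
	struct work *head;	/* FIFO of pending work items */
	struct work kick_work;	/* plays the role of vdpasim->work */
	bool running, stop;
	atomic_int handled;	/* how many times kick_work has run */
	pthread_t thread;
};

static void kick_fn(struct sim *s) { atomic_fetch_add(&s->handled, 1); }
static void flush_fn(struct sim *s) { (void)s; }

static void queue_work_locked(struct sim *s, struct work *w)	/* caller holds s->lock */
{
	struct work **p = &s->head;
	if (w->queued)	/* already pending: linking it twice would corrupt the list */
		return;
	w->queued = true;
	w->next = NULL;
	while (*p)
		p = &(*p)->next;
	*p = w;
	pthread_cond_broadcast(&s->cond);
}

static void *worker_fn(void *arg)
{
	struct sim *s = arg;
	pthread_mutex_lock(&s->lock);
	while (!s->stop || s->head) {
		struct work *w = s->head;
		if (!w) {
			pthread_cond_wait(&s->cond, &s->lock);
			continue;
		}
		s->head = w->next;
		w->queued = false;
		pthread_mutex_unlock(&s->lock);
		w->fn(s);	/* run the item without holding the lock */
		pthread_mutex_lock(&s->lock);
		w->done = true;	/* last access to w; a flusher may free it after this */
		pthread_cond_broadcast(&s->cond);
	}
	pthread_mutex_unlock(&s->lock);
	return NULL;
}

static bool sim_kick(struct sim *s)	/* analogue of vdpasim_kick_vq */
{
	pthread_mutex_lock(&s->lock);
	bool ok = s->running;
	if (ok)
		queue_work_locked(s, &s->kick_work);
	pthread_mutex_unlock(&s->lock);
	return ok;
}

static void sim_suspend(struct sim *s)	/* analogue of vdpasim_suspend */
{
	struct work flush = { .fn = flush_fn };	/* on-stack, so never already queued */
	pthread_mutex_lock(&s->lock);
	s->running = false;	/* later kicks see this and queue nothing */
	queue_work_locked(s, &flush);
	while (!flush.done)
		pthread_cond_wait(&s->cond, &s->lock);	/* releases the lock while waiting */
	pthread_mutex_unlock(&s->lock);
}

static void sim_start(struct sim *s)
{
	*s = (struct sim){ .running = true, .kick_work = { .fn = kick_fn } };
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->cond, NULL);
	pthread_create(&s->thread, NULL, worker_fn, s);
}

static void sim_stop(struct sim *s)
{
	pthread_mutex_lock(&s->lock);
	s->stop = true;
	pthread_cond_broadcast(&s->cond);
	pthread_mutex_unlock(&s->lock);
	pthread_join(s->thread, NULL);
}
```

This builds with "cc -pthread". One difference from the kernel case is worth
keeping in mind: pthread_cond_wait releases the mutex while it sleeps, whereas
kthread_flush_work sleeps and so could not be called with a spinlock held. In
the driver the lock would presumably cover only the update of "running" and
the queueing, with the flush done after it is dropped.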