On 2/15/2024 10:44 AM, Eugenio Perez Martin wrote:
> On Wed, Feb 14, 2024 at 8:52 PM Steven Sistare
> <steven.sistare@xxxxxxxxxx> wrote:
>>
>> On 2/14/2024 2:39 PM, Eugenio Perez Martin wrote:
>>> On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
>>> <steven.sistare@xxxxxxxxxx> wrote:
>>>>
>>>> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
>>>>> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <steven.sistare@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> Flush to guarantee no workers are running when suspend returns.
>>>>>>
>>>>>> Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx>
>>>>>> ---
>>>>>>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
>>>>>>  1 file changed, 13 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> index be2925d0d283..a662b90357c3 100644
>>>>>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
>>>>>>  	kthread_flush_work(work);
>>>>>>  }
>>>>>>
>>>>>> +static void flush_work_fn(struct kthread_work *work) {}
>>>>>> +
>>>>>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
>>>>>> +{
>>>>>> +	struct kthread_work work;
>>>>>> +
>>>>>> +	kthread_init_work(&work, flush_work_fn);
>>>>>
>>>>> If the work is already queued, doesn't it break the linked list
>>>>> because of the memset in kthread_init_work?
>>>>
>>>> work is a local variable. It completes before vdpasim_flush_work returns,
>>>> thus is never already queued on entry to vdpasim_flush_work.
>>>> Am I missing your point?
>>>
>>> No, sorry, I was the one missing that. Thanks for explaining it :)!
>>>
>>> I'm not so used to the kthread queue, but why not calling
>>> kthread_flush_work on vdpasim->work directly?
>>
>> vdpasim->work is not the only work posted to vdpasim->worker; see
>> vdpasim_worker_change_mm_sync. Posting a new no-op work guarantees
>> they are all flushed.
>
> But it is ok to have concurrent mm updates, isn't it? Moreover, they
> can be enqueued immediately after the kthread_flush_work already, as
> there is no lock protecting it.

Agreed on both, thanks. I will simplify and only flush vdpasim->work.

- Steve

>>>>>> +	kthread_queue_work(vdpasim->worker, &work);
>>>>>> +	kthread_flush_work(&work);
>>>>>> +}
>>>>>> +
>>>>>>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>>>>>>  {
>>>>>>  	return container_of(vdpa, struct vdpasim, vdpa);
>>>>>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
>>>>>>  	vdpasim->running = false;
>>>>>>  	mutex_unlock(&vdpasim->mutex);
>>>>>>
>>>>>> +	vdpasim_flush_work(vdpasim);
>>>>>
>>>>> Do we need to protect the case where vdpasim_kick_vq and
>>>>> vdpasim_suspend are called "at the same time"? Correct userland should
>>>>> not be doing it but buggy or malicious could be. Just calling
>>>>> vdpasim_flush_work with the mutex acquired would solve the issue,
>>>>> doesn't it?
>>>>
>>>> Good catch. I need to serialize access to vdpasim->running plus the worker queue
>>>> in these two functions. vdpasim_kick_vq currently takes no locks. In case it is called
>>>> from non-task contexts, I should define a new spinlock to be acquired in both functions.
>>>>
>>>> - Steve
>>>>
>>>
>>
>
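
For readers less used to the kthread queue, the flush-by-no-op idea from the
original patch can be sketched in userspace C with pthreads. This is only an
analogy under stated assumptions, not the vdpa_sim code: struct work,
struct worker, queue_work, flush_worker and demo_flush are hypothetical names
standing in for kthread_work, kthread_worker, kthread_queue_work and the
proposed vdpasim_flush_work. It assumes a single worker thread that runs items
in FIFO order, so once a no-op item queued last has completed, every item
queued before it has completed as well.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for kthread_work and kthread_worker. */
struct work {
	void (*fn)(struct work *w);
	struct work *next;
	bool done;
};

struct worker {
	pthread_mutex_t lock;
	pthread_cond_t cond;	/* signals both "new work" and "work done" */
	struct work *head, *tail;
	bool stop;
};

/* Single worker thread: runs queued items one at a time, in FIFO order. */
static void *worker_main(void *arg)
{
	struct worker *wk = arg;

	pthread_mutex_lock(&wk->lock);
	for (;;) {
		struct work *w = wk->head;

		if (!w) {
			if (wk->stop)
				break;
			pthread_cond_wait(&wk->cond, &wk->lock);
			continue;
		}
		wk->head = w->next;
		if (!wk->head)
			wk->tail = NULL;
		pthread_mutex_unlock(&wk->lock);
		w->fn(w);		/* run the item without holding the lock */
		pthread_mutex_lock(&wk->lock);
		w->done = true;		/* last access to w by the worker */
		pthread_cond_broadcast(&wk->cond);
	}
	pthread_mutex_unlock(&wk->lock);
	return NULL;
}

/* Append an item to the tail of the queue and wake the worker. */
static void queue_work(struct worker *wk, struct work *w,
		       void (*fn)(struct work *))
{
	w->fn = fn;
	w->next = NULL;
	w->done = false;
	pthread_mutex_lock(&wk->lock);
	if (wk->tail)
		wk->tail->next = w;
	else
		wk->head = w;
	wk->tail = w;
	pthread_cond_broadcast(&wk->cond);
	pthread_mutex_unlock(&wk->lock);
}

static void noop_fn(struct work *w) { (void)w; }

/* Queue a no-op item that lives on the stack and wait until it has run. */
static void flush_worker(struct worker *wk)
{
	struct work sentinel;

	queue_work(wk, &sentinel, noop_fn);
	pthread_mutex_lock(&wk->lock);
	while (!sentinel.done)
		pthread_cond_wait(&wk->cond, &wk->lock);
	pthread_mutex_unlock(&wk->lock);
}

static int counter;
static void count_fn(struct work *w) { (void)w; counter++; }

/* Queue three items, flush, and report how many had run by then. */
int demo_flush(void)
{
	struct worker wk = { .head = NULL, .tail = NULL, .stop = false };
	struct work items[3];
	pthread_t thread;
	int rc, seen;

	counter = 0;
	pthread_mutex_init(&wk.lock, NULL);
	pthread_cond_init(&wk.cond, NULL);
	rc = pthread_create(&thread, NULL, worker_main, &wk);
	assert(rc == 0);
	(void)rc;

	for (int i = 0; i < 3; i++)
		queue_work(&wk, &items[i], count_fn);
	flush_worker(&wk);
	seen = counter;		/* every earlier item has finished here */

	pthread_mutex_lock(&wk.lock);
	wk.stop = true;		/* ask the worker to exit, then reap it */
	pthread_cond_broadcast(&wk.cond);
	pthread_mutex_unlock(&wk.lock);
	pthread_join(thread, NULL);
	pthread_cond_destroy(&wk.cond);
	pthread_mutex_destroy(&wk.lock);
	return seen;
}
```

Called from a main() and built with -pthread, demo_flush() returns 3 in this
sketch. As pointed out in the thread, the guarantee only covers items queued
before the flush: nothing here stops another caller from queuing new work right
after flush_worker returns, which is why the kick/suspend race needs a lock
around both the running flag and the queue operation.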