Re: [PATCH V2 3/3] vdpa_sim: flush workers on suspend

Steven Sistare <steven.sistare@xxxxxxxxxx> · Fri, 16 Feb 2024 10:15:45 -0500

On 2/15/2024 10:44 AM, Eugenio Perez Martin wrote:
> On Wed, Feb 14, 2024 at 8:52 PM Steven Sistare
> <steven.sistare@xxxxxxxxxx> wrote:
>>
>> On 2/14/2024 2:39 PM, Eugenio Perez Martin wrote:
>>> On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
>>> <steven.sistare@xxxxxxxxxx> wrote:
>>>>
>>>> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
>>>>> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <steven.sistare@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> Flush to guarantee no workers are running when suspend returns.
>>>>>>
>>>>>> Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx>
>>>>>> ---
>>>>>>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
>>>>>>  1 file changed, 13 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> index be2925d0d283..a662b90357c3 100644
>>>>>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
>>>>>>         kthread_flush_work(work);
>>>>>>  }
>>>>>>
>>>>>> +static void flush_work_fn(struct kthread_work *work) {}
>>>>>> +
>>>>>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
>>>>>> +{
>>>>>> +       struct kthread_work work;
>>>>>> +
>>>>>> +       kthread_init_work(&work, flush_work_fn);
>>>>>
>>>>> If the work is already queued, doesn't it break the linked list
>>>>> because of the memset in kthread_init_work?
>>>>
>>>> work is a local variable.  It completes before vdpasim_flush_work returns,
>>>> so it is never already queued on entry to vdpasim_flush_work.
>>>> Am I missing your point?
>>>
>>> No, sorry, I was the one missing that. Thanks for explaining it :)!
>>>
>>> I'm not so used to the kthread queue, but why not call
>>> kthread_flush_work on vdpasim->work directly?
>>
>> vdpasim->work is not the only work posted to vdpasim->worker; see
>> vdpasim_worker_change_mm_sync.  Posting a new no-op work guarantees
>> they are all flushed.
> 
> But it is ok to have concurrent mm updates, isn't it? Moreover, new
> ones can be enqueued immediately after kthread_flush_work returns, as
> there is no lock protecting it.

Agreed on both, thanks.  I will simplify and only flush vdpasim->work.

- Steve

>>>>>> +       kthread_queue_work(vdpasim->worker, &work);
>>>>>> +       kthread_flush_work(&work);
>>>>>> +}
>>>>>> +
>>>>>>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>>>>>>  {
>>>>>>         return container_of(vdpa, struct vdpasim, vdpa);
>>>>>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
>>>>>>         vdpasim->running = false;
>>>>>>         mutex_unlock(&vdpasim->mutex);
>>>>>>
>>>>>> +       vdpasim_flush_work(vdpasim);
>>>>>
>>>>> Do we need to protect the case where vdpasim_kick_vq and
>>>>> vdpasim_suspend are called "at the same time"? Correct userland should
>>>>> not be doing it, but buggy or malicious userland could. Just calling
>>>>> vdpasim_flush_work with the mutex acquired would solve the issue,
>>>>> wouldn't it?
>>>>
>>>> Good catch.  I need to serialize access to vdpasim->running plus the worker queue
>>>> in these two functions.  vdpasim_kick_vq currently takes no locks. In case it is called
>>>> from non-task contexts, I should define a new spinlock to be acquired in both functions.
>>>>
>>>> - Steve
>>>>
>>>
>>
>
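
For readers following the thread, here is a minimal userspace sketch of the
pattern the V2 patch used: queue a no-op item behind everything already
pending on a single FIFO worker, then wait for that item to complete. It is
a pthread model, not kernel code, and every model_* name is hypothetical;
it only assumes that the worker runs items one at a time in queue order.
As noted above, nothing stops new items from being queued right after the
flush returns, and the thread settled on flushing vdpasim->work directly.

```c
/* Userspace pthread model of "flush by queueing a no-op item".
 * Not kernel code: every model_* name is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct model_work {
	void (*fn)(struct model_work *w);
	struct model_work *next;
	bool done;
};

struct model_worker {
	pthread_mutex_t lock;
	pthread_cond_t cond;		/* signals "new item" and "item done" */
	struct model_work *head, *tail;	/* FIFO of pending items */
	unsigned long completed;	/* items finished so far */
	bool stop;
	pthread_t thread;
};

static void model_noop(struct model_work *w) { (void)w; }

static void *model_worker_fn(void *arg)
{
	struct model_worker *wk = arg;

	pthread_mutex_lock(&wk->lock);
	for (;;) {
		while (!wk->head && !wk->stop)
			pthread_cond_wait(&wk->cond, &wk->lock);
		if (!wk->head)
			break;			/* stop requested, queue drained */
		struct model_work *w = wk->head;
		wk->head = w->next;
		if (!wk->head)
			wk->tail = NULL;
		pthread_mutex_unlock(&wk->lock);
		w->fn(w);			/* run without holding the lock */
		pthread_mutex_lock(&wk->lock);
		wk->completed++;
		w->done = true;			/* w is not touched after this */
		pthread_cond_broadcast(&wk->cond);
	}
	pthread_mutex_unlock(&wk->lock);
	return NULL;
}

static int model_worker_start(struct model_worker *wk)
{
	pthread_mutex_init(&wk->lock, NULL);
	pthread_cond_init(&wk->cond, NULL);
	wk->head = wk->tail = NULL;
	wk->completed = 0;
	wk->stop = false;
	return pthread_create(&wk->thread, NULL, model_worker_fn, wk);
}

static void model_worker_stop(struct model_worker *wk)
{
	pthread_mutex_lock(&wk->lock);
	wk->stop = true;
	pthread_cond_broadcast(&wk->cond);
	pthread_mutex_unlock(&wk->lock);
	pthread_join(wk->thread, NULL);
}

static void model_queue_work(struct model_worker *wk, struct model_work *w)
{
	assert(w->fn != NULL);
	pthread_mutex_lock(&wk->lock);
	w->next = NULL;
	w->done = false;
	if (wk->tail)
		wk->tail->next = w;
	else
		wk->head = w;
	wk->tail = w;
	pthread_cond_broadcast(&wk->cond);
	pthread_mutex_unlock(&wk->lock);
}

/* Wait until one specific, already queued item has completed. */
static void model_flush_work(struct model_worker *wk, struct model_work *w)
{
	pthread_mutex_lock(&wk->lock);
	while (!w->done)
		pthread_cond_wait(&wk->cond, &wk->lock);
	pthread_mutex_unlock(&wk->lock);
}

/* Queue a stack-local no-op behind all pending items and wait for it. */
static void model_flush_all(struct model_worker *wk)
{
	struct model_work sentinel = { .fn = model_noop };

	model_queue_work(wk, &sentinel);
	model_flush_work(wk, &sentinel);
}
```

The sentinel can live on the stack for the same reason given earlier in
the thread: model_flush_all does not return until the worker has finished
with it, so it is never still queued when the function is entered again.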