On Thu 2014-11-20 19:00:16, Michael S. Tsirkin wrote:
> On Thu, Nov 20, 2014 at 05:55:58PM +0100, Petr Mladek wrote:
> > On Thu 2014-11-20 11:29:35, Tejun Heo wrote:
> > > On Thu, Nov 20, 2014 at 06:26:24PM +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Nov 20, 2014 at 06:25:43PM +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Nov 20, 2014 at 11:07:46AM -0500, Tejun Heo wrote:
> > > > > > On Thu, Nov 20, 2014 at 05:03:17PM +0100, Petr Mladek wrote:
> > > > > > ...
> > > > > > > @@ -476,7 +460,6 @@ static void virtballoon_remove(struct virtio_device *vdev)
> > > > > > >  {
> > > > > > >  	struct virtio_balloon *vb = vdev->priv;
> > > > > > >
> > > > > > > -	kthread_stop(vb->thread);
> > > > > > >  	remove_common(vb);
> > > > > > >  	kfree(vb);
> > > > > > >  }
> > > > > >
> > > > > > Shouldn't the work item be flushed before removal is complete?
> > > > >
> > > > > Great catch!
> > > >
> > > > In fact, flushing it won't help because it can requeue itself, right?
> > >
> > > There's cancel_work_sync() to stop the self-requeueing ones.
> >
> > Ah, one more problem is that remove_common(vb) calls leak_balloon()
> > that queues the work if not finished. We would need to add some flag
> > or variant that would disable the queuing when called here.
>
> That's why Tejun suggested cancel_work_sync, IIUC it stops
> the requeuing without need for extra flags.

But he also wrote that it handles only self-queuing. Queuing from
external locations needs to be prevented in other ways.

> > > > From that POV a dedicated WQ kept it simple.
> > >
> > > A dedicated wq doesn't do anything for that. You can't shut down a
> > > workqueue with a pending work item on it. destroy_workqueue() will
> > > try to drain the target wq, warn if it doesn't finish in certain
> > > number of iterations and just keep trying indefinitely.
> >
> > I wonder if it is guaranteed that none would trigger
> > stats_request() or virtballoon_changed() when virtballoon_remove() is
> > being called.
> > I guess so because the original code would fail
> > otherwise. The two functions access "vb->config_change"
> > and the structure is freed in virtballoon_remove() without
> > any protection.
> >
> > I am trying to confirm this by reading the code but it is not that
> > easy.
> >
> > Best Regards,
> > Petr
>
> It's synchronized through hardware. remove_common calls reset and
> del_vqs which will prevent new interrupts.

I see. That means stats_request() or virtballoon_changed() can still
be called until

	vb->vdev->config->reset(vb->vdev);

is called in remove_common().

So fill_balloon() can be queued and run after we leak all pages
and before we reset the device in remove_common().

I have to think about how to avoid this. Maybe add a flag to
struct virtio_balloon that signals that the balloon is being removed
and that new operations should no longer be queued. But there might be
a more elegant solution.

Best Regards,
Petr
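
For illustration only, here is a minimal userspace sketch of the "being removed" flag idea mentioned above. It is not kernel code and not a proposed patch: struct balloon_model, its fields and all three functions are hypothetical names that do not exist in the virtio_balloon driver. A pthread mutex stands in for whatever locking the driver would actually need, and resetting a counter merely stands in for cancelling pending work.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in virtio_balloon. */
struct balloon_model {
	pthread_mutex_t lock;	/* protects the two fields below */
	bool removing;		/* hypothetical "being removed" flag */
	unsigned int queued;	/* stand-in for pending work items */
};

static void balloon_model_init(struct balloon_model *b)
{
	int rc = pthread_mutex_init(&b->lock, NULL);

	assert(rc == 0);
	(void)rc;
	b->removing = false;
	b->queued = 0;
}

/*
 * Stand-in for the queuing done from external callers such as
 * stats_request() or virtballoon_changed(). Returns true if the
 * work was queued, false if removal has already started.
 */
static bool balloon_model_queue(struct balloon_model *b)
{
	bool ok;

	pthread_mutex_lock(&b->lock);
	ok = !b->removing;
	if (ok)
		b->queued++;
	pthread_mutex_unlock(&b->lock);

	return ok;
}

/*
 * Stand-in for the start of removal: set the flag first so that
 * nobody can queue new work, then drop whatever is still pending.
 */
static void balloon_model_begin_remove(struct balloon_model *b)
{
	pthread_mutex_lock(&b->lock);
	b->removing = true;
	b->queued = 0;	/* models cancelling the pending work */
	pthread_mutex_unlock(&b->lock);
}
```

The point of the model is the ordering: the flag is checked and the work is queued under the same lock, so once balloon_model_begin_remove() returns, a later balloon_model_queue() call cannot add new work.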