On Wed, 3 Feb 2021 18:13:09 -0500 Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote: > On 1/12/21 12:55 PM, Halil Pasic wrote: > > On Tue, 12 Jan 2021 02:12:51 +0100 > > Halil Pasic <pasic@xxxxxxxxxxxxx> wrote: > > > >>> @@ -1347,8 +1437,11 @@ void vfio_ap_mdev_remove_queue(struct ap_device *apdev) > >>> apqi = AP_QID_QUEUE(q->apqn); > >>> vfio_ap_mdev_reset_queue(apid, apqi, 1); > >>> > >>> - if (q->matrix_mdev) > >>> + if (q->matrix_mdev) { > >>> + matrix_mdev = q->matrix_mdev; > >>> vfio_ap_mdev_unlink_queue(q); > >>> + vfio_ap_mdev_refresh_apcb(matrix_mdev); > >>> + } > >>> > >>> kfree(q); > >>> mutex_unlock(&matrix_dev->lock); > > Shouldn't we first remove the queue from the APCB and then > > reset? Sorry, I missed this one yesterday. > > I agreed to move the reset, however if the remove callback is > invoked due to a manual unbind of the queue and the queue is > in use by a guest, the cleanup of the IRQ resources after the > reset of the queue will not happen because the link from the > queue to the matrix mdev was removed. Consequently, I'm going > to have to change the patch 05/15 to split the vfio_ap_mdev_unlink_queue() > function into two functions: one to remove the link from the matrix mdev to > the queue; and, one to remove the link from the queue to the matrix > mdev. Does that mean we should reset before the unlink (or before the second part of it after the split up)? I mean have a look at unassign_adapter_store() with all patches of this series applied. It does an unlink but doesn't do any reset, or cleanup IRQ resources. And after the unlink we can't clean up the IRQ resources properly. But before all this we should resolve this circular lock dependency problem in a satisfactory way. I'm quite worried about how it is going to mesh with this series and dynamic ap pass-through. Regards, Halil >Only the first will be used for the remove callback which should > be fine since the queue object is freed at the end of the remove > function anyway. > > > > > Regards, > > Halil >