On Mon, Apr 26, 2021 at 07:48:59PM +0200, Cornelia Huck wrote: > On Fri, 23 Apr 2021 20:03:03 -0300 > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote: > > > This is straightforward conversion, the ap_matrix_mdev is actually serving > > as the vfio_device and we can replace all the mdev_get_drvdata()'s with a > > simple container_of(). > > > > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> > > drivers/s390/crypto/vfio_ap_ops.c | 137 ++++++++++++++++---------- > > drivers/s390/crypto/vfio_ap_private.h | 2 + > > 2 files changed, 89 insertions(+), 50 deletions(-) > > > > (...) > > > -static int vfio_ap_mdev_remove(struct mdev_device *mdev) > > +static void vfio_ap_mdev_remove(struct mdev_device *mdev) > > { > > - struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev); > > + struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(&mdev->dev); > > > > - if (matrix_mdev->kvm) > > - return -EBUSY; > > + /* FIXME: Remove isn't allowed to fail */ > > + if (WARN_ON(matrix_mdev->kvm)) > > + return; > > This is a pre-existing problem, but the rework now makes it more > obvious. > > Previously, the mdev code would only print a warning and then continue > with device removal, even if a ->remove() callback returned an > error. This does mostly the same, the warning was just moved from mdev_device_remove_common() to here and changed to a WARN_ON() because it means we are permanently leaking kernel memory. I think in this case the vfio_device is not deleted - though I could re-order this to make that happen. > Now, it's quite clear that we'll end up in a weird half-dead state. I don't think it changes, after we print the WARN_ON we return to the driver core which does the same put_device()/etc as mdev_device_remove_common() was doing. > IIRC, the check for matrix_mdev->kvm is intended to protect against > ripping out the device under a running guest (I think it needs to > manipulate some crypto control blocks?) In that case it is missing locking too. Jason