On Tue, 26 Mar 2019 05:53:22 +0000
Parav Pandit <parav@xxxxxxxxxxxx> wrote:

> > -----Original Message-----
> > From: linux-kernel-owner@xxxxxxxxxxxxxxx <linux-kernel-owner@xxxxxxxxxxxxxxx> On Behalf Of Parav Pandit
> > Sent: Monday, March 25, 2019 10:19 PM
> > To: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > kwankhede@xxxxxxxxxx
> > Subject: RE: [PATCH 8/8] vfio/mdev: Improve the create/remove sequence
> >
> > > -----Original Message-----
> > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > Sent: Monday, March 25, 2019 9:17 PM
> > > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > kwankhede@xxxxxxxxxx
> > > Subject: Re: [PATCH 8/8] vfio/mdev: Improve the create/remove sequence
> > >
> > > On Tue, 26 Mar 2019 01:43:44 +0000
> > > Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > > >
> > > > > I mean the callback iterator on the parent remove can do a WARN_ON
> > > > > if this returns an error while the device remove path can silently
> > > > > return -EBUSY, the common function doesn't need to decide whether
> > > > > the parent ops remove function deserves a dev_err.
> > > > >
> > > > Ok. I understood.
> > > > But device remove returning silent -EBUSY looks an error that should
> > > > get logged in, because this is something not expected. Its probably
> > > > late for sysfs layer to return report an error by that time it
> > > > prints device name, because put_device() is done. So if remove()
> > > > returns an error, I think its legitimate failure to do WARN_ON or
> > > > dev_err().
> > >
> > > Calling put_device() if the parent remove op fails looks like a bug
> > > introduced by this series, the current code allows that failure
> > > leaving the device in a coherent state and returning errno to the sysfs
> > > store function.
> > >
> > Why should it fail?
> > We are taking off the device bus first as describe in commit log.
> > This ensures that everything is closed before calling the remove().
> > We cannot avoid put_device() and put_parent, it all buggy path...
>
> I audited remove() callbacks of kvmgt.c, vfio_ccw_ops.c,
> vfio_ap_ops.c, mbochs.c, mdpy.c, mtty.c, who makes the remove
> possible once the device release is executed. This should complete
> once the device is taken off the bus. This was not the case before
> this sequence where remove() is done while device is open...hence the
> check was needed in past. dev_err() is to help catch any errors/bugs
> in this area.
>
> I doubt we need to retry remove() like vfio_del_group_dev(), in
> mdev_core if release() is not yet complete.

I'm ok with this, I've always thought the 'force' semantics and
allowing remove to fail were not terribly in line with other drivers,
even if ultimately I wish drivers could nak a remove request to avoid
the ugliness of blocking. But ultimately you'll need to come to an
agreement with Kirti, the drivers we have in-tree are not the complete
set of mdev drivers, but it also doesn't necessarily make sense to
cater to the lone out-of-tree driver either. Thanks,

Alex
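
For readers following the ordering argument, here is a rough userspace
sketch of the two remove sequences discussed in this thread. It is not
the patch and not kernel code: struct mock_mdev and every mock_*() and
remove_*_order() name are hypothetical stand-ins invented for
illustration. It assumes, as described above, that taking the device
off the bus closes any open user before the vendor remove() runs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an mdev; NOT the kernel's struct mdev_device. */
struct mock_mdev {
    bool on_bus;   /* still registered on the (mock) bus */
    bool open;     /* a user still holds the device open */
    bool removed;  /* the vendor remove() has completed */
};

/* Mock vendor remove(): refuses while the device is still open. */
static int mock_parent_remove(struct mock_mdev *dev)
{
    if (dev->open)
        return -EBUSY;
    dev->removed = true;
    return 0;
}

/* Mock of taking the device off the bus; assumed to close any open user. */
static void mock_bus_del(struct mock_mdev *dev)
{
    dev->open = false;
    dev->on_bus = false;
}

/*
 * Old order: call remove() first. On failure the device is left
 * untouched and the errno goes back to the caller (the sysfs store
 * function in the real code).
 */
static int remove_old_order(struct mock_mdev *dev)
{
    int ret;

    assert(dev != NULL);
    ret = mock_parent_remove(dev);
    if (ret)
        return ret;
    mock_bus_del(dev);
    return 0;
}

/*
 * Proposed order: take the device off the bus first, then call
 * remove(). A failure here is unexpected, so it is logged (standing in
 * for dev_err()) rather than returned silently.
 */
static int remove_new_order(struct mock_mdev *dev)
{
    int ret;

    assert(dev != NULL);
    mock_bus_del(dev);
    ret = mock_parent_remove(dev);
    if (ret)
        fprintf(stderr, "mock_mdev: remove failed: err=%d\n", ret);
    return ret;
}
```

With a device that is still open, remove_old_order() returns -EBUSY
and leaves the mock device as it was, while remove_new_order() closes
it via the bus removal and the mock remove() then succeeds. Whether
real vendor drivers behave this way is exactly the point under
discussion, so treat this only as a model of the argument.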