Re: [PATCH v3 6/8] vfio: Invoke runtime PM API for IOCTL request

Alex Williamson <alex.williamson@xxxxxxxxxx> · Mon, 9 May 2022 16:30:02 -0600

On Thu, 5 May 2022 15:10:43 +0530
Abhishek Sahu <abhsahu@xxxxxxxxxx> wrote:

> On 5/5/2022 1:12 AM, Alex Williamson wrote:
> > On Mon, 25 Apr 2022 14:56:13 +0530
> > Abhishek Sahu <abhsahu@xxxxxxxxxx> wrote:
> >   
> >> The vfio/pci driver will have runtime power management support where the
> >> user can put the device into a low power state and the PCI device can then
> >> go into the D3cold state. If the device is in a low power state and the
> >> user issues any IOCTL, the device should be moved out of the low power
> >> state first. Once the IOCTL is serviced, it can go into the low power state
> >> again. The runtime PM framework manages this with the help of a usage
> >> count. One option was to add the runtime PM related APIs inside the
> >> vfio/pci driver, but some IOCTLs (like VFIO_DEVICE_FEATURE) can follow a
> >> different path and more IOCTLs can be added in the future. Also, runtime
> >> PM will currently be added only for the vfio/pci based driver variants,
> >> but other vfio based drivers can use the same in the future. So, this
> >> patch adds the runtime PM related API calls in the top level IOCTL
> >> function itself.
> >>
> >> For the vfio drivers which do not currently have runtime power management
> >> support, the runtime PM APIs won't be invoked. Currently, the runtime PM
> >> APIs will be invoked to increment and decrement the usage count only for
> >> vfio/pci based drivers. Keeping this usage count incremented while
> >> servicing an IOCTL makes sure that the user can't put the device into a
> >> low power state while any other IOCTL is being serviced in parallel.
> >>
> >> Signed-off-by: Abhishek Sahu <abhsahu@xxxxxxxxxx>
> >> ---
> >>  drivers/vfio/vfio.c | 44 +++++++++++++++++++++++++++++++++++++++++---
> >>  1 file changed, 41 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> >> index a4555014bd1e..4e65a127744e 100644
> >> --- a/drivers/vfio/vfio.c
> >> +++ b/drivers/vfio/vfio.c
> >> @@ -32,6 +32,7 @@
> >>  #include <linux/vfio.h>
> >>  #include <linux/wait.h>
> >>  #include <linux/sched/signal.h>
> >> +#include <linux/pm_runtime.h>
> >>  #include "vfio.h"
> >>  
> >>  #define DRIVER_VERSION	"0.3"
> >> @@ -1536,6 +1537,30 @@ static const struct file_operations vfio_group_fops = {
> >>  	.release	= vfio_group_fops_release,
> >>  };
> >>  
> >> +/*
> >> + * Wrapper around pm_runtime_resume_and_get().
> >> + * Return 0, if driver power management callbacks are not present i.e. the driver is not  
> > 
> > Mind the gratuitous long comment line here.
> >   
>  
>  Thanks Alex.
>  
>  That was a miss. I will fix this.
>  
> >> + * using runtime power management.
> >> + * Return 1 upon success, otherwise -errno  
> > 
> > Changing semantics vs the thing we're wrapping, why not provide a
> > wrapper for the `put` as well to avoid that?  The only cases where we
> > return zero are just as easy to detect on the other side.
> >   
> 
>  Yes. Using a wrapper function for put is the better option.
>  I will make the changes.
> 
> >> + */
> >> +static inline int vfio_device_pm_runtime_get(struct device *dev)  
> > 
> > Given some of Jason's recent series, this should probably just accept a
> > vfio_device.
> >   
> 
>  Sorry. I didn't get this part.
> 
>  Do I need to change it to
> 
>  static inline int vfio_device_pm_runtime_get(struct vfio_device *device)
>  {
>     struct device *dev = device->dev;
>     ...
>  }

Yes.
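
To be concrete, a rough userspace sketch of the get/put pair might look
like the following.  The struct definitions and the mock_* helpers are
stand-ins made up so it compiles outside the kernel; the real code would
call pm_runtime_resume_and_get() and pm_runtime_put().  The driver->pm
test is simply carried over from the patch, not a claim that it's the
right check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types, reduced to what is needed. */
struct dev_pm_ops { int unused; };
struct device_driver { const struct dev_pm_ops *pm; };
struct device {
	struct device_driver *driver;
	int usage_count;	/* models the runtime PM usage count */
	int resume_error;	/* 0, or the negative errno the mock fails with */
};
struct vfio_device { struct device *dev; };

/* Mock of pm_runtime_resume_and_get(): holds a reference only on success. */
static inline int mock_pm_runtime_resume_and_get(struct device *dev)
{
	if (dev->resume_error)
		return dev->resume_error;
	dev->usage_count++;
	return 0;
}

/* Mock of pm_runtime_put(): drops the reference taken above. */
static inline void mock_pm_runtime_put(struct device *dev)
{
	dev->usage_count--;
}

/* Same test as in the patch; shared so get and put can never disagree. */
static inline bool vfio_device_uses_runtime_pm(struct vfio_device *device)
{
	struct device *dev = device->dev;

	return dev->driver && dev->driver->pm;
}

/* Returns 0 on success or when runtime PM is not in use, else -errno. */
static inline int vfio_device_pm_runtime_get(struct vfio_device *device)
{
	if (!vfio_device_uses_runtime_pm(device))
		return 0;

	return mock_pm_runtime_resume_and_get(device->dev);
}

/* No-op when runtime PM is not in use, so callers need no bookkeeping. */
static inline void vfio_device_pm_runtime_put(struct vfio_device *device)
{
	if (vfio_device_uses_runtime_pm(device))
		mock_pm_runtime_put(device->dev);
}
```

With both sides wrapped, the get keeps the 0/-errno convention of the
function it wraps and the caller no longer needs to remember whether a
reference was taken.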

> >> +{
> >> +#ifdef CONFIG_PM
> >> +	int ret;
> >> +
> >> +	if (!dev->driver || !dev->driver->pm)
> >> +		return 0;

I'm also wondering how we could ever get here with dev->driver == NULL.
If that were actually possible, the above would at best be racy.  It
also really seems like there ought to be a better test than the
driver->pm pointer to check if runtime pm is enabled, but I haven't
spotted it yet.

> >> +
> >> +	ret = pm_runtime_resume_and_get(dev);
> >> +	if (ret < 0)
> >> +		return ret;
> >> +
> >> +	return 1;
> >> +#else
> >> +	return 0;
> >> +#endif
> >> +}
> >> +
> >>  /*
> >>   * VFIO Device fd
> >>   */
> >> @@ -1845,15 +1870,28 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
> >>  				       unsigned int cmd, unsigned long arg)
> >>  {
> >>  	struct vfio_device *device = filep->private_data;
> >> +	int pm_ret, ret = 0;
> >> +
> >> +	pm_ret = vfio_device_pm_runtime_get(device->dev);
> >> +	if (pm_ret < 0)
> >> +		return pm_ret;  
> > 
> > I wonder if we might simply want to mask pm errors behind -EIO, maybe
> > with a rate limited dev_info().  My concern would be that we might mask
> > errnos that userspace has come to expect for certain ioctls.  Thanks,
> > 
> > Alex
> >   
> 
>   I need to do something like the following. Correct?
> 
>   ret = vfio_device_pm_runtime_get(device);
>   if (ret < 0) {
>      dev_info_ratelimited(device->dev, "vfio: runtime resume failed %d\n", ret);
>      return -EIO;
>   }

Yeah, though I'd welcome other thoughts here.  I don't necessarily like
the idea of squashing the errno, but at the same time, if
pm_runtime_resume_and_get() returns -EINVAL on user ioctl, that's not
really describing an invalid parameter relative to the ioctl itself.
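
For illustration only, here is a userspace sketch of that flow.  All of
the sketch_* names are hypothetical, fprintf() stands in for
dev_info_ratelimited() (so no rate limiting actually happens here), and
the fake ioctl handler exists only to show that its own errnos pass
through untouched while runtime PM failures collapse to -EIO.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical state: result of the next resume, and the usage count. */
static int sketch_resume_result;
static int sketch_usage_count;

/* Stand-in for the get wrapper: 0 on success, negative errno on failure. */
static inline int sketch_pm_get(void)
{
	if (sketch_resume_result < 0)
		return sketch_resume_result;
	sketch_usage_count++;
	return 0;
}

/* Stand-in for the put wrapper. */
static inline void sketch_pm_put(void)
{
	sketch_usage_count--;
}

/* Stand-in for the driver's ioctl handler: cmd 0 is "unknown". */
static inline long sketch_device_ioctl(unsigned int cmd)
{
	return cmd == 0 ? -ENOTTY : 0;
}

static inline long sketch_ioctl_with_pm(unsigned int cmd)
{
	long ret;
	int pm_ret;

	pm_ret = sketch_pm_get();
	if (pm_ret < 0) {
		/* Log the real errno, but report only -EIO to the caller. */
		fprintf(stderr, "vfio: runtime resume failed %d\n", pm_ret);
		return -EIO;
	}

	ret = sketch_device_ioctl(cmd);
	sketch_pm_put();	/* drop the reference on every exit path */

	return ret;
}
```

The original errno still reaches the log, so nothing is lost for
debugging, while the ioctl caller only ever sees -EIO for a resume
failure.
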
Thanks,

Alex