Re: [PATCH v5 1/5] vfio: Add the device features for the low power entry and exit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 26 Jul 2022 18:17:18 +0530
Abhishek Sahu <abhsahu@xxxxxxxxxx> wrote:

> On 7/26/2022 3:39 AM, Alex Williamson wrote:
> > On Mon, 25 Jul 2022 20:10:44 +0530
> > Abhishek Sahu <abhsahu@xxxxxxxxxx> wrote:
> >   
> >> On 7/22/2022 4:04 AM, Alex Williamson wrote:  
> >>> On Tue, 19 Jul 2022 17:45:19 +0530
> >>> Abhishek Sahu <abhsahu@xxxxxxxxxx> wrote:
> >>>     
> >>>> This patch adds the following new device features for the low
> >>>> power entry and exit in the header file. The implementation for the
> >>>> same will be added in the subsequent patches.
> >>>>
> >>>> - VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY
> >>>> - VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP
> >>>> - VFIO_DEVICE_FEATURE_LOW_POWER_EXIT
> >>>>
> >>>> With the standard registers, all power states cannot be achieved. The    
> >>>
> >>> We're talking about standard PCI PM registers here, let's make that
> >>> clear since we're adding a device agnostic interface here.
> >>>     
> >>>> platform-based power management needs to be involved to go into the
> >>>> lowest power state. For doing low power entry and exit with
> >>>> platform-based power management, these device features can be used.
> >>>>
> >>>> The entry device feature has two variants. These two variants are mainly
> >>>> to support the different behaviour for the low power entry.
> >>>> If there is any access for the VFIO device on the host side, then the
> >>>> device will be moved out of the low power state without the user's
> >>>> guest driver involvement. Some devices (for example NVIDIA VGA or
> >>>> 3D controller) require the user's guest driver involvement for
> >>>> each low-power entry. In the first variant, the host can move the
> >>>> device into low power without any guest driver involvement while    
> >>>
> >>> Perhaps, "In the first variant, the host can return the device to low
> >>> power automatically.  The device will continue to attempt to reach low
> >>> power until the low power exit feature is called."
> >>>     
> >>>> in the second variant, the host will send a notification to the user
> >>>> through eventfd and then the users guest driver needs to move
> >>>> the device into low power.    
> >>>
> >>> "In the second variant, if the device exits low power due to an access,
> >>> the host kernel will signal the user via the provided eventfd and will
> >>> not return the device to low power without a subsequent call to one of
> >>> the low power entry features.  A call to the low power exit feature is
> >>> optional if the user provided eventfd is signaled."
> >>>      
> >>>> These device features only support VFIO_DEVICE_FEATURE_SET operation.    
> >>>
> >>> And PROBE.
> >>>     
> >>
> >>  Thanks Alex.
> >>  I will make the above changes in the commit message.
> >>  
> >>>> Signed-off-by: Abhishek Sahu <abhsahu@xxxxxxxxxx>
> >>>> ---
> >>>>  include/uapi/linux/vfio.h | 55 +++++++++++++++++++++++++++++++++++++++
> >>>>  1 file changed, 55 insertions(+)
> >>>>
> >>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >>>> index 733a1cddde30..08fd3482d22b 100644
> >>>> --- a/include/uapi/linux/vfio.h
> >>>> +++ b/include/uapi/linux/vfio.h
> >>>> @@ -986,6 +986,61 @@ enum vfio_device_mig_state {
> >>>>  	VFIO_DEVICE_STATE_RUNNING_P2P = 5,
> >>>>  };
> >>>>  
> >>>> +/*
> >>>> + * Upon VFIO_DEVICE_FEATURE_SET, move the VFIO device into the low power state
> >>>> + * with the platform-based power management.  This low power state will be    
> >>>
> >>> This is really "allow the device to be moved into a low power state"
> >>> rather than actually "move the device into" such a state though, right?
> >>>     
> >>  
> >>  Yes. It will just allow the device to be moved into a low power state.
> >>  I have addressed all your suggestions in the uAPI description and
> >>  added the updated description in the last.
> >>
> >>  Can you please check that once and check if it looks okay.
> >>    
> >>>> + * internal to the VFIO driver and the user will not come to know which power
> >>>> + * state is chosen.  If any device access happens (either from the host or
> >>>> + * the guest) when the device is in the low power state, then the host will
> >>>> + * move the device out of the low power state first.  Once the access has been
> >>>> + * finished, then the host will move the device into the low power state again.
> >>>> + * If the user wants that the device should not go into the low power state
> >>>> + * again in this case, then the user should use the
> >>>> + * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP device feature for the    
> >>>
> >>> This should probably just read "For single shot low power support with
> >>> wake-up notification, see
> >>> VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP below."
> >>>     
> >>>> + * low power entry.  The mmap'ed region access is not allowed till the low power
> >>>> + * exit happens through VFIO_DEVICE_FEATURE_LOW_POWER_EXIT and will
> >>>> + * generate the access fault.
> >>>> + */
> >>>> +#define VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY 3    
> >>>
> >>> Note that Yishai's DMA logging set is competing for the same feature
> >>> entries.  We'll need to coordinate.
> >>>     
> >>
> >>  Since this is last week of rc, so will it possible to consider the updated
> >>  patches for next kernel (I can try to send them after complete testing the
> >>  by the end of this week). Otherwise, I can wait for next kernel and then
> >>  I can rebase my patches.  
> > 
> > I think we're close, though I would like to solicit uAPI feedback from
> > others on the list.  We can certainly sort out feature numbers
> > conflicts at commit time if Yishai's series becomes ready as well.  I'm
> > on PTO for the rest of the week starting tomorrow afternoon, but I'd
> > suggest trying to get this re-posted before the end of the week, asking
> > for more reviews for the uAPI, and we can evaluate early next week if
> > this is ready.
> >   
> 
>  Sure. I will re-post the updated patch series after the complete testing.
>  
> >>>> +
> >>>> +/*
> >>>> + * Upon VFIO_DEVICE_FEATURE_SET, move the VFIO device into the low power state
> >>>> + * with the platform-based power management and provide support for the wake-up
> >>>> + * notifications through eventfd.  This low power state will be internal to the
> >>>> + * VFIO driver and the user will not come to know which power state is chosen.
> >>>> + * If any device access happens (either from the host or the guest) when the
> >>>> + * device is in the low power state, then the host will move the device out of
> >>>> + * the low power state first and a notification will be sent to the guest
> >>>> + * through eventfd.  Once the access is finished, the host will not move back
> >>>> + * the device into the low power state.  The guest should move the device into
> >>>> + * the low power state again upon receiving the wakeup notification.  The
> >>>> + * notification will be generated only if the device physically went into the
> >>>> + * low power state.  If the low power entry has been disabled from the host
> >>>> + * side, then the device will not go into the low power state even after
> >>>> + * calling this device feature and then the device access does not require
> >>>> + * wake-up.  The mmap'ed region access is not allowed till the low power exit
> >>>> + * happens.  The low power exit can happen either through
> >>>> + * VFIO_DEVICE_FEATURE_LOW_POWER_EXIT or through any other access (where the
> >>>> + * wake-up notification has been generated).    
> >>>
> >>> Seems this could leverage a lot more from the previous, simply stating
> >>> that this has the same behavior as VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY
> >>> with the exception that the user provides an eventfd for notification
> >>> when the device resumes from low power and will not try to re-enter a
> >>> low power state without a subsequent user call to one of the low power
> >>> entry feature ioctls.  It might also be worth covering the fact that
> >>> device accesses by the user, including region and ioctl access, will
> >>> also trigger the eventfd if that access triggers a resume.
> >>>
> >>> As I'm thinking about this, that latter clause is somewhat subtle.
> >>> AIUI a user can call the low power entry with wakeup feature and
> >>> proceed to do various ioctl and region (not mmap) accesses that could
> >>> perpetually keep the device awake, or there may be dependent devices
> >>> such that the device may never go to low power.  It needs to be very
> >>> clear that only if the wakeup eventfd has the device entered into and
> >>> exited a low power state making the low power exit ioctl optional.
> >>>     
> >>
> >>  Yes. In my updated description, I have added more details.
> >>  Can you please check if that helps.
> >>  
> >>>> + */
> >>>> +struct vfio_device_low_power_entry_with_wakeup {
> >>>> +	__s32 wakeup_eventfd;
> >>>> +	__u32 reserved;
> >>>> +};
> >>>> +
> >>>> +#define VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP 4
> >>>> +
> >>>> +/*
> >>>> + * Upon VFIO_DEVICE_FEATURE_SET, move the VFIO device out of the low power
> >>>> + * state.    
> >>>
> >>> Any ioctl effectively does that, the key here is that the low power
> >>> state may not be re-entered after this ioctl.
> >>>     
> >>>>  This device feature should be called only if the user has previously
> >>>> + * put the device into the low power state either with
> >>>> + * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY or
> >>>> + * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP device feature.  If the    
> >>>
> >>> Doesn't really seem worth mentioning, we need to protect against misuse
> >>> regardless.
> >>>     
> >>>> + * device is not in the low power state currently, this device feature will
> >>>> + * return early with the success status.    
> >>>
> >>> This is an implementation detail, it doesn't need to be part of the
> >>> uAPI.  Thanks,
> >>>
> >>> Alex
> >>>     
> >>>> + */
> >>>> +#define VFIO_DEVICE_FEATURE_LOW_POWER_EXIT 5
> >>>> +
> >>>>  /* -------- API for Type1 VFIO IOMMU -------- */
> >>>>  
> >>>>  /**    
> >>>     
> >>
> >>  /*
> >>   * Upon VFIO_DEVICE_FEATURE_SET, allow the device to be moved into a low power
> >>   * state with the platform-based power management.  This low power state will
> >>   * be internal to the VFIO driver and the user will not come to know which
> >>   * power state is chosen.  If any device access happens (either from the host  
> > 
> > Couldn't the user look in sysfs to determine the power state?  There
> > might also be external hardware monitors of the power state, so this
> > statement is a bit overreaching.  Maybe something along the lines of...
> > 
> > "Device use of lower power states depend on factors managed by the
> > runtime power management core, including system level support and
> > coordinating support among dependent devices.  Enabling device low
> > power entry does not guarantee lower power usage by the device, nor is
> > a mechanism provided through this feature to know the current power
> > state of the device."
> >   
> >>   * or the guest) when the device is in the low power state, then the host will  
> > 
> > Let's not confine ourselves to a VM use case, "...from the host or
> > through the vfio uAPI".
> >   
> >>   * move the device out of the low power state first.  Once the access has been  
> > 
> > "move the device out of the low power state as necessary prior to the
> > access."
> >   
> >>   * finished, then the host will move the device into the low power state  
> > 
> > "Once the access is completed, the device may re-enter the low power
> > state."
> >   
> >>   * again.  For single shot low power support with wake-up notification, see
> >>   * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP below.  The mmap'ed region
> >>   * access is not allowed till the low power exit happens through
> >>   * VFIO_DEVICE_FEATURE_LOW_POWER_EXIT and will generate the access fault.  
> > 
> > "Access to mmap'd device regions is disabled on LOW_POWER_ENTRY and
> > may only be resumed after calling LOW_POWER_EXIT."
> >   
> >>   */
> >>  #define VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY 3
> >>
> >>  /*
> >>   * This device feature has the same behavior as
> >>   * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY with the exception that the user
> >>   * provides an eventfd for wake-up notification.  When the device moves out of
> >>   * the low power state for the wake-up, the host will not try to re-enter a  
> > 
> > "...will not allow the device to re-enter a low power state..."
> >   
> >>   * low power state without a subsequent user call to one of the low power
> >>   * entry device feature IOCTLs.  The low power exit can happen either through
> >>   * VFIO_DEVICE_FEATURE_LOW_POWER_EXIT or through any other access (where the
> >>   * wake-up notification has been generated).
> >>   *
> >>   * The notification through eventfd will be generated only if the device has
> >>   * gone into the low power state after calling this device feature IOCTL.  
> > 
> > "The notification through the provided eventfd will be generated only
> > when the device has entered and is resumed from a low power state after
> > calling this device feature IOCTL."
> > 
> >   
> >>   * There are various cases where the device will not go into the low power
> >>   * state after calling this device feature IOCTL (for example, the low power
> >>   * entry has been disabled from the host side, the user keeps the device busy
> >>   * after calling this device feature IOCTL, there are dependent devices which
> >>   * block the device low power entry, etc.) and in such cases, the device access
> >>   * does not require wake-up.  Also, the low power exit through  
> > 
> > "A device that has not entered low power state, as managed through the
> > runtime power management core, will not generate a notification through
> > the provided eventfd on access."
> >   
> >>   * VFIO_DEVICE_FEATURE_LOW_POWER_EXIT is mandatory for the cases where the
> >>   * wake-up notification has not been generated.  
> > 
> > "Calling the LOW_POWER_EXIT feature is optional in the case where
> > notification has been signaled on the provided eventfd that a resume
> > from low power has occurred."
> > 
> > We should also reiterate the statement above about mmap access because
> > it would be reasonable for a user to assume that if any access wakes
> > the device, that should include mmap faults and therefore a resume
> > notification should occur for such an event.  We could implement that
> > for the one-shot mode, but we're choosing not to.
> >   
> >>   */
> >>  struct vfio_device_low_power_entry_with_wakeup {
> >> 	 __s32 wakeup_eventfd;
> >> 	 __u32 reserved;
> >>  };
> >>
> >>  #define VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP 4
> >>
> >>  /*
> >>   * Upon VFIO_DEVICE_FEATURE_SET, prevent the VFIO device low power state entry
> >>   * which has been previously allowed with VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY
> >>   * or VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP device features.
> >>   */  
> > 
> > "Upon VFIO_DEVICE_FEATURE_SET, disallow use of device low power states
> > as previously enabled via VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY or
> > VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP device features.  This
> > device feature ioctl may itself generate a wakeup eventfd notification
> > in the latter case if the device has previously entered a low power
> > state."
> > 
> > Thanks,
> > Alex
> >   
> 
>  Thanks Alex for your thorough review of uAPI.
>  I have incorporated all the suggestions.
>  Following is the updated uAPI.
>  
>  /*
>   * Upon VFIO_DEVICE_FEATURE_SET, allow the device to be moved into a low power
>   * state with the platform-based power management.  Device use of lower power
>   * states depends on factors managed by the runtime power management core,
>   * including system level support and coordinating support among dependent
>   * devices.  Enabling device low power entry does not guarantee lower power
>   * usage by the device, nor is a mechanism provided through this feature to
>   * know the current power state of the device.  If any device access happens
>   * (either from the host or through the vfio uAPI) when the device is in the
>   * low power state, then the host will move the device out of the low power
>   * state as necessary prior to the access.  Once the access is completed, the
>   * device may re-enter the low power state.  For single shot low power support
>   * with wake-up notification, see
>   * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP below.  Access to mmap'd
>   * device regions is disabled on LOW_POWER_ENTRY and may only be resumed after
>   * calling LOW_POWER_EXIT.
>   */
>  #define VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY 3
>  
>  /*
>   * This device feature has the same behavior as
>   * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY with the exception that the user
>   * provides an eventfd for wake-up notification.  When the device moves out of
>   * the low power state for the wake-up, the host will not allow the device to
>   * re-enter a low power state without a subsequent user call to one of the low
>   * power entry device feature IOCTLs.  Access to mmap'd device regions is
>   * disabled on LOW_POWER_ENTRY_WITH_WAKEUP and may only be resumed after the
>   * low power exit.  The low power exit can happen either through LOW_POWER_EXIT
>   * or through any other access (where the wake-up notification has been
>   * generated).  The access to mmap'd device regions will not trigger low power
>   * exit.
>   *
>   * The notification through the provided eventfd will be generated only when
>   * the device has entered and is resumed from a low power state after
>   * calling this device feature IOCTL.  A device that has not entered low power
>   * state, as managed through the runtime power management core, will not
>   * generate a notification through the provided eventfd on access.  Calling the
>   * LOW_POWER_EXIT feature is optional in the case where notification has been
>   * signaled on the provided eventfd that a resume from low power has occurred.
>   */
>  struct vfio_device_low_power_entry_with_wakeup {
>  	__s32 wakeup_eventfd;
>  	__u32 reserved;
>  };
>  
>  #define VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP 4
>  
>  /*
>   * Upon VFIO_DEVICE_FEATURE_SET, disallow use of device low power states as
>   * previously enabled via VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY or
>   * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP device features.
>   * This device feature IOCTL may itself generate a wakeup eventfd notification
>   * in the latter case if the device has previously entered a low power state.
>   */
>  #define VFIO_DEVICE_FEATURE_LOW_POWER_EXIT 5

Looks ok to me.  Thanks,

Alex




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux