On Mon, 29 Aug 2022 17:18:45 +0530
Abhishek Sahu <abhsahu@xxxxxxxxxx> wrote:

> This is part 2 for the vfio-pci driver power management support.
> Part 1 of this patch series was related to adding D3cold support
> when there is no user of the VFIO device and has already merged in the
> mainline kernel. If we enable the runtime power management for
> vfio-pci device in the guest OS, then the device is being runtime
> suspended (for linux guest OS) and the PCI device will be put into
> D3hot state (in function vfio_pm_config_write()). If the D3cold
> state can be used instead of D3hot, then it will help in saving
> maximum power. The D3cold state can't be possible with native
> PCI PM. It requires interaction with platform firmware which is
> system-specific. To go into low power states (Including D3cold),
> the runtime PM framework can be used which internally interacts
> with PCI and platform firmware and puts the device into the
> lowest possible D-States.
>
> This patch series adds the support to engage runtime power management
> initiated by the user. Since D3cold state can't be achieved by writing
> PCI standard PM config registers, so new device features have been
> added in DEVICE_FEATURE IOCTL for low power entry and exit related
> handling. For the PCI device, this low power state will be D3cold
> (if the platform supports the D3cold state). The hypervisors can implement
> virtual ACPI methods to make the integration with guest OS.
> For example, in guest Linux OS if PCI device ACPI node has
> _PR3 and _PR0 power resources with _ON/_OFF method, then guest
> Linux OS makes the _OFF call during D3cold transition and
> then _ON during D0 transition. The hypervisor can tap these virtual
> ACPI calls and then do the low power related IOCTL.
>
> The entry device feature has two variants. These two variants are mainly
> to support the different behaviour for the low power entry.
> If there is any access for the VFIO device on the host side, then the
> device will be moved out of the low power state without the user's
> guest driver involvement. Some devices (for example NVIDIA VGA or
> 3D controller) require the user's guest driver involvement for
> each low-power entry. In the first variant, the host can move the
> device into low power without any guest driver involvement while
> in the second variant, the host will send a notification to user
> through eventfd and then user guest driver needs to move the device
> into low power. The hypervisor can implement the virtual PME
> support to notify the guest OS. Please refer
> https://lore.kernel.org/lkml/20220701110814.7310-7-abhsahu@xxxxxxxxxx/
> where initially this virtual PME was implemented in the vfio-pci driver
> itself, but later-on, it has been decided that hypervisor can implement
> this.
>
> * Changes in v7

Applied to vfio next branch for v6.1.  Thanks,

Alex