On 11/17/2021 11:23 PM, Alex Williamson wrote:
> On Mon, 15 Nov 2021 19:06:39 +0530
> <abhsahu@xxxxxxxxxx> wrote:
>
>> From: Abhishek Sahu <abhsahu@xxxxxxxxxx>
>>
>> If any PME event will be generated by PCI, then it will be mostly
>> handled in the host by the root port PME code. For example, in the case
>> of PCIe, the PME event will be sent to the root port and then the PME
>> interrupt will be generated. This will be handled in
>> drivers/pci/pcie/pme.c at the host side. Inside this, the
>> pci_check_pme_status() will be called where PME_Status and PME_En bits
>> will be cleared. So, the guest OS which is using vfio-pci device will
>> not come to know about this PME event.
>>
>> To handle these PME events inside guests, we need some framework so
>> that if any PME events will happen, then it needs to be forwarded to
>> virtual machine monitor. We can virtualize PME related registers bits
>> and initialize these bits to zero so vfio-pci device user will assume
>> that it is not capable of asserting the PME# signal from any power state.
>>
>> Signed-off-by: Abhishek Sahu <abhsahu@xxxxxxxxxx>
>> ---
>>  drivers/vfio/pci/vfio_pci_config.c | 32 +++++++++++++++++++++++++++++-
>>  1 file changed, 31 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
>> index 6e58b4bf7a60..fb3a503a5b99 100644
>> --- a/drivers/vfio/pci/vfio_pci_config.c
>> +++ b/drivers/vfio/pci/vfio_pci_config.c
>> @@ -738,12 +738,27 @@ static int __init init_pci_cap_pm_perm(struct perm_bits *perm)
>>  	 */
>>  	p_setb(perm, PCI_CAP_LIST_NEXT, (u8)ALL_VIRT, NO_WRITE);
>>
>> +	/*
>> +	 * The guests can't process PME events. If any PME event will be
>> +	 * generated, then it will be mostly handled in the host and the
>> +	 * host will clear the PME_STATUS. So virtualize PME_Support bits.
>> +	 * It will be initialized to zero later on.
>> +	 */
>> +	p_setw(perm, PCI_PM_PMC, PCI_PM_CAP_PME_MASK, NO_WRITE);
>> +
>>  	/*
>>  	 * Power management is defined *per function*, so we can let
>>  	 * the user change power state, but we trap and initiate the
>>  	 * change ourselves, so the state bits are read-only.
>> +	 *
>> +	 * The guest can't process PME from D3cold so virtualize PME_Status
>> +	 * and PME_En bits. It will be initialized to zero later on.
>>  	 */
>> -	p_setd(perm, PCI_PM_CTRL, NO_VIRT, ~PCI_PM_CTRL_STATE_MASK);
>> +	p_setd(perm, PCI_PM_CTRL,
>> +	       PCI_PM_CTRL_PME_ENABLE | PCI_PM_CTRL_PME_STATUS,
>> +	       ~(PCI_PM_CTRL_PME_ENABLE | PCI_PM_CTRL_PME_STATUS |
>> +		 PCI_PM_CTRL_STATE_MASK));
>> +
>>  	return 0;
>>  }
>>
>> @@ -1412,6 +1427,18 @@ static int vfio_ext_cap_len(struct vfio_pci_core_device *vdev, u16 ecap, u16 epo
>>  	return 0;
>>  }
>>
>> +static void vfio_update_pm_vconfig_bytes(struct vfio_pci_core_device *vdev,
>> +					 int offset)
>> +{
>> +	/* initialize virtualized PME_Support bits to zero */
>> +	*(__le16 *)&vdev->vconfig[offset + PCI_PM_PMC] &=
>> +		~cpu_to_le16(PCI_PM_CAP_PME_MASK);
>> +
>> +	/* initialize virtualized PME_Status and PME_En bits to zero */
>
>          ^ Extra space here and above.
>
>
>> +	*(__le16 *)&vdev->vconfig[offset + PCI_PM_CTRL] &=
>> +		~cpu_to_le16(PCI_PM_CTRL_PME_ENABLE | PCI_PM_CTRL_PME_STATUS);
>
> Perhaps more readable and consistent with elsewhere as:
>
> 	__le16 *pmc = (__le16 *)&vdev->vconfig[offset + PCI_PM_PMC];
> 	__le16 *ctrl = (__le16 *)&vdev->vconfig[offset + PCI_PM_CTRL];
>
> 	/* Clear vconfig PME_Support, PME_Status, and PME_En bits */
> 	*pmc &= ~cpu_to_le16(PCI_PM_CAP_PME_MASK);
> 	*ctrl &= ~cpu_to_le16(PCI_PM_CTRL_PME_ENABLE | PCI_PM_CTRL_PME_STATUS);
>
> Thanks,
> Alex
>

Thanks Alex. I will fix this.

Regards,
Abhishek

>> +}
>> +
>>  static int vfio_fill_vconfig_bytes(struct vfio_pci_core_device *vdev,
>>  				   int offset, int size)
>>  {
>> @@ -1535,6 +1562,9 @@ static int vfio_cap_init(struct vfio_pci_core_device *vdev)
>>  		if (ret)
>>  			return ret;
>>
>> +		if (cap == PCI_CAP_ID_PM)
>> +			vfio_update_pm_vconfig_bytes(vdev, pos);
>> +
>>  		prev = &vdev->vconfig[pos + PCI_CAP_LIST_NEXT];
>>  		pos = next;
>>  		caps++;
>
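
For anyone who wants to try the masking outside the kernel, here is a minimal userspace sketch of what the helper in the patch does to the virtual config bytes. It is not the kernel code: clear_pme_bits(), get_le16() and put_le16() are hypothetical names made up for this illustration, and the byte-wise accessors stand in for the __le16 casts and cpu_to_le16() so that the sketch does not depend on host endianness or alignment. The register offsets and masks are copied from include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets and masks as defined in include/uapi/linux/pci_regs.h */
#define PCI_PM_PMC              2       /* PM Capabilities register */
#define PCI_PM_CAP_PME_MASK     0xF800  /* PME_Support bits */
#define PCI_PM_CTRL             4       /* PM control and status register */
#define PCI_PM_CTRL_STATE_MASK  0x0003  /* current power state (D0..D3) */
#define PCI_PM_CTRL_PME_ENABLE  0x0100  /* PME_En */
#define PCI_PM_CTRL_PME_STATUS  0x8000  /* PME_Status */

/* Hypothetical helpers: config space is little-endian, read it byte-wise. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static void put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

/*
 * Hypothetical userspace stand-in for vfio_update_pm_vconfig_bytes():
 * clear PME_Support in PMC and PME_En/PME_Status in the control register
 * of the PM capability located at 'offset' in the virtual config space.
 */
static void clear_pme_bits(uint8_t *vconfig, size_t size, size_t offset)
{
	uint16_t pmc, ctrl;

	/* Both 16-bit registers must lie inside the buffer. */
	assert(offset + PCI_PM_CTRL + 2 <= size);

	pmc = get_le16(&vconfig[offset + PCI_PM_PMC]);
	ctrl = get_le16(&vconfig[offset + PCI_PM_CTRL]);

	pmc &= (uint16_t)~PCI_PM_CAP_PME_MASK;
	ctrl &= (uint16_t)~(PCI_PM_CTRL_PME_ENABLE | PCI_PM_CTRL_PME_STATUS);

	put_le16(&vconfig[offset + PCI_PM_PMC], pmc);
	put_le16(&vconfig[offset + PCI_PM_CTRL], ctrl);
}
```

Calling clear_pme_bits(cfg, sizeof(cfg), pos) on a copy of config space leaves the power state bits and the remaining PMC fields untouched and only zeroes the PME-related bits, which is the value a vfio-pci user would then read back from the virtualized registers.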