Re: [PATCH v7 2/2] PCI/PM: Disable PTM on all devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sorry for the delay here.

On Sat, Jun 11, 2022 at 2:12 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Wed, Jun 08, 2022 at 05:10:07PM -0700, Rajvi Jingar wrote:
> > On receiving a PTM Request from a downstream device, if PTM is disabled
> > on the root port, as per PCIe specification, such request would cause
> > an Unsupported Request error. So disable PTM for any downstream devices.
> > PTM state needs to be saved before disabling it to be restored later.
> >
> > Set ptm_enabled from 'struct pci_dev' to 0 in pci_ptm_disable() and
> > it is used in pci_save_state() before saving PTM state to avoid
> > double save.
> >
> > Fixes: a697f072f5da ("PCI: Disable PTM during suspend to save power")
> > Signed-off-by: Rajvi Jingar <rajvi.jingar@xxxxxxxxxxxxxxx>
> > Suggested-by: David E. Box <david.e.box@xxxxxxxxxxxxxxx>
> > ---
> >  v1 -> v2: add Fixes tag in commit message
> >  v2 -> v3: move changelog after "---" marker
> >  v3 -> v4: add "---" marker after changelog
> >  v4 -> v5: move pci_disable_ptm() out of the pci_dev->state_saved check.
> >          disable PTM for all devices, not just root ports.
> >  v5 -> v6: move pci_disable_ptm() to pci_pm_suspend()
> >          set pci_dev->ptm_enabled to 0 in pci_ptm_disable() and it is
> >          used in pci_save_state() before saving PTM state to avoid
> >          double save.
> >  v6 -> v7: add #ifdef CONFIG_PCIE_PTM in pci_save_state() before saving
> >          PTM state
> > ---
> >  drivers/pci/pci-driver.c | 21 ++++++++++++++++++++-
> >  drivers/pci/pci.c        | 28 +++++++++++++---------------
> >  drivers/pci/pcie/ptm.c   |  1 +
> >  3 files changed, 34 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 1f64de3e5280..db4d7835d7ae 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -803,14 +803,33 @@ static int pci_pm_suspend(struct device *dev)
> >               pci_dev_adjust_pme(pci_dev);
> >       }
> >
> > +     /*
> > +      * If a PTM Requester is put in a low-power state, a PTM Responder
> > +      * upstream from it may also be put in a low-power state. Putting a
> > +      * Port in D1, D2, or D3hot does not prohibit it from sending or
> > +      * responding to PTM Requests. We want to disable PTM on Responders
> > +      * when they are in a low-power state. Per 6.21.3, a PTM Requester
> > +      * must not be enabled when the upstream PTM Responder is disabled.
> > +      * Therefore, we must disable all PTM on all downstream PTM
> > +      * Requesters before disabling it on the PTM Responder, e.g., a Root
> > +      * Port.
> > +      *
> > +      * Also, to restore the PTM state, it needs to be saved before
> > +      * disabling it for all devices.
> > +      */
> > +     pci_save_ptm_state(pci_dev);
> > +     pci_disable_ptm(pci_dev);
>
> I think this is a little bit too magical.  The PTM disable doesn't
> really fit here in pci_pm_suspend().  It's more like the wakeup
> configuration done by pci_pm_suspend_noirq() in
> pci_prepare_to_sleep().
>
> IIUC, the reason it's here in pci_pm_suspend() is because of the weird
> nvme thing where nvme_suspend() puts the device in a device-specific
> low-power flavor of D0 and subsequent config accesses take it out of
> that low-power situation [1].
>
> I don't think this is a maintainable situation because there's nothing
> about this pci_disable_ptm() that says "this cannot be done after
> pm->suspend()".  That's a completely nvme-specific thing that we can't
> deduce from the code and are likely to break in the future.

Well, I'm not sure it is nvme-specific really.

Pretty much the same goes for any driver that wants to do their own
power management (whatever it is) in the ->suspend() callback and
indicate that by calling pci_save_state() by itself.

> We *do* have the rule that if the driver sets pdev->state_saved
> (normally by calling pci_save_state()), it means the driver is
> responsible for *all* the device state, even the standard config space
> that the PCI core would normally handle.
>
> When the driver does set pdev->state_saved, I don't think
> pci_pm_suspend_noirq() actually touches the device itself, and I'm a
> little more comfortable relying on that assumption.

It can be relied on right now which is also why the $subject patch
cannot put the PTM disabling in there and do it regardless of the
state_saved value.

> If this nvme weirdness plays a part here, I think the commit log and
> probably a comment really should mention what's going on because it's
> just feels fragile.

Totally agree on that one.

> [1] https://lore.kernel.org/r/CAJZ5v0iNaAd=yP3DgDVVpffKU6kt+nSpPeqxWJyRddaX5K4FRA@xxxxxxxxxxxxxx
>
> >       if (pm->suspend) {
> >               pci_power_t prev = pci_dev->current_state;
> >               int error;
> >
> >               error = pm->suspend(dev);
> >               suspend_report_result(dev, pm->suspend, error);
> > -             if (error)
> > +             if (error) {
> > +                     pci_restore_ptm_state(pci_dev);
> >                       return error;
> > +             }
> >
> >               if (!pci_dev->state_saved && pci_dev->current_state != PCI_D0
> >                   && pci_dev->current_state != PCI_UNKNOWN) {
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index cfaf40a540a8..3e9dcb1bbffa 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -1669,7 +1669,15 @@ int pci_save_state(struct pci_dev *dev)
> >       pci_save_ltr_state(dev);
> >       pci_save_dpc_state(dev);
> >       pci_save_aer_state(dev);
> > -     pci_save_ptm_state(dev);
> > +#ifdef CONFIG_PCIE_PTM
> > +     /*
> > +      * PCI PM core disables PTM during suspend and saves PTM state before
> > +      * that to be able to restore the ptm state restored later. So PCI core
> > +      * needs this check to avoid double save.
> > +      */
> > +     if (dev->ptm_enabled)
> > +             pci_save_ptm_state(dev);
> > +#endif
>
> This ptm_enabled check doesn't fit with the rest of the function and
> the semantics are fairly complicated.
>
> >       return pci_save_vc_state(dev);
> >  }
> >  EXPORT_SYMBOL(pci_save_state);
> > @@ -2710,24 +2718,12 @@ int pci_prepare_to_sleep(struct pci_dev *dev)
> >       if (target_state == PCI_POWER_ERROR)
> >               return -EIO;
> >
> > -     /*
> > -      * There are systems (for example, Intel mobile chips since Coffee
> > -      * Lake) where the power drawn while suspended can be significantly
> > -      * reduced by disabling PTM on PCIe root ports as this allows the
> > -      * port to enter a lower-power PM state and the SoC to reach a
> > -      * lower-power idle state as a whole.
> > -      */
> > -     if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
> > -             pci_disable_ptm(dev);
> > -
> >       pci_enable_wake(dev, target_state, wakeup);
> >
> >       error = pci_set_power_state(dev, target_state);
> >
> > -     if (error) {
> > +     if (error)
> >               pci_enable_wake(dev, target_state, false);
> > -             pci_restore_ptm_state(dev);
> > -     }
> >
> >       return error;
> >  }
> > @@ -2775,8 +2771,10 @@ int pci_finish_runtime_suspend(struct pci_dev *dev)
> >        * port to enter a lower-power PM state and the SoC to reach a
> >        * lower-power idle state as a whole.
> >        */
> > -     if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
> > +     if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
> > +             pci_save_ptm_state(dev);
> >               pci_disable_ptm(dev);
> > +     }
> >
> >       __pci_enable_wake(dev, target_state, pci_dev_run_wake(dev));
> >
> > diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
> > index 368a254e3124..746e29779c27 100644
> > --- a/drivers/pci/pcie/ptm.c
> > +++ b/drivers/pci/pcie/ptm.c
> > @@ -44,6 +44,7 @@ void pci_disable_ptm(struct pci_dev *dev)
> >       pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
> >       ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
> >       pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
> > +     dev->ptm_enabled = 0;
>
> This looks like a bug fix that could be in a separate patch.
>
> >  }
> >
> >  void pci_save_ptm_state(struct pci_dev *dev)
>
>
> I think something like the sketch below would fit better in the power
> management framework.  PTM disable is closely related to device power
> states, so I tried to put it as close as possible to the power state
> transitions.  I'm sure there are things missing and things I'm
> overlooking:

There are PCI devices that pci_prepare_to_sleep() is not called for,
so disabling PTM in there may not work in general.

> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index cfaf40a540a8..4dcd0c7381b9 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2705,28 +2705,21 @@ int pci_prepare_to_sleep(struct pci_dev *dev)
>  {
>         bool wakeup = device_may_wakeup(&dev->dev);
>         pci_power_t target_state = pci_target_state(dev, wakeup);
> +       bool ptm = pcie_ptm_enabled(dev);
>         int error;
>
>         if (target_state == PCI_POWER_ERROR)
>                 return -EIO;
>
> -       /*
> -        * There are systems (for example, Intel mobile chips since Coffee
> -        * Lake) where the power drawn while suspended can be significantly
> -        * reduced by disabling PTM on PCIe root ports as this allows the
> -        * port to enter a lower-power PM state and the SoC to reach a
> -        * lower-power idle state as a whole.
> -        */
> -       if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
> -               pci_disable_ptm(dev);
> -
> +       pci_disable_ptm(dev);
>         pci_enable_wake(dev, target_state, wakeup);
>
>         error = pci_set_power_state(dev, target_state);
>
>         if (error) {
>                 pci_enable_wake(dev, target_state, false);
> -               pci_restore_ptm_state(dev);
> +               if (ptm)
> +                       pci_enable_ptm(dev);
>         }
>
>         return error;
> @@ -2762,6 +2755,7 @@ EXPORT_SYMBOL(pci_back_from_sleep);
>  int pci_finish_runtime_suspend(struct pci_dev *dev)
>  {
>         pci_power_t target_state;
> +       bool ptm = pcie_ptm_enabled(dev);
>         int error;
>
>         target_state = pci_target_state(dev, device_can_wakeup(&dev->dev));
> @@ -2778,13 +2772,15 @@ int pci_finish_runtime_suspend(struct pci_dev *dev)
>         if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
>                 pci_disable_ptm(dev);
>
> +       pci_disable_ptm(dev);
>         __pci_enable_wake(dev, target_state, pci_dev_run_wake(dev));
>
>         error = pci_set_power_state(dev, target_state);
>
>         if (error) {
>                 pci_enable_wake(dev, target_state, false);
> -               pci_restore_ptm_state(dev);
> +               if (ptm)
> +                       pci_enable_ptm(dev);
>         }
>
>         return error;



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux