Re: [RFC][pci/pm] pci config space save restore issues during suspend/resume

Bjorn Helgaas <helgaas@xxxxxxxxxx> · Tue, 11 Feb 2020 07:50:43 -0600

On Tue, Feb 11, 2020 at 01:57:06PM +0800, Chen Yu wrote:
> Hi,
> We found two issues in the code during suspend:
> 
> 1. Andy Shevchenko found that, the save restore of pci config space
>     might cause potential issue. Current code uses
>     pci_read_config_dword() to read pci config header. However
>     hardware is not obliged to react correctly when trying to read
>     two/three 'adjacent' pci config registers with one dword read.
> 
>     Q1: Should we save/restore the pci config space header according
>     to the PCI spec strictly(pci_read_config_dword() for 32bit,
>     while pci_read_config_word() for 16bits, etc)?

I'm sure you know my first question will be for a spec reference for
this requirement that we read registers with the correct size :)  If
there is such a requirement, then of course we should follow it.

> 2. The pci config space of some problematic devices(or due to firmware
>     bug) might become inaccessible after resumed from S3(suspend to
>     mem) on VM.
> 
>     Q2: Should we do sanity check on pci config space before saving
>     them?  Say, invoke pci_dev_is_present() before suspend, if the
>     pci config space is not sane, bypass the config space saving
>     process, because there's no need to save invalid pci config
>     space.

I'm not in favor of a sanity check, at least not yet.  This sounds
like a problem that has not been debugged yet.  If the device is
broken in some way, maybe a quirk would be appropriate.  Otherwise,
maybe there's some Linux issue in the resume from S3 path that we
should fix.

Bjorn