On Tue, Jun 6, 2023 at 10:26 PM Limonciello, Mario <mario.limonciello@xxxxxxx> wrote: > > > On 6/6/2023 2:58 PM, Bjorn Helgaas wrote: > > On Tue, Jun 06, 2023 at 02:40:45PM -0500, Limonciello, Mario wrote: > >> On 6/6/2023 2:23 PM, Bjorn Helgaas wrote: > >>> On Tue, Jun 06, 2023 at 11:23:21AM -0500, Mario Limonciello wrote: > >>>> ASMedia PCIe GPIO controllers fail functional tests after returning from > >>>> suspend (S3 or s2idle). This is because the BIOS checks whether the > >>>> OSPM has called the `_REG` method to determine whether it can interact with > >>>> the OperationRegion assigned to the device. > >>>> > >>>> As described in 6.5.4 in the APCI spec, `_REG` is used to inform the AML > >>>> code on the availability of an operation region. > >>>> > >>>> To fix this issue, call acpi_evaluate_reg() when saving and restoring the > >>>> state of PCI devices. > >>>> > >>>> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html#reg-region > >>>> Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx> > >>>> --- > >>>> v1->v2: > >>>> * Handle case of no CONFIG_ACPI > >>>> * Rename function > >>>> * Update commit message > >>>> * Move ACPI calling code into pci-acpi.c instead > >>>> * Cite the ACPI spec > >>> Thanks for the spec reference (s/APCI/ACPI/ and add the revision if > >>> you rev this (r6.5 is the latest, AFAIK) if you rev this). > >>> > >>> I don't see text in that section that connects S3 with _REG. If it's > >>> there, you might have to quote the relevant sentence or two in the > >>> commit log. > >> I don't think there is anything the spec connecting this > >> with S3. At least from my perspective S3 is the reason > >> this was exposed but there is a deficiency that exists > >> that _REG is not being called by Linux. > >> > >> I intend to re-word the commit message something to the > >> effect of explaining what _REG does and why _REG should be > >> called, along with citations. > >> > >> Then in another paragraph "Fixing this resolves an issue ...". > >> > >>> You mentioned _REG being sort of a mutex to synchronize OSPM vs > >>> platform access; if there's spec language to that effect, let's cite > >>> it. > >> That sentence I included was cited from the spec. > > If it's necessary to justify the commit, include the citation in the > > commit log. > > > >>> Ideally we should have been able to read the PCI and ACPI specs and > >>> implement this without tripping over problem on this particular > >>> hardware. I'm looking for the text that enables that "clean-room" > >>> implementation. If the spec doesn't have that text, it's either a > >>> hole in the spec or a BIOS defect that depends on something the spec > >>> doesn't require. > >> IMO both the spec and BIOS are correct, it's a Linux > >> issue that _REG wasn't used. > > What tells Linux that _REG needs to be used here? If there's nothing > > that tells Linux to use _REG here, I claim it's a BIOS defect. I'm > > happy to be convinced otherwise; the way to convince me is to point to > > the spec. > From the spec it says "control methods must assume > all operation regions are inaccessible until the > _REG(RegionSpace, 1) method is executed" > > It also points out the opposite: "Conversely, > control methods must not access fields in > operation regions when _REG method execution > has not indicated that the operation region > handler is ready." > > The ACPI spec doesn't refer to D3 in this context, but > it does make an allusion to power off in an example case. > > "Also, when the host controller or bridge controller > is turned off or disabled, PCI Config Space Operation > Regions for child devices are no longer available. > As such, ETH0’s _REG method will be run when it is > turned off and will again be run when PCI1 is > turned off." > > > > > If it's a BIOS defect, it's fine to work around it, but we need to > > understand that, own up to it, and make the exact requirements very > > clear. Otherwise we're likely to break this in the future because > > future developers and maintainers will rely on the specs. > From my discussions with BIOS developers, this is entirely > intended behavior based on the _REG section in the spec. > >>> Doing this in pci_save_state() still seems wrong to me. For example, > >>> e1000_probe() calls pci_save_state(), but this is not part of suspend. > >>> IIUC, this patch will disconnect the opregion when we probe an e1000 > >>> NIC. Is that what you intend? > >> Thanks for pointing this one out. I was narrowly focused > >> on callers in PCI core. This was a caller I wasn't > >> aware of; I agree it doesn't make sense. > >> > >> I think pci_set_power_state() might be another good > >> candidate to use. What do you think of this? > > I can't suggest a call site because (1) I'm not a power management > > person, and (2) I don't think we have a clear statement of when it is > > required. This must be expressed in terms of PCI power state > > transitions, or at least something discoverable from a pci_dev, not > > "s2idle" or even "S3" because those are meaningless in the PCI > > context. > > > > Bjorn > Right; I'm with you on not putting it with a suspend > transition. > > The spec indicates that control methods can't access > the regions until _REG is called, so > my leaning is to keep the call at init time, and > then add another call for the D3 and D0 transitions > which is why I think pci_set_power_state() is probably > best. Except that it is not used in all transitions; see the callers oif pci_power_up() for example. Technically, the config space should become accessible right after acpi_pci_set_power_state() has transitioned the device into D0 and it should be accessible still right before acpi_pci_set_power_state() attempts to transition the device into a low-power state, so it looks like _REG could be evaluated from there.