On Thu, 2019-12-05 at 09:32 -0800, Todd Brandt wrote: > On Wed, 2019-12-04 at 22:04 +0800, Zhang Rui wrote: > > On Wed, 2019-12-04 at 02:54 +0100, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> > > > > > > Certain ACPI-enumerated devices represented as platform devices > > > in > > > Linux, like fans, require special low-level power management > > > handling > > > implemented by their drivers that is not in agreement with the > > > ACPI > > > PM domain behavior. That leads to problems with managing ACPI > > > fans > > > during system-wide suspend and resume. > > > > > > For this reason, make acpi_dev_pm_attach() skip the affected > > > devices > > > by adding a list of device IDs to avoid to it and putting the IDs > > > of > > > the affected devices into that list. > > > > > > Fixes: e5cc8ef31267 (ACPI / PM: Provide ACPI PM callback routines > > > for > > > subsystems) > > > Reported-by: Zhang Rui <rui.zhang@xxxxxxxxx> > > > Cc: 3.10+ <stable@xxxxxxxxxxxxxxx> # 3.10+ > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> > > > --- > > > > > > Rui, > > > > > > Please test this on the machine(s) affected by the fan > > > suspend/resume > > > issues. > > > > Sure, Todd and I will re-run stress test with this patch applied > > when > > 5.5-rc1 released. > > I've applied it 5.4.0 and will do a full stress test run this weekend > in the lab (where 7 out of 20 machines have this issue). The kernel > will be called "5.4.0-acpifanfix", and the data should be ready > Sunday > Dec 8. > > This is the issue I'll test for: > https://bugzilla.kernel.org/show_bug.cgi?id=204321 > The test data is in and I'm very happy to report that the patch works extremely well. Here's the data gathered on the bugzilla issue 204321: ACPI fan resume is too long. [With the patch] Kernel Host Test Run Count Rate 5.4.0-acpifanfix otcpl-whl-u-clear mem-x2351 1 0.04% 5.4.0-acpifanfix otcpl-z170x-ud5 mem-x2369 0 0% 5.4.0-acpifanfix otcpl-z170x-ud5 freeze-x2492 0 0% 5.4.0-acpifanfix otcpl-whl-u-clear freeze-x2411 0 0% 5.4.0-acpifanfix otcpl-whl-u mem-x2289 0 0% 5.4.0-acpifanfix otcpl-whl-u freeze-x2389 0 0% 5.4.0-acpifanfix otcpl-icl-u-2 mem-x2273 0 0% 5.4.0-acpifanfix otcpl-icl-u-2 freeze-x817 0 0% 5.4.0-acpifanfix otcpl-glk-rvp-1 freeze-x122 0 0% 5.4.0-acpifanfix otcpl-dell-p5510-xeon-1 mem-x1851 0 0% 5.4.0-acpifanfix otcpl-dell-p5510-xeon-1 freeze-x1994 0 0% 5.4.0-acpifanfix otcpl-dell-inspiron-3493 mem-x1722 0 0% 5.4.0-acpifanfix otcpl-dell-inspiron-3493 freeze-x2102 0 0% 5.4.0-acpifanfix otcpl-cfl-u-01 mem-x39 0 0% 5.4.0-acpifanfix otcpl-cfl-u-01 freeze-x2091 0 0% 5.4.0-acpifanfix otcpl-cfl-h mem-x415 0 0% 5.4.0-acpifanfix otcpl-cfl-h freeze-x2265 0 0% 5.4.0-acpifanfix otcpl-aml-y mem-x3126 0 0% 5.4.0-acpifanfix otcpl-aml-y freeze-x2288 0 0% [Without the patch] Kernel Host Test Run Count Rate 5.3.0+ otcpl-icl-u-2 mem-x2571 2571 100.00% 5.3.0+ otcpl-whl-u mem-x2497 2497 100.00% 5.3.0+ otcpl-cfl-h mem-x2068 2068 100.00% 5.3.0+ otcpl-glk-rvp-1 mem-x75 74 98.67% 5.3.0+ otcpl-cfl-u-01 mem-x1074 1050 97.77% 5.3.0+ otcpl-glk-rvp-1 freeze-x45 12 26.67% 5.3.0+ otcpl-whl-u freeze-x2649 428 16.16% 5.3.0+ otcpl-aml-y freeze-x2434 373 15.32% 5.3.0+ otcpl-cfl-u-01 freeze-x2419 123 5.08% 5.3.0+ otcpl-latexo-ivb-cpt freeze-x1914 97 5.07% 5.3.0+ otcpl-whl-u-clear mem-x2640 69 2.61% 5.3.0+ otcpl-icl-u-2 freeze-x2757 59 2.14% 5.3.0+ otcpl-whl-u-clear freeze-x2830 53 1.87% 5.3.0+ otcpl-tgl-rvp freeze-x2086 20 0.96% 5.3.0+ otcpl-cfl-h freeze-x2457 8 0.33% 5.3.0+ otcpl-latexo-ivb-cpt mem-x2000 4 0.20% 5.3.0+ otcpl-aml-y mem-x2727 2 0.07% 5.3.0+ otcpl-z170x-ud5 mem-x2669 0 0% 5.3.0+ otcpl-z170x-ud5 freeze-x2881 0 0% 5.3.0+ otcpl-lenovo-ideapad-130 mem-x2000 0 0% 5.3.0+ otcpl-lenovo-ideapad-130 freeze-x2000 0 0% 5.3.0+ otcpl-dell-p5510-xeon-2 mem-x570 0 0% 5.3.0+ otcpl-dell-p5510-xeon-2 freeze-x1093 0 0% 5.3.0+ otcpl-dell-p5510-xeon-1 mem-x2209 0 0% 5.3.0+ otcpl-dell-p5510-xeon-1 freeze-x2431 0 0% 5.3.0+ otcpl-dell-inspiron-3493 freeze-x1170 0 0% 5.3.0+ otcpl-chromebook-hsw freeze-x1305 0 0% 5.3.0+ otcpl-chromebook-hsw freeze-x368 0 0% > > > > thanks, > > rui > > > > > > > > I don't really see any cleaner way to address this problem, > > > because > > > the > > > ACPI PM domain should not be used with the devices in question > > > even > > > if > > > the driver that binds to them is not loaded. > > > > > > Cheers, > > > Rafael > > > > > > --- > > > drivers/acpi/device_pm.c | 12 +++++++++++- > > > 1 file changed, 11 insertions(+), 1 deletion(-) > > > > > > Index: linux-pm/drivers/acpi/device_pm.c > > > ================================================================= > > > == > > > --- linux-pm.orig/drivers/acpi/device_pm.c > > > +++ linux-pm/drivers/acpi/device_pm.c > > > @@ -1314,9 +1314,19 @@ static void acpi_dev_pm_detach(struct de > > > */ > > > int acpi_dev_pm_attach(struct device *dev, bool power_on) > > > { > > > + /* > > > + * Skip devices whose ACPI companions match the device IDs > > > below, > > > + * because they require special power management handling > > > incompatible > > > + * with the generic ACPI PM domain. > > > + */ > > > + static const struct acpi_device_id special_pm_ids[] = { > > > + {"PNP0C0B", }, /* Generic ACPI fan */ > > > + {"INT3404", }, /* Fan */ > > > + {} > > > + }; > > > struct acpi_device *adev = ACPI_COMPANION(dev); > > > > > > - if (!adev) > > > + if (!adev || !acpi_match_device_ids(adev, special_pm_ids)) > > > return 0; > > > > > > /* > > > > > > > > > > > > >