On 19.11.2021 14:51, Takashi Iwai wrote: > On Thu, 18 Nov 2021 23:13:50 +0100, > Heiner Kallweit wrote: >> >> On 18.11.2021 22:28, Takashi Iwai wrote: >>> On Thu, 18 Nov 2021 21:33:34 +0100, >>> Heiner Kallweit wrote: >>>> >>>> I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More >>>> comprehensive PM runtime setup for controller driver"): >>>> >>>> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable! >>>> >>>> Not sure how this patch was tested because the warning is obvious. >>>> The patch doesn't consider what the PCI sub-system does with regard to >>>> RPM. Have a look at pci_pm_init(). >>>> >>>> I'd understand to add the call to pm_runtime_dont_use_autosuspend(), >>>> but for all other added calls I see no justification. >>>> >>>> If being unsure about when to use which RPM call best involve >>>> linux-pm@xxxxxxxxxxxxxxx. >>> >>> Thanks for the notice. It's been through Intel CI and tests on a few >>> local machines, maybe we haven't checked carefully those errors but >>> only concentrated on the other issues, as it seems. >>> >>> There were two problems: one was the runtime PM being kicked off even >>> during the PCI driver remove call, and another was the proper runtime >>> PM setup after re-binding. >>> >> >> Having a look at the commit message of "ALSA: hda: fix general protection >> fault in azx_runtime_idle" the following sounds weird: >> >> - pci-driver.c:pm_runtime_put_sync() leads to a call >> to rpm_idle() which again calls azx_runtime_idle() >> >> rpm_idle() is only called if usage_count is 1 when entering >> pm_runtime_put_sync. And this should not be the case. >> pm_runtime_get_sync() increments the usage counter before remove() >> is called, and remove() should also increment the usage counter. >> This doesn't seem to happen. Maybe for whatever reason >> pm_runtime_get_noresume() isn't called in azx_free(), or azx_free() >> isn't called from remove(). >> I think you should trace the call chain from the PCI core calling >> remove() to pm_runtime_get_noresume() getting called or not. > > Neither of them, supposedly. Now I took a deeper look at the code > around it and dug into the git log, and found that the likely problem > was the recent PCI core code refactoring (removal of pci->driver, etc) > that have been already reverted; that was why linux-next-20211109 was > broken and linux-next-20211110 worked. With the leftover pci->driver, > the stale runtime PM callback was called at the pm_runtime_put_sync() > call in pci_device_remove(). > I also noticed that partially I was on the wrong path. > In anyway, I'll drop the invalid calls of pm_runtime_enable() / > disable() & co. Maybe keeping pm_runtime_forbid() and > pm_runtime_dont_use_autosuspend() at remove still makes some sense as > a counter-part for the probe calls, though. > The call to pm_runtime_forbid() in pci_pm_init() is a heritage from early ACPI times when broken ACPI implementations had problems with RPM. There's a discussion (w/o result yet) to enable RPM per default for newer ACPI versions. Calling pm_runtime_forbid() in the driver removal path isn't strictly needed but it doesn't hurt. > > thanks, > > Takashi >