Hi, On 8/9/24 8:02 PM, Armin Wolf wrote: > Am 09.08.24 um 15:17 schrieb Rafael J. Wysocki: > >> On Thu, Aug 8, 2024 at 7:22 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote: >>> Hi, >>> >>> On Mon, Aug 5, 2024 at 2:47 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote: >>>> Hi, >>>> >>>> On 8/5/24 1:28 PM, Hans de Goede wrote: >>>>> Hi, >>>>> >>>>> On 8/1/24 4:28 PM, Hans de Goede wrote: >>>>>> Hi, >>>>>> >>>>>> On 7/29/24 1:15 PM, Hans de Goede wrote: >>>>>>> Hi Rafael, >>>>>>> >>>>>>> There are 2 bug reports: >>>>>>> >>>>>>> 1. Brightness up/down key-presses no longer working on LG laptop (acpi-video related): >>>>>>> https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/V2KWAGZIAX4TOWPCH6A6FSIT66PR3KMZ/ >>>>>> I have filed: >>>>>> >>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2302253 >>>>>> >>>>>> to track this now and an acpidump of the troublesome LG laptop >>>>>> is attached there. I have also requested dmesg output of >>>>>> a non working kernel to be attached there. >>>>>> >>>>>> As a reminder this is the bug where it has been confirmed that >>>>>> reverting "EC: Install address space handler at the namespace root" >>>>>> helps, with the caveat that there is a Thunderbolt related IRQ >>>>>> storm on the ACPI event IRQ after the revert ... >>>>> Ok, so the bugzilla now has 2 different dmesg outputs: >>>>> >>>>> 1. 6.9.6, this kernel works without problems >>>>> >>>>> 2. 6.9.12 with the following patch you suggested on top: >>>>> >>>>> --- a/drivers/acpi/ec.c >>>>> +++ b/drivers/acpi/ec.c >>>>> @@ -1788,7 +1788,7 @@ void __init acpi_ec_dsdt_probe(void) >>>>> * At this point, the GPE is not fully initialized, so do not to >>>>> * handle the events. >>>>> */ >>>>> - ret = acpi_ec_setup(ec, NULL, true); >>>>> + ret = acpi_ec_setup(ec, NULL, false); >>>>> if (ret) { >>>>> acpi_ec_free(ec); >>>>> return; >>>>> >>>>> Unfortunately this does not help. dmesg shows some EC _REG errors, which >>>>> are now (with the above diff applied) shown just before the >>>>> "Boot DSDT EC initialization complete" message, which shows that _REG now >>>>> runs from acpi_ec_add() rather then before: >>>>> >>>>> [ 1.007566] ACPI BIOS Error (bug): Could not resolve symbol [\_TZ.FN00._OFF], AE_NOT_FOUND (20230628/psargs-330) >>>>> [ 1.007576] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC.EREG due to previous error (AE_NOT_FOUND) (20230628/psparse-52 >>>>> [ 1.007580] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC._REG due to previous error (AE_NOT_FOUND) (20230628/psparse-52 >>>>> [ 1.007639] ACPI: EC: interrupt unblocked >>>>> [ 1.007640] ACPI: EC: event unblocked >>>>> [ 1.007675] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62 >>>>> [ 1.007676] ACPI: EC: GPE=0x6e >>>>> [ 1.007677] ACPI: \_SB_.PC00.LPCB.LGEC: Boot DSDT EC initialization complete >>>>> [ 1.007679] ACPI: \_SB_.PC00.LPCB.LGEC: EC: Used to handle transactions and events >>>>> >>>>> Note that the errors are from calling _REG on \_SB.PC00.LPCB.H_EC, where as the actual >>>>> EC (and the only acpi_device on which _REG would get called for the EC Opregion before) is: >>>>> \_SB_.PC00.LPCB.LGEC. >>>>> >>>>> Looking at the DSDT it seems that the H_EC is not used and is leftover from a copy/paste >>>>> from some reference design DSDT. Its _REG however does write to the EC before hitting the error >>>>> and I think that that write may be causing the issue... >>>>> >>>>> The H_EC device does have an _STA method and looking closer the troublesome EREG method is >>>>> also called from _INI. So I guess that _STA is returning 0 causing _INI to not run and >>>>> that is the reason why we are not seeing the same EREG errors with kernel 6.9.6 where _REG is >>>>> only called for the EC opregion on \_SB_.PC00.LPCB.LGEC and not for the entire ACPI device >>>>> hierarchy as is done with >= 6.9.7 . >>>>> >>>>> Maybe we should only call _REG for the EC opregion on present devices (and devices without >>>>> a _STA)? >>>>> >>>>> Also note that both LGEC and H_EC use the same cmd + data ports. >>>>> >>>>> I'll go and ask the reporter to retrieve the status of both LGEC and H_EC and then see >>>>> from there. >>>> The reporter has confirmed that of the 2 EC devices ( H_EC / LGEC ) only LGEC returns >>>> a _STA of non 0: >>>> >>>>> Here it is, with kernel 6.9.6: >>>>> >>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/path >>>>> \_SB_.PC00.LPCB.H_EC >>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/status >>>>> 0 >>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/path >>>>> \_SB_.PC00.LPCB.LGEC >>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/status >>>>> 15 >>>> And taking a second look at the other bug: >>>> https://bugzilla.redhat.com/show_bug.cgi?id=2298938 >>>> >>>> That one also has 2 EC ACPI devices and the errors come from calling _REG on the one >>>> which is not picked as the boot_ec : >>>> >>>> jul 19 17:33:41 kernel: ACPI: EC: EC started >>>> jul 19 17:33:41 kernel: ACPI: EC: interrupt blocked >>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC.ECMD due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) >>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FNCL due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) >>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FN00._OFF due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) >>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC._REG due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) >>>> jul 19 17:33:41 kernel: ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62 >>>> jul 19 17:33:41 kernel: ACPI: \_SB_.PCI0.LPCB.EC0_: Boot DSDT EC used to handle transactions >>>> >>>> Note the H_EC vs EC0_ in the errors vs the "Boot DSDT EC used to >>>> handle transactions" message. >>>> >>>> So the issue in both cases seems to be calling _REG on an unused EC acpi_device. >>>> Not sure how to best fix this though ... >>> I have created an experimental acpi-ec-fixes branch: >>> >>> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/log/?h=acpi-ec-fixes >>> >>> for this which illustrates my idea (untested so far). >> This has just been updated to actually register the EC _REG for all >> devices (which would have been skipped for the majority of them due to >> the previous location of the relevant function call). > > Hi, > > i will ask the reporter https://bugzilla.kernel.org/show_bug.cgi?id=219075 to try out your patches. > His LG Gram seems to be affected by this problem (it has two overlapping EC devices ...). Ah another reporter with an LG Gram, this Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=2302253 is also about a LG Gram. And shows the same: [ 0.239044] ACPI: EC: EC started [ 0.239045] ACPI: EC: interrupt blocked [ 0.239787] ACPI BIOS Error (bug): Could not resolve symbol [\_TZ.FN00._OFF], AE_NOT_FOUND (20240322/psargs-330) [ 0.239794] fbcon: Taking over console [ 0.239800] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC.EREG due to previous error (AE_NOT_FOUND) (20240322/psparse- [ 0.239805] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC._REG due to previous error (AE_NOT_FOUND) (20240322/psparse- [ 0.239820] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62 [ 0.239822] ACPI: \_SB_.PC00.LPCB.LGEC: Boot DSDT EC used to handle transactions errors in their dmesg with regressed kernels. So this seems to be the same issue. Regards, Hans > > Thanks, > Armin Wolf > >>> The underlying observation is that _REG only needs to be evaluated for >>> EC operation regions located in the scopes of ACPI device objects >>> representing valid devices, so it is better to do it for each of these >>> objects individually in acpi_bus_attach(). >>> >>> For the EC itself, it is better to do what was done before the >>> $subject commit, so evaluate _REG for the EC operation regions in the >>> EC scope (including the "orphan" _REG). >>> >>> Accordingly, commit 0e6b6dedf168 ("Revert "ACPI: EC: Evaluate orphan >>> _REG under EC device") is reverted, acpi_execute_reg_methods() is >>> modified to take an additional depth argument and it is called for >>> each device object representing a valid device with that argument >>> equal to 1. >