On Mon, Aug 12, 2024 at 1:28 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote: > > Hi, > > On 8/8/24 7:22 PM, Rafael J. Wysocki wrote: > > Hi, > > > > On Mon, Aug 5, 2024 at 2:47 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote: > >> > >> Hi, > >> > >> On 8/5/24 1:28 PM, Hans de Goede wrote: > >>> Hi, > >>> > >>> On 8/1/24 4:28 PM, Hans de Goede wrote: > >>>> Hi, > >>>> > >>>> On 7/29/24 1:15 PM, Hans de Goede wrote: > >>>>> Hi Rafael, > >>>>> > >>>>> There are 2 bug reports: > >>>>> > >>>>> 1. Brightness up/down key-presses no longer working on LG laptop (acpi-video related): > >>>>> https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/V2KWAGZIAX4TOWPCH6A6FSIT66PR3KMZ/ > >>>> > >>>> I have filed: > >>>> > >>>> https://bugzilla.redhat.com/show_bug.cgi?id=2302253 > >>>> > >>>> to track this now and an acpidump of the troublesome LG laptop > >>>> is attached there. I have also requested dmesg output of > >>>> a non working kernel to be attached there. > >>>> > >>>> As a reminder this is the bug where it has been confirmed that > >>>> reverting "EC: Install address space handler at the namespace root" > >>>> helps, with the caveat that there is a Thunderbolt related IRQ > >>>> storm on the ACPI event IRQ after the revert ... > >>> > >>> Ok, so the bugzilla now has 2 different dmesg outputs: > >>> > >>> 1. 6.9.6, this kernel works without problems > >>> > >>> 2. 6.9.12 with the following patch you suggested on top: > >>> > >>> --- a/drivers/acpi/ec.c > >>> +++ b/drivers/acpi/ec.c > >>> @@ -1788,7 +1788,7 @@ void __init acpi_ec_dsdt_probe(void) > >>> * At this point, the GPE is not fully initialized, so do not to > >>> * handle the events. > >>> */ > >>> - ret = acpi_ec_setup(ec, NULL, true); > >>> + ret = acpi_ec_setup(ec, NULL, false); > >>> if (ret) { > >>> acpi_ec_free(ec); > >>> return; > >>> > >>> Unfortunately this does not help. dmesg shows some EC _REG errors, which > >>> are now (with the above diff applied) shown just before the > >>> "Boot DSDT EC initialization complete" message, which shows that _REG now > >>> runs from acpi_ec_add() rather then before: > >>> > >>> [ 1.007566] ACPI BIOS Error (bug): Could not resolve symbol [\_TZ.FN00._OFF], AE_NOT_FOUND (20230628/psargs-330) > >>> [ 1.007576] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC.EREG due to previous error (AE_NOT_FOUND) (20230628/psparse-52 > >>> [ 1.007580] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC._REG due to previous error (AE_NOT_FOUND) (20230628/psparse-52 > >>> [ 1.007639] ACPI: EC: interrupt unblocked > >>> [ 1.007640] ACPI: EC: event unblocked > >>> [ 1.007675] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62 > >>> [ 1.007676] ACPI: EC: GPE=0x6e > >>> [ 1.007677] ACPI: \_SB_.PC00.LPCB.LGEC: Boot DSDT EC initialization complete > >>> [ 1.007679] ACPI: \_SB_.PC00.LPCB.LGEC: EC: Used to handle transactions and events > >>> > >>> Note that the errors are from calling _REG on \_SB.PC00.LPCB.H_EC, where as the actual > >>> EC (and the only acpi_device on which _REG would get called for the EC Opregion before) is: > >>> \_SB_.PC00.LPCB.LGEC. > >>> > >>> Looking at the DSDT it seems that the H_EC is not used and is leftover from a copy/paste > >>> from some reference design DSDT. Its _REG however does write to the EC before hitting the error > >>> and I think that that write may be causing the issue... > >>> > >>> The H_EC device does have an _STA method and looking closer the troublesome EREG method is > >>> also called from _INI. So I guess that _STA is returning 0 causing _INI to not run and > >>> that is the reason why we are not seeing the same EREG errors with kernel 6.9.6 where _REG is > >>> only called for the EC opregion on \_SB_.PC00.LPCB.LGEC and not for the entire ACPI device > >>> hierarchy as is done with >= 6.9.7 . > >>> > >>> Maybe we should only call _REG for the EC opregion on present devices (and devices without > >>> a _STA)? > >>> > >>> Also note that both LGEC and H_EC use the same cmd + data ports. > >>> > >>> I'll go and ask the reporter to retrieve the status of both LGEC and H_EC and then see > >>> from there. > >> > >> The reporter has confirmed that of the 2 EC devices ( H_EC / LGEC ) only LGEC returns > >> a _STA of non 0: > >> > >>> Here it is, with kernel 6.9.6: > >>> > >>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/path > >>> \_SB_.PC00.LPCB.H_EC > >>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/status > >>> 0 > >>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/path > >>> \_SB_.PC00.LPCB.LGEC > >>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/status > >>> 15 > >> > >> And taking a second look at the other bug: > >> https://bugzilla.redhat.com/show_bug.cgi?id=2298938 > >> > >> That one also has 2 EC ACPI devices and the errors come from calling _REG on the one > >> which is not picked as the boot_ec : > >> > >> jul 19 17:33:41 kernel: ACPI: EC: EC started > >> jul 19 17:33:41 kernel: ACPI: EC: interrupt blocked > >> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC.ECMD due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) > >> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FNCL due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) > >> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FN00._OFF due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) > >> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC._REG due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) > >> jul 19 17:33:41 kernel: ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62 > >> jul 19 17:33:41 kernel: ACPI: \_SB_.PCI0.LPCB.EC0_: Boot DSDT EC used to handle transactions > >> > >> Note the H_EC vs EC0_ in the errors vs the "Boot DSDT EC used to > >> handle transactions" message. > >> > >> So the issue in both cases seems to be calling _REG on an unused EC acpi_device. > >> Not sure how to best fix this though ... > > > > I have created an experimental acpi-ec-fixes branch: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/log/?h=acpi-ec-fixes > > > > for this which illustrates my idea (untested so far). > > Thank you. I believe that the approach taken here is good and I also > like the code (of the current version) so you may add my: > > Reviewed-by: Hans de Goede <hdegoede@xxxxxxxxxx> > > to all 3 patches. > > I have started a test Fedora 40 6.10.4 kernel build with the patches from > the acpi-ec-fixes branch added: > > https://koji.fedoraproject.org/koji/taskinfo?taskID=121834209 > > and I have asked the reporters of both bugs: > > 2298938 - Multiple ACPI errors resulting in incorrect thermal readings and misleading CPU > 2302253 - ACPI: EC: LG gram laptop brightness keys stop working with kernel >= 6.9.7 > > to test this. I expect a good turn around time from the reporter > of bug 2302253. So far the reporter of 2298938 is not really > responsive (holidays?). Thank you! I'll send the patches to the list later today.