Re: "EC: Install address space handler at the namespace root" causing issues for some users

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 12, 2024 at 1:28 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>
> Hi,
>
> On 8/8/24 7:22 PM, Rafael J. Wysocki wrote:
> > Hi,
> >
> > On Mon, Aug 5, 2024 at 2:47 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> On 8/5/24 1:28 PM, Hans de Goede wrote:
> >>> Hi,
> >>>
> >>> On 8/1/24 4:28 PM, Hans de Goede wrote:
> >>>> Hi,
> >>>>
> >>>> On 7/29/24 1:15 PM, Hans de Goede wrote:
> >>>>> Hi Rafael,
> >>>>>
> >>>>> There are 2 bug reports:
> >>>>>
> >>>>> 1. Brightness up/down key-presses no longer working on LG laptop (acpi-video related):
> >>>>> https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/V2KWAGZIAX4TOWPCH6A6FSIT66PR3KMZ/
> >>>>
> >>>> I have filed:
> >>>>
> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=2302253
> >>>>
> >>>> to track this now and an acpidump of the troublesome LG laptop
> >>>> is attached there. I have also requested dmesg output of
> >>>> a non working kernel to be attached there.
> >>>>
> >>>> As a reminder this is the bug where it has been confirmed that
> >>>> reverting "EC: Install address space handler at the namespace root"
> >>>> helps, with the caveat that there is a Thunderbolt related IRQ
> >>>> storm on the ACPI event IRQ after the revert ...
> >>>
> >>> Ok, so the bugzilla now has 2 different dmesg outputs:
> >>>
> >>> 1. 6.9.6, this kernel works without problems
> >>>
> >>> 2. 6.9.12 with the following patch you suggested on top:
> >>>
> >>> --- a/drivers/acpi/ec.c
> >>> +++ b/drivers/acpi/ec.c
> >>> @@ -1788,7 +1788,7 @@ void __init acpi_ec_dsdt_probe(void)
> >>>        * At this point, the GPE is not fully initialized, so do not to
> >>>        * handle the events.
> >>>        */
> >>> -     ret = acpi_ec_setup(ec, NULL, true);
> >>> +     ret = acpi_ec_setup(ec, NULL, false);
> >>>       if (ret) {
> >>>               acpi_ec_free(ec);
> >>>               return;
> >>>
> >>> Unfortunately this does not help. dmesg shows some EC _REG errors, which
> >>> are now (with the above diff applied) shown just before the
> >>> "Boot DSDT EC initialization complete" message, which shows that _REG now
> >>> runs from acpi_ec_add() rather then before:
> >>>
> >>> [    1.007566] ACPI BIOS Error (bug): Could not resolve symbol [\_TZ.FN00._OFF], AE_NOT_FOUND (20230628/psargs-330)
> >>> [    1.007576] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC.EREG due to previous error (AE_NOT_FOUND) (20230628/psparse-52
> >>> [    1.007580] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC._REG due to previous error (AE_NOT_FOUND) (20230628/psparse-52
> >>> [    1.007639] ACPI: EC: interrupt unblocked
> >>> [    1.007640] ACPI: EC: event unblocked
> >>> [    1.007675] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62
> >>> [    1.007676] ACPI: EC: GPE=0x6e
> >>> [    1.007677] ACPI: \_SB_.PC00.LPCB.LGEC: Boot DSDT EC initialization complete
> >>> [    1.007679] ACPI: \_SB_.PC00.LPCB.LGEC: EC: Used to handle transactions and events
> >>>
> >>> Note that the errors are from calling _REG on \_SB.PC00.LPCB.H_EC, where as the actual
> >>> EC (and the only acpi_device on which _REG would get called for the EC Opregion before) is:
> >>> \_SB_.PC00.LPCB.LGEC.
> >>>
> >>> Looking at the DSDT it seems that the H_EC is not used and is leftover from a copy/paste
> >>> from some reference design DSDT. Its _REG however does write to the EC before hitting the error
> >>> and I think that that write may be causing the issue...
> >>>
> >>> The H_EC device does have an _STA method and looking closer the troublesome EREG method is
> >>> also called from _INI. So I guess that _STA is returning 0 causing _INI to not run and
> >>> that is the reason why we are not seeing the same EREG errors with kernel 6.9.6 where _REG is
> >>> only called for the EC opregion on \_SB_.PC00.LPCB.LGEC and not for the entire ACPI device
> >>> hierarchy as is done with >= 6.9.7 .
> >>>
> >>> Maybe we should only call _REG for the EC opregion on present devices (and devices without
> >>> a _STA)?
> >>>
> >>> Also note that both LGEC and H_EC use the same cmd + data ports.
> >>>
> >>> I'll go and ask the reporter to retrieve the status of both LGEC and H_EC and then see
> >>> from there.
> >>
> >> The reporter has confirmed that of the 2 EC devices ( H_EC / LGEC ) only LGEC returns
> >> a _STA of non 0:
> >>
> >>> Here it is, with kernel 6.9.6:
> >>>
> >>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/path
> >>> \_SB_.PC00.LPCB.H_EC
> >>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/status
> >>> 0
> >>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/path
> >>> \_SB_.PC00.LPCB.LGEC
> >>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/status
> >>> 15
> >>
> >> And taking a second look at the other bug:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=2298938
> >>
> >> That one also has 2 EC ACPI devices and the errors come from calling _REG on the one
> >> which is not picked as the boot_ec :
> >>
> >> jul 19 17:33:41 kernel: ACPI: EC: EC started
> >> jul 19 17:33:41 kernel: ACPI: EC: interrupt blocked
> >> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC.ECMD due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
> >> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FNCL due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
> >> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FN00._OFF due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
> >> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC._REG due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
> >> jul 19 17:33:41 kernel: ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62
> >> jul 19 17:33:41 kernel: ACPI: \_SB_.PCI0.LPCB.EC0_: Boot DSDT EC used to handle transactions
> >>
> >> Note the H_EC vs EC0_ in the errors vs the "Boot DSDT EC used to
> >> handle transactions" message.
> >>
> >> So the issue in both cases seems to be calling _REG on an unused EC acpi_device.
> >> Not sure how to best fix this though ...
> >
> > I have created an experimental acpi-ec-fixes branch:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/log/?h=acpi-ec-fixes
> >
> > for this which illustrates my idea (untested so far).
>
> Thank you. I believe that the approach taken here is good and I also
> like the code (of the current version) so you may add my:
>
> Reviewed-by: Hans de Goede <hdegoede@xxxxxxxxxx>
>
> to all 3 patches.
>
> I have started a test Fedora 40 6.10.4 kernel build with the patches from
> the acpi-ec-fixes branch added:
>
> https://koji.fedoraproject.org/koji/taskinfo?taskID=121834209
>
> and I have asked the reporters of both bugs:
>
>     2298938 - Multiple ACPI errors resulting in incorrect thermal readings and misleading CPU
>     2302253 - ACPI: EC: LG gram laptop brightness keys stop working with kernel >= 6.9.7
>
> to test this. I expect a good turn around time from the reporter
> of bug 2302253. So far the reporter of 2298938 is not really
> responsive (holidays?).

Thank you!

I'll send the patches to the list later today.





[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux