Re: "EC: Install address space handler at the namespace root" causing issues for some users

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 8/12/24 1:28 PM, Hans de Goede wrote:
> Hi,
> 
> On 8/8/24 7:22 PM, Rafael J. Wysocki wrote:
>> Hi,
>>
>> On Mon, Aug 5, 2024 at 2:47 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> On 8/5/24 1:28 PM, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 8/1/24 4:28 PM, Hans de Goede wrote:
>>>>> Hi,
>>>>>
>>>>> On 7/29/24 1:15 PM, Hans de Goede wrote:
>>>>>> Hi Rafael,
>>>>>>
>>>>>> There are 2 bug reports:
>>>>>>
>>>>>> 1. Brightness up/down key-presses no longer working on LG laptop (acpi-video related):
>>>>>> https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/V2KWAGZIAX4TOWPCH6A6FSIT66PR3KMZ/
>>>>>
>>>>> I have filed:
>>>>>
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2302253
>>>>>
>>>>> to track this now and an acpidump of the troublesome LG laptop
>>>>> is attached there. I have also requested dmesg output of
>>>>> a non working kernel to be attached there.
>>>>>
>>>>> As a reminder this is the bug where it has been confirmed that
>>>>> reverting "EC: Install address space handler at the namespace root"
>>>>> helps, with the caveat that there is a Thunderbolt related IRQ
>>>>> storm on the ACPI event IRQ after the revert ...
>>>>
>>>> Ok, so the bugzilla now has 2 different dmesg outputs:
>>>>
>>>> 1. 6.9.6, this kernel works without problems
>>>>
>>>> 2. 6.9.12 with the following patch you suggested on top:
>>>>
>>>> --- a/drivers/acpi/ec.c
>>>> +++ b/drivers/acpi/ec.c
>>>> @@ -1788,7 +1788,7 @@ void __init acpi_ec_dsdt_probe(void)
>>>>        * At this point, the GPE is not fully initialized, so do not to
>>>>        * handle the events.
>>>>        */
>>>> -     ret = acpi_ec_setup(ec, NULL, true);
>>>> +     ret = acpi_ec_setup(ec, NULL, false);
>>>>       if (ret) {
>>>>               acpi_ec_free(ec);
>>>>               return;
>>>>
>>>> Unfortunately this does not help. dmesg shows some EC _REG errors, which
>>>> are now (with the above diff applied) shown just before the
>>>> "Boot DSDT EC initialization complete" message, which shows that _REG now
>>>> runs from acpi_ec_add() rather then before:
>>>>
>>>> [    1.007566] ACPI BIOS Error (bug): Could not resolve symbol [\_TZ.FN00._OFF], AE_NOT_FOUND (20230628/psargs-330)
>>>> [    1.007576] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC.EREG due to previous error (AE_NOT_FOUND) (20230628/psparse-52
>>>> [    1.007580] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC._REG due to previous error (AE_NOT_FOUND) (20230628/psparse-52
>>>> [    1.007639] ACPI: EC: interrupt unblocked
>>>> [    1.007640] ACPI: EC: event unblocked
>>>> [    1.007675] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62
>>>> [    1.007676] ACPI: EC: GPE=0x6e
>>>> [    1.007677] ACPI: \_SB_.PC00.LPCB.LGEC: Boot DSDT EC initialization complete
>>>> [    1.007679] ACPI: \_SB_.PC00.LPCB.LGEC: EC: Used to handle transactions and events
>>>>
>>>> Note that the errors are from calling _REG on \_SB.PC00.LPCB.H_EC, where as the actual
>>>> EC (and the only acpi_device on which _REG would get called for the EC Opregion before) is:
>>>> \_SB_.PC00.LPCB.LGEC.
>>>>
>>>> Looking at the DSDT it seems that the H_EC is not used and is leftover from a copy/paste
>>>> from some reference design DSDT. Its _REG however does write to the EC before hitting the error
>>>> and I think that that write may be causing the issue...
>>>>
>>>> The H_EC device does have an _STA method and looking closer the troublesome EREG method is
>>>> also called from _INI. So I guess that _STA is returning 0 causing _INI to not run and
>>>> that is the reason why we are not seeing the same EREG errors with kernel 6.9.6 where _REG is
>>>> only called for the EC opregion on \_SB_.PC00.LPCB.LGEC and not for the entire ACPI device
>>>> hierarchy as is done with >= 6.9.7 .
>>>>
>>>> Maybe we should only call _REG for the EC opregion on present devices (and devices without
>>>> a _STA)?
>>>>
>>>> Also note that both LGEC and H_EC use the same cmd + data ports.
>>>>
>>>> I'll go and ask the reporter to retrieve the status of both LGEC and H_EC and then see
>>>> from there.
>>>
>>> The reporter has confirmed that of the 2 EC devices ( H_EC / LGEC ) only LGEC returns
>>> a _STA of non 0:
>>>
>>>> Here it is, with kernel 6.9.6:
>>>>
>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/path
>>>> \_SB_.PC00.LPCB.H_EC
>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/status
>>>> 0
>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/path
>>>> \_SB_.PC00.LPCB.LGEC
>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/status
>>>> 15
>>>
>>> And taking a second look at the other bug:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=2298938
>>>
>>> That one also has 2 EC ACPI devices and the errors come from calling _REG on the one
>>> which is not picked as the boot_ec :
>>>
>>> jul 19 17:33:41 kernel: ACPI: EC: EC started
>>> jul 19 17:33:41 kernel: ACPI: EC: interrupt blocked
>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC.ECMD due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FNCL due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FN00._OFF due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC._REG due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
>>> jul 19 17:33:41 kernel: ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62
>>> jul 19 17:33:41 kernel: ACPI: \_SB_.PCI0.LPCB.EC0_: Boot DSDT EC used to handle transactions
>>>
>>> Note the H_EC vs EC0_ in the errors vs the "Boot DSDT EC used to
>>> handle transactions" message.
>>>
>>> So the issue in both cases seems to be calling _REG on an unused EC acpi_device.
>>> Not sure how to best fix this though ...
>>
>> I have created an experimental acpi-ec-fixes branch:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/log/?h=acpi-ec-fixes
>>
>> for this which illustrates my idea (untested so far).
> 
> Thank you. I believe that the approach taken here is good and I also
> like the code (of the current version) so you may add my:
> 
> Reviewed-by: Hans de Goede <hdegoede@xxxxxxxxxx>
> 
> to all 3 patches.
> 
> I have started a test Fedora 40 6.10.4 kernel build with the patches from
> the acpi-ec-fixes branch added:
> 
> https://koji.fedoraproject.org/koji/taskinfo?taskID=121834209
> 
> and I have asked the reporters of both bugs:
> 
>     2298938 - Multiple ACPI errors resulting in incorrect thermal readings and misleading CPU 
>     2302253 - ACPI: EC: LG gram laptop brightness keys stop working with kernel >= 6.9.7
> 
> to test this. I expect a good turn around time from the reporter
> of bug 2302253. So far the reporter of 2298938 is not really
> responsive (holidays?).

The reporter of 2302253 has reported that the patches from:

https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/log/?h=acpi-ec-fixes

resolve the issue; and as mentioned before the reporter of 2298938
is not responsive atm. So I believe that with it confirmed that this
at least fixes the issues on the LG Gram laptops (1) these patches are
ready to be merged now.

Regards,

Hans



1) And based on dmesg with a regressed kernel likely / hopefully also
the issue from RH bugzilla 2298938.





[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux