Re: "EC: Install address space handler at the namespace root" causing issues for some users

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 8/9/24 8:02 PM, Armin Wolf wrote:
> Am 09.08.24 um 15:17 schrieb Rafael J. Wysocki:
> 
>> On Thu, Aug 8, 2024 at 7:22 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>>> Hi,
>>>
>>> On Mon, Aug 5, 2024 at 2:47 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> On 8/5/24 1:28 PM, Hans de Goede wrote:
>>>>> Hi,
>>>>>
>>>>> On 8/1/24 4:28 PM, Hans de Goede wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 7/29/24 1:15 PM, Hans de Goede wrote:
>>>>>>> Hi Rafael,
>>>>>>>
>>>>>>> There are 2 bug reports:
>>>>>>>
>>>>>>> 1. Brightness up/down key-presses no longer working on LG laptop (acpi-video related):
>>>>>>> https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/V2KWAGZIAX4TOWPCH6A6FSIT66PR3KMZ/
>>>>>> I have filed:
>>>>>>
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2302253
>>>>>>
>>>>>> to track this now and an acpidump of the troublesome LG laptop
>>>>>> is attached there. I have also requested dmesg output of
>>>>>> a non working kernel to be attached there.
>>>>>>
>>>>>> As a reminder this is the bug where it has been confirmed that
>>>>>> reverting "EC: Install address space handler at the namespace root"
>>>>>> helps, with the caveat that there is a Thunderbolt related IRQ
>>>>>> storm on the ACPI event IRQ after the revert ...
>>>>> Ok, so the bugzilla now has 2 different dmesg outputs:
>>>>>
>>>>> 1. 6.9.6, this kernel works without problems
>>>>>
>>>>> 2. 6.9.12 with the following patch you suggested on top:
>>>>>
>>>>> --- a/drivers/acpi/ec.c
>>>>> +++ b/drivers/acpi/ec.c
>>>>> @@ -1788,7 +1788,7 @@ void __init acpi_ec_dsdt_probe(void)
>>>>>         * At this point, the GPE is not fully initialized, so do not to
>>>>>         * handle the events.
>>>>>         */
>>>>> -     ret = acpi_ec_setup(ec, NULL, true);
>>>>> +     ret = acpi_ec_setup(ec, NULL, false);
>>>>>        if (ret) {
>>>>>                acpi_ec_free(ec);
>>>>>                return;
>>>>>
>>>>> Unfortunately this does not help. dmesg shows some EC _REG errors, which
>>>>> are now (with the above diff applied) shown just before the
>>>>> "Boot DSDT EC initialization complete" message, which shows that _REG now
>>>>> runs from acpi_ec_add() rather then before:
>>>>>
>>>>> [    1.007566] ACPI BIOS Error (bug): Could not resolve symbol [\_TZ.FN00._OFF], AE_NOT_FOUND (20230628/psargs-330)
>>>>> [    1.007576] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC.EREG due to previous error (AE_NOT_FOUND) (20230628/psparse-52
>>>>> [    1.007580] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC._REG due to previous error (AE_NOT_FOUND) (20230628/psparse-52
>>>>> [    1.007639] ACPI: EC: interrupt unblocked
>>>>> [    1.007640] ACPI: EC: event unblocked
>>>>> [    1.007675] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62
>>>>> [    1.007676] ACPI: EC: GPE=0x6e
>>>>> [    1.007677] ACPI: \_SB_.PC00.LPCB.LGEC: Boot DSDT EC initialization complete
>>>>> [    1.007679] ACPI: \_SB_.PC00.LPCB.LGEC: EC: Used to handle transactions and events
>>>>>
>>>>> Note that the errors are from calling _REG on \_SB.PC00.LPCB.H_EC, where as the actual
>>>>> EC (and the only acpi_device on which _REG would get called for the EC Opregion before) is:
>>>>> \_SB_.PC00.LPCB.LGEC.
>>>>>
>>>>> Looking at the DSDT it seems that the H_EC is not used and is leftover from a copy/paste
>>>>> from some reference design DSDT. Its _REG however does write to the EC before hitting the error
>>>>> and I think that that write may be causing the issue...
>>>>>
>>>>> The H_EC device does have an _STA method and looking closer the troublesome EREG method is
>>>>> also called from _INI. So I guess that _STA is returning 0 causing _INI to not run and
>>>>> that is the reason why we are not seeing the same EREG errors with kernel 6.9.6 where _REG is
>>>>> only called for the EC opregion on \_SB_.PC00.LPCB.LGEC and not for the entire ACPI device
>>>>> hierarchy as is done with >= 6.9.7 .
>>>>>
>>>>> Maybe we should only call _REG for the EC opregion on present devices (and devices without
>>>>> a _STA)?
>>>>>
>>>>> Also note that both LGEC and H_EC use the same cmd + data ports.
>>>>>
>>>>> I'll go and ask the reporter to retrieve the status of both LGEC and H_EC and then see
>>>>> from there.
>>>> The reporter has confirmed that of the 2 EC devices ( H_EC / LGEC ) only LGEC returns
>>>> a _STA of non 0:
>>>>
>>>>> Here it is, with kernel 6.9.6:
>>>>>
>>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/path
>>>>> \_SB_.PC00.LPCB.H_EC
>>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:00/status
>>>>> 0
>>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/path
>>>>> \_SB_.PC00.LPCB.LGEC
>>>>> $ cat /sys/bus/acpi/devices/PNP0C09\:01/status
>>>>> 15
>>>> And taking a second look at the other bug:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2298938
>>>>
>>>> That one also has 2 EC ACPI devices and the errors come from calling _REG on the one
>>>> which is not picked as the boot_ec :
>>>>
>>>> jul 19 17:33:41 kernel: ACPI: EC: EC started
>>>> jul 19 17:33:41 kernel: ACPI: EC: interrupt blocked
>>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC.ECMD due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
>>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FNCL due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
>>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_TZ.FN00._OFF due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
>>>> jul 19 17:33:41 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.H_EC._REG due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
>>>> jul 19 17:33:41 kernel: ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62
>>>> jul 19 17:33:41 kernel: ACPI: \_SB_.PCI0.LPCB.EC0_: Boot DSDT EC used to handle transactions
>>>>
>>>> Note the H_EC vs EC0_ in the errors vs the "Boot DSDT EC used to
>>>> handle transactions" message.
>>>>
>>>> So the issue in both cases seems to be calling _REG on an unused EC acpi_device.
>>>> Not sure how to best fix this though ...
>>> I have created an experimental acpi-ec-fixes branch:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/log/?h=acpi-ec-fixes
>>>
>>> for this which illustrates my idea (untested so far).
>> This has just been updated to actually register the EC _REG for all
>> devices (which would have been skipped for the majority of them due to
>> the previous location of the relevant function call).
> 
> Hi,
> 
> i will ask the reporter https://bugzilla.kernel.org/show_bug.cgi?id=219075 to try out your patches.
> His LG Gram seems to be affected by this problem (it has two overlapping EC devices ...).

Ah another reporter with an LG Gram, this Fedora bug:

https://bugzilla.redhat.com/show_bug.cgi?id=2302253

is also about a LG Gram. And shows the same:

[    0.239044] ACPI: EC: EC started
[    0.239045] ACPI: EC: interrupt blocked
[    0.239787] ACPI BIOS Error (bug): Could not resolve symbol [\_TZ.FN00._OFF], AE_NOT_FOUND (20240322/psargs-330)
[    0.239794] fbcon: Taking over console
[    0.239800] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC.EREG due to previous error (AE_NOT_FOUND) (20240322/psparse-
[    0.239805] ACPI Error: Aborting method \_SB.PC00.LPCB.H_EC._REG due to previous error (AE_NOT_FOUND) (20240322/psparse-
[    0.239820] ACPI: EC: EC_CMD/EC_SC=0x66, EC_DATA=0x62
[    0.239822] ACPI: \_SB_.PC00.LPCB.LGEC: Boot DSDT EC used to handle transactions

errors in their dmesg with regressed kernels.

So this seems to be the same issue.

Regards,

Hans




> 
> Thanks,
> Armin Wolf
> 
>>> The underlying observation is that _REG only needs to be evaluated for
>>> EC operation regions located in the scopes of ACPI device objects
>>> representing valid devices, so it is better to do it for each of these
>>> objects individually in acpi_bus_attach().
>>>
>>> For the EC itself, it is better to do what was done before the
>>> $subject commit, so evaluate _REG for the EC operation regions in the
>>> EC scope (including the "orphan" _REG).
>>>
>>> Accordingly, commit 0e6b6dedf168 ("Revert "ACPI: EC: Evaluate orphan
>>> _REG under EC device") is reverted, acpi_execute_reg_methods() is
>>> modified to take an additional depth argument and it is called for
>>> each device object representing a valid device with that argument
>>> equal to 1.
> 





[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux