Hi Rafael,

On 6/4/24 14:07, Rafael J. Wysocki wrote:
> On Mon, Jun 3, 2024 at 8:20 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> On Fri, May 31, 2024 at 1:35 PM Laura Nao <laura.nao@xxxxxxxxxxxxx> wrote:
>>>
>>> Hello,
>>>
>>> On 5/30/24 17:37, Laura Nao wrote:
>>>> Hello,
>>>>
>>>> We have identified a regression in the acpi-fan driver probe between
>>>> v6.9-rc7 and v6.10-rc1 on some Intel Chromebooks in the Collabora LAVA
>>>> lab.
>>>>
>>>> For the Acer Chromebook Spin 514 (CP514-2H), the following error is
>>>> reported in the logs:
>>>>
>>>> [ 0.651202] acpi-fan INTC1044:00: probe with driver acpi-fan failed with error -22
>>>>
>>>> Similar errors are reported on other devices with fans compatible with
>>>> the same driver.
>>>>
>>>> On Acer Chromebox CXI4, ASUS Chromebook Flip C436FA and
>>>> HP Chromebook x360 14 G1:
>>>>
>>>> [ 0.488001] acpi-fan INT3404:00: probe with driver acpi-fan failed with error -22
>>>>
>>>> On ASUS Chromebook Vero 514 CBV514-1H:
>>>>
>>>> [ 1.168905] acpi-fan INTC1048:00: probe with driver acpi-fan failed with error -22
>>>>
>>>> The issue is still present on next-20240529.
>>>>
>>>> I'm sending this report to track the regression while a fix is
>>>> identified. I'll investigate the issue/run a bisection and report back
>>>> with the results.
>>>>
>>>> This regression was discovered during some preliminary tests with the
>>>> ACPI probe kselftest [1] in KernelCI. The config used was the upstream
>>>> x86_64 defconfig with a fragment applied on top [2].
>>>>
>>>> Best,
>>>>
>>>> Laura
>>>>
>>>> [1] https://lore.kernel.org/all/20240308144933.337107-1-laura.nao@xxxxxxxxxxxxx/
>>>> [2] https://pastebin.com/raw/0tFM0Zyg
>>>>
>>>> #regzbot introduced: v6.9-rc7..v6.10-rc1
>>>
>>> The issue started happening after:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/thermal/thermal_core.c?h=v6.10-rc1&id=31a0fa0019b022024cc082ae292951a596b06f8c
>>>
>>> Before this commit, get_cur_state() was not called by
>>> __thermal_cooling_device_register, so the error was not triggered.
>>>
>>> After enabling debugging for the acpi-fan driver, I noticed these errors
>>> in the logs:
>>>
>>> [ 0.682224] acpi INTC1044:00: Invalid control value returned
>>> [ 0.682635] acpi INTC1044:00: Invalid control value returned
>>>
>>> The value stored in fst.control is 255, which is indeed not a valid
>>> value.
>>>
>>> I suspect this might be a firmware issue that is now manifesting due to
>>> the addition of the extra get_cur_state() call.
>>>
>>> I'll dig a bit more and report back.
>>
>> It looks like _FST returns all ones if it is evaluated before _FSL on
>> the affected platforms.
>>

Right, I'll look into that.

>> It shouldn't do that, but then it is not particularly useful to fail
>> cdev registration for this reason.
>>
>> The attached patch should work around this issue, please try it and
>> report back.
>
> A !ret check is missing in that patch, so please try the attached new
> version of it instead.
>
> Thanks!

I confirm the patch works as expected and fixes the probe issue.

Thank you!

Laura
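
For anyone following the thread without the attachment, the failure mode discussed above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the attached patch and not kernel code: struct fake_fan and every fake_* and register_* function are hypothetical names made up for illustration, and the "tolerant" variant is just one plausible reading of "do not fail cdev registration, and only use the state when !ret".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a fan device; NOT the kernel's data structures. */
struct fake_fan {
	bool fsl_evaluated;	/* has _FSL been evaluated at least once? */
	int level;		/* level last written through _FSL */
	int max_state;		/* highest valid control value */
};

/* Models _FST on the affected firmware: all ones (255) until _FSL has run. */
int fake_fst_control(const struct fake_fan *fan)
{
	return fan->fsl_evaluated ? fan->level : 0xff;
}

/* Models a get_cur_state() callback that rejects an out-of-range value. */
int fake_get_cur_state(const struct fake_fan *fan, unsigned long *state)
{
	int control = fake_fst_control(fan);

	if (control < 0 || control > fan->max_state)
		return -EINVAL;	/* -22 on Linux, as in the probe error */

	*state = (unsigned long)control;
	return 0;
}

/* Strict registration (assumed behaviour): any callback error is fatal. */
int register_strict(const struct fake_fan *fan, unsigned long *cached)
{
	return fake_get_cur_state(fan, cached);
}

/*
 * Tolerant registration (assumed workaround): registration always
 * succeeds, and the initial state is cached only when !ret.
 */
int register_tolerant(const struct fake_fan *fan, unsigned long *cached,
		      bool *cached_valid)
{
	unsigned long state;
	int ret = fake_get_cur_state(fan, &state);

	*cached_valid = !ret;
	if (!ret)
		*cached = state;

	return 0;
}
```

With fsl_evaluated left false, register_strict() returns -EINVAL while register_tolerant() returns 0 and leaves the cached state marked invalid. Once a level has been set, both variants succeed.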