Re: [Regression] 6.8 - i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 29.03.24 04:40, Philip Müller wrote:
> On 28/03/2024 00:45, Hans de Goede wrote:
>> On 3/27/24 4:47 PM, Philip Müller wrote:
>>> On 27/03/2024 17:31, Hans de Goede wrote:
>>>> On 3/19/24 5:52 AM, Philip Müller wrote:
>>>>> On 18/03/2024 17:58, Philip Müller wrote:
>>>>>> I'm currently developing on the OrangePi Neo-01, which ships with
>>>>>> two similar touchpads using the Synaptics driver. On 6.7.10 those
>>>>>> two devices get detected normally. On 6.8.1 it seems to error out.
>>>>>>
>>>>>> I either get none, or at best only one of those two devices.
>>>>>>
>>>>>> i2c_hid_acpi: probe of i2c-XXX0001:00 failed with error -110
>>>>>> i2c_hid_acpi: probe of i2c-XXX0002:00 failed with error -110
>>>>>>
>>>>>> what would be the best way to debug this?
>>>>>>
>>>>>
>>>>> I found the regression in commit
>>>>> aa69d6974185e9f7a552ba982540a38e34f69690
>>>>> HID: i2c-hid: Switch i2c_hid_parse() to goto style error handling

TWIMC: Kenny Levinsen (now CCed) posted a patch to fix problems
introduced by af93a167eda9:
https://lore.kernel.org/all/20240331182440.14477-1-kl@xxxxxx/

Looks a bit (but I might be wrong there!) like he ran into similar
problems as Philip, but was not aware of this thread.

Ciao, Thorsten

>>>> I just checked that patch and I don't see anyway how that can create
>>>> this regression. I assume you did a git bisect ?
>>>>
>>>> Did you try the last commit in the tree before that commit got added
>>>> and verified that that one works where as building a kernel from commit
>>>> aa69d6974185e9f itself does not work ?
>>>>
>>>
>>> No I didn't do a git bisect actually. I did it more manually like git
>>> bisect might do it. What I did was to compile just that module and
>>> load it against the 6.7.10 kernel.
>>> a9f68ffe1170ca4bc17ab29067d806a354a026e0 is the last commit you find
>>> in 6.7 series. All the other commits are part of the 6.8 series or
>>> later. So I looked at the commit titles and picked
>>> a9f68ffe1170ca4bc17ab29067d806a354a026e0 which I compiled the "out of
>>> tree" module. I unloaded the 6.7.10 module, copied the new compiled
>>> one over and did the rmod, modprobe batch script to see how the
>>> touchpads react.
>>>
>>> Then I went to compile and tested
>>> af93a167eda90192563bce64c4bb989836afac1f which created the issue with
>>> -110 errors. So I went to aa69d6974185e9f7a552ba982540a38e34f69690
>>> which also results in the error. So I went back to
>>> 96d3098db835d58649b73a5788898bd7672a319b which still works. Since now
>>> 6.9 development started, I may try
>>> 00aab7dcb2267f2aef59447602f34501efe1a07f to see if that fixes anything.
>>>
>>> I can get the touchpads both working in 6.8 code when I do enough
>>> tries by a rmod/modprobe loop. Based on the logs you see, sometimes
>>> both are up, mostly one or none is up. having only two commits of 6.7
>>> code in the module it works as 6.7 code. Only when "HID: i2c-hid:
>>> Switch i2c_hid_parse() to goto style error handling" gets added I see
>>> the error and can reproduce it on my device as shown in the log.
>>>
>>> Before we had
>>>
>>> hid_err(hid, "reading report descriptor failed\n");
>>> kfree(rdesc);
>>>
>>> followed by the hardcoded return -EIO;
>>>
>>> Now we have an redirect to the out function but don't do kfree(rdesc)
>>> if you hit the error anymore.
>>>
>>> Also the last return 0; got removed. So maybe what ever error the
>>> device might have hit before might be reverted due to return 0 or not
>>> freeing rdesc when hitting the error might prevent it to load.
>>>
>>> Anyhow, that patch is the first patch which changes the function in
>>> that series. Sure it might not be able to compile the whole kernel
>>> with it. I only tested the module based on the given code in an out
>>> of tree module style.
>>>
>>> So I might test 6.9rc1 on the device and let you know if that kernel
>>> has the same issue still.
>>>
>>>>> When I use the commit before I can rmmod and modprobe in a batch
>>>>> script using a loop without erroring out to -110. Attached the
>>>>> testing script and dmesg log snippets
>>>>>
>>>>> #!/bin/bash
>>>>> for ((n=0;n<5;n++))
>>>>> do
>>>>> sudo rmmod i2c_hid_acpi
>>>>> sleep 1
>>>>> sudo modprobe i2c_hid_acpi --force-vermagic
>>>>> sleep 2
>>>>> done
>>>>
>>>> Ok, so you did try the commit before and that did work. Are you
>>>> sure that aa69d6974185e9f was not actually the last working
>>>> commit ?
>>>
>>> For me 96d3098db835d58649b73a5788898bd7672a319b was the last working
>>> code, as it doesn't change functionality.
>>>
>>>>
>>>> AFAICT aa69d6974185e9f makes no functional changes, except for
>>>> actually propagating the error from i2c_hid_read_register()
>>>> rather then hardcoding -EIO. But that should not matter...
>>>
>>> See return 0 and kfree(rdesc) which is now missing.
>>
>> The new code is (on error) "goto out;" and then out: looks like this:
>>
>> out:
>>          if (!use_override)
>>                  kfree(rdesc);
>>
>>          return ret;
>>
>> Since the "goto out;" is an else branch of "if (use_override) { } else
>> {}"
>> we know use_override is false here, so !use_override is true and
>> the kfree() still happens. Also even if that were to no longer happen
>> that
>> would just be a small memleak and would not explain the probe failures
>> you are seeing.
>>
>> As for "return 0" vs "return ret", the code at the end of i2c_hid_parse()
>> before commit aa69d6974185e9f looked like this:
>>
>>         if (ret) {
>>                 dbg_hid("parsing report descriptor failed\n");
>>                 return ret;
>>         }
>>
>>         return 0;
>>
>> So this would also always return ret because if ret
>> is non 0 we always hit if (ret) return ret; and if
>> ret is 0 then return 0 and return ret are functionally
>> the same.
>>
>> So since the functionality at the end of the function did
>> not change, it still always ends up returning ret, but now
>> does it in a less circumspect way.
>>
>> Which leaves just the change in the:
>>
>>     hid_err(hid, "reading report descriptor failed\n");
>>
>> code path which used to return -EIO and now does goto out
>> which ends up returning ret which is the actual return
>> value of i2c_hid_read_register() instead of hardcoding -EIO.
>>
>> I checked and nothing in the HID core treats -EIO in
>> a special way.
>>
>> So I still do not see how commit aa69d6974185e9f can break
>> anything and I really wonder if you did not make a mistake
>> during testing.
>>
>> Can you try building a clean broken kernel (the entire
>> kernel) and then in reverse order (of them being committed)
>> revert my recent i2c-hid changes 1 by one and after each
>> revert check if things start working again ?
>>
>> And then see which is the first revert after which things
>> start working again, that one is the culprit and I expect
>> you to find a different commit then aa69d6974185e9f.
>>
>> Unless I'm missing / overlooking something in commit
>> aa69d6974185e9f...
>>
>> Note I'm going offline for a long weekend and I won't be
>> replying to emails until next week Tuesday.
>>
>> Regards,
>>
>> Hans
> 
> I managed to do a brief test with Kernel 6.9-rc1. There the issue is
> still given. So I might add the changes of i2c-hid to the working 6.7.x
> kernels and post which commits will break that module.
> 




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux