Hi,
There have been quite a few reports of failed keyboard initialization
on about nine models of Lenovo Yoga / XiaoXinPro / IdeaPad (14", Intel)
laptops. A list of them can be found here:
https://github.com/yescallop/atkbd-nogetid#presumably-supported-machines
And a related kernel bug report can be found here:
https://bugzilla.kernel.org/show_bug.cgi?id=216994
I'd like to first provide a dmesg log (without patch) that illustrates
the problem on my Yoga 14sIHU 2021:
https://gist.github.com/yescallop/5a97d010f226172fafab0933ce8ea8af
At first the KBD port was successfully set up by `i8042`, but then the
first initialization attempt by `atkbd` failed:
[ 2.698474] i8042: [17] f2 -> i8042 (kbd-data)
[ 2.698678] i8042: [17] fa <- i8042 (interrupt, 0, 1)
[ 2.698746] i8042: [17] 83 <- i8042 (interrupt, 0, 1)
[ 2.698767] i8042: [17] 60 -> i8042 (command)
[ 2.698856] i8042: [17] 66 -> i8042 (parameter)
[ 2.698951] i8042: [17] 60 -> i8042 (command)
[ 2.699092] i8042: [17] 67 -> i8042 (parameter)
It seems that the i8042 implementation on the laptop omits the `0xab`
byte from its response to the `GETID` command, so `atkbd_probe` fails
because it receives an invalid keyboard ID (normally `0xab 0x83`).
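To make the failing check concrete, here is a minimal userspace sketch
of it. This is not the kernel code: `parse_getid_response` and
`is_regular_kbd_id_byte` are names I made up for illustration, and the
sketch only accepts `0xab` as the first ID byte, whereas I believe the
real driver accepts a few other first bytes as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: does the first ID byte look like a regular
 * AT keyboard? (Simplified: only 0xab is accepted here.) */
static bool is_regular_kbd_id_byte(uint8_t b)
{
	return b == 0xab;
}

/*
 * Hypothetical model of the ID check. `resp` holds the bytes that
 * arrived after the ACK (0xfa) to GETID (0xf2).
 * Returns the 16-bit keyboard ID, or -1 if the response is unusable.
 */
static int parse_getid_response(const uint8_t *resp, size_t len)
{
	if (len < 2)
		return -1;	/* a byte is missing, as on my laptop */
	if (!is_regular_kbd_id_byte(resp[0]))
		return -1;	/* first byte is not a keyboard ID */
	return (resp[0] << 8) | resp[1];
}
```

With the normal response `0xab 0x83` this yields `0xab83`; with the
lone `0x83` seen in the log above it returns -1, which corresponds to
the probe failure.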
The same thing happened each of the next few times I pressed and
released the space key (scan code: 0x39 when pressed, 0xb9 when
released). The sixth time I pressed the space key, something different
happened:
[ 48.188540] i8042: [13664] 39 <- i8042 (interrupt, 0, 1)
[ 48.188658] i8042: [13664] f2 -> i8042 (kbd-data)
[ 48.188998] i8042: [13664] fa <- i8042 (interrupt, 0, 1)
[ 48.709743] i8042: [13821] ed -> i8042 (kbd-data)
[ 48.913069] i8042: [13882] 60 -> i8042 (command)
[ 48.913235] i8042: [13882] 66 -> i8042 (parameter)
[ 48.913446] i8042: [13882] 60 -> i8042 (command)
[ 48.913591] i8042: [13882] 67 -> i8042 (parameter)
[ 48.913672] i8042: [13882] fa <- i8042 (interrupt, 0, 0)
This time even the byte `0x83` was omitted, so the `GETID` command
failed and `atkbd_probe` tried to set the LEDs on the keyboard, but that
failed too because no ACK arrived for the command byte `0xed`.
However, when `i8042_port_close` was later called, an ACK was read from
the KBD port, which is an indication that the i8042 implementation might
have failed to raise an interrupt for this ACK.
And the next time I released the space key, the byte `0x83` was omitted
again, but `atkbd_probe` somehow succeeded in receiving an ACK to the
`SETLEDS` command, and the keyboard was finally initialized properly.
An easy workaround is to add the kernel parameter `i8042.dumbkbd` to
the kernel command line in the boot loader, but as this makes the Caps
Lock LED unusable, other solutions should be considered when it comes
to patching the kernel.
Here I provide two possible solutions:
1. Add a module parameter in `atkbd`, say `assume_normal_kbd`, that,
when set to true, makes `atkbd_probe` skip sending the `GETID`
command and set the keyboard ID directly to `0xab83`. Then, add
quirks to make it a default for the affected machines.
2. In `atkbd_probe`: Call `i8042_flush` immediately after the `GETID`
command is finished, to get rid of any remaining byte in the
keyboard buffer that is not properly signaled by an interrupt. Then,
if the command failed or the keyboard ID is invalid, try to set the
LEDs on anything connected to the KBD port.
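The control flow of the two solutions can be sketched roughly as below.
This is a userspace model under stated assumptions, not a patch:
`struct fake_port`, `send_getid`, `send_setleds`, `flush_port` and the
two `probe_solution*` functions are all hypothetical stand-ins, the ID
check is simplified to "first byte is `0xab`", and `0xabba` is what I
believe `atkbd` uses as a placeholder ID when only `SETLEDS` works.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: placeholder ID for a keyboard that only answers SETLEDS. */
#define FALLBACK_KBD_ID	0xabba

/* Hypothetical scripted device standing in for the KBD port. */
struct fake_port {
	bool getid_ok;		/* does GETID complete? */
	uint8_t id[2];		/* bytes returned by GETID */
	bool setleds_ok;	/* is SETLEDS acknowledged? */
	int getid_calls;	/* number of GETID commands sent */
	int flush_calls;	/* number of buffer flushes */
};

static bool send_getid(struct fake_port *p, uint8_t id[2])
{
	p->getid_calls++;
	if (!p->getid_ok)
		return false;
	id[0] = p->id[0];
	id[1] = p->id[1];
	return true;
}

static bool send_setleds(struct fake_port *p)
{
	return p->setleds_ok;
}

static void flush_port(struct fake_port *p)
{
	p->flush_calls++;
}

/* Solution 1: skip GETID entirely when the (hypothetical) parameter
 * or quirk is set; otherwise behave like the simplified current flow. */
static int probe_solution1(struct fake_port *p, bool assume_normal_kbd)
{
	uint8_t id[2] = { 0xa5, 0xa5 };	/* start with invalid values */

	if (assume_normal_kbd)
		return 0xab83;
	if (!send_getid(p, id))
		return send_setleds(p) ? FALLBACK_KBD_ID : -1;
	if (id[0] != 0xab)
		return -1;
	return (id[0] << 8) | id[1];
}

/* Solution 2: flush after GETID, then fall back to SETLEDS both when
 * GETID failed and when the returned ID is invalid. */
static int probe_solution2(struct fake_port *p)
{
	uint8_t id[2] = { 0xa5, 0xa5 };
	bool ok = send_getid(p, id);

	flush_port(p);	/* drop bytes that were never signaled */
	if (!ok || id[0] != 0xab)
		return send_setleds(p) ? FALLBACK_KBD_ID : -1;
	return (id[0] << 8) | id[1];
}
```

For a port that answers `GETID` with only `0x83` but does acknowledge
`SETLEDS`, the first function fails without the parameter and returns
`0xab83` with it (without sending `GETID`), while the second one falls
back to the placeholder ID after a single flush.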
I have tried both solutions and both worked nicely on my laptop, but
there might be some problems with them:
* For the first solution: Do we add a module parameter, quirks, or
both? I find that `i8042.probe_defer` is an example of adding both
of them, and `atkbd_skip_deactivate` of adding only quirks.
* For the second solution: Is it okay to flush all data in the
keyboard/mouse buffer down the toilet from this particular call site
for all machines? I suspect some special handling is required so
that data in the mouse buffer is passed on to the upper layers
instead of being flushed.
* For the second solution: Will it do any harm to persistently try to
set the LEDs on a mouse connected to the KBD port? A comment in
`atkbd_probe` says "If a mouse is connected, this should make sure
we don't try to set the LEDs on it." I'm not at all familiar with
PS/2 devices, so can someone maybe explain this a bit?
The second solution is, indeed, a general one that may automatically fix
similar problems on other machines, without needing to manually add
quirks in `atkbd`. An example of a similar problem on HP Spectre x360
13-aw2xxx is described here:
https://patchwork.kernel.org/project/linux-input/patch/20210201160336.16008-1-anton@xxxxxx/
But it can nevertheless cause regressions if not thoroughly considered.
Thus, I'm seeking your guidance on a workaround for this problem, as
there may be a better solution that I'm not aware of.
Thanks,
Shang