On Fri, May 28, 2021 at 05:02:53PM +0300, Egor Ignatov wrote:

Hello Egor,

> I have a problem with the PS/2 keyboard on an HP laptop
> (15s-fq2020ur). The problem is that after booting the
> system, the keyboard does not work. But it starts working
> about 10 seconds after pressing any key.
>
> I looked at the i8042 log and it seems to me that the
> problem is that the driver does not wait for a response to
> the GETID. It receives ACK and immediately sends the
> 0xed command without waiting for ID.

Actually, that's not the case if you look at the logs:

Here we send the GETID command:

> [ 0.460964] i8042: [1] f2 -> i8042 (kbd-data)

And here we get the ACK for the command back, 10ms later:

> [ 0.471708] i8042: [12] fa <- i8042 (interrupt, 0, 1)

Here we wait for half a second, as you can see from the timestamps, and
nothing arrives. No ID data from the keyboard at all.

So the GETID command timed out, and we try a backup plan. Some very old
keyboards don't support GETID, so we try SETLEDS, which every keyboard
should support.

> [ 0.977581] i8042: [518] ed -> i8042 (kbd-data)

[ .... crickets .... ]

There is no answer at all. We should at least get an 'fa' response here,
so that we can send the parameter of the command. We wait for roughly
another 200ms (see the timestamps) and nothing at all arrives.

And so the atkbd_probe() function gives up and returns failure.

Then it's back to i8042.c's i8042_port_close(), which issues the WCTR
command with 0x64 as a parameter, to disable the keyboard IRQ (clearing
the KBDINT bit):

> [ 1.185586] i8042: [726] 60 -> i8042 (command)
> [ 1.185686] i8042: [726] 64 -> i8042 (parameter)

And then i8042.c enables the interrupt again, to look for hotplug
(setting KBDINT=1):

> [ 1.185842] i8042: [726] 60 -> i8042 (command)
> [ 1.185935] i8042: [726] 65 -> i8042 (parameter)

And oh wow, once we kicked the controller by toggling the interrupt
disable/enable, see what's coming in! The GETID response!

> [ 1.185975] i8042: [726] ab <- i8042 (interrupt, 0, 0)

But something is suspicious here: the "0, 0". The last number is the
interrupt number, and the KBD port always uses IRQ1. So this byte was
not delivered by an interrupt; it comes from manually checking the port
for waiting data by calling i8042_interrupt(0, NULL) at the end of
i8042_port_close().

And the controller that got stuck after the GETID command is unstuck
again and properly generates an interrupt for the 2nd byte of the GETID
response:

> [ 1.189909] i8042: [730] 83 <- i8042 (interrupt, 0, 1)

Yay, we got that. Now, an incoming byte on the KBD port triggers a
hotplug event: we think there may be a new keyboard plugged in. So we
repeat the atkbd detection sequence, sending the GETID command:

> [ 1.189952] i8042: [730] f2 -> i8042 (kbd-data)

And we get a proper ACK response:

> [ 1.200096] i8042: [740] fa <- i8042 (interrupt, 0, 1)

But what the hell, there is one more ACK coming that shouldn't have:

> [ 1.204012] i8042: [744] fa <- i8042 (interrupt, 0, 1)

So we bail out. An ID of 0xfa is not a keyboard!

Back to i8042.c, we toggle the interrupt enable bit:

> [ 1.204031] i8042: [744] 60 -> i8042 (command)
> [ 1.204124] i8042: [744] 64 -> i8042 (parameter)
> [ 1.204272] i8042: [744] 60 -> i8042 (command)
> [ 1.204364] i8042: [744] 65 -> i8042 (parameter)

But there's nothing waiting for us, so nothing else happens.

> At this point it doesn't do anything until you press a key.
> Then the driver starts sending GETID repeatedly until at
> some point it gets the correct answer, after which the
> keyboard starts working. As I said it takes about 10 secs.
>
> Here is a part of the log after pressing a key:
>
> [ 11.103249] i8042: [10643] 1d <- i8042 (interrupt, 0, 1)

Indeed, a keypress means new bytes coming in, so this is a new hotplug
event - and we try to detect if there is a keyboard:

> [ 11.103287] i8042: [10643] f2 -> i8042 (kbd-data)
> [ 11.113673] i8042: [10654] fa <- i8042 (interrupt, 0, 1)
> [ 11.113719] i8042: [10654] ab <- i8042 (interrupt, 0, 1)

And something goes awry again. We're supposed to get 'fa ab 83', not
just 'fa ab'. So we wait and time out 0.5 seconds later. We fall back to
trying the SETLEDS command again.

> [ 11.617485] i8042: [11158] ed -> i8042 (kbd-data)

And we don't even get an ACK. The keyboard controller is stuck again.
Ouch.

> [ 11.825485] i8042: [11366] 60 -> i8042 (command)
> [ 11.825778] i8042: [11366] 64 -> i8042 (parameter)
> [ 11.825924] i8042: [11366] 60 -> i8042 (command)
> [ 11.826016] i8042: [11366] 65 -> i8042 (parameter)

So we're back to closing the port in i8042.c. We toggled the interrupt
enable bit, and we check for any data in the data port:

> [ 11.826049] i8042: [11366] 83 <- i8042 (interrupt, 0, 0)

Yes, like before, the 0x83 was waiting there for us and was blocking the
data port for any further communication.

> [ 11.830084] i8042: [11370] fa <- i8042 (interrupt, 0, 1)

And another ACK was waiting there, too, probably from the SETLEDS
command. This time, however, we're lucky and manage to read the ACK
before we start reinitializing the keyboard.

So we send a GETID:

> [ 11.830107] i8042: [11370] f2 -> i8042 (kbd-data)

Get an ACK:

> [ 11.840241] i8042: [11380] fa <- i8042 (interrupt, 0, 1)

And I have no idea where this next byte is coming from. Possibly still
the keypress ... ?

> [ 11.844063] i8042: [11384] 38 <- i8042 (interrupt, 0, 1)

Nevertheless, it's not a valid ID, so we bail out again. We toggle the
interrupt enable bit.

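For reference, here is a minimal sketch of how the two control register
values seen in these toggles decode. It is only an illustration, not
code from the driver: the bit positions are the standard i8042 control
register layout, the names are modelled on the kernel's I8042_CTR_*
constants, and CTR_SYSFLAG plus both helper functions are names made up
for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* i8042 control register (CTR) bits. Names modelled on the kernel's
 * I8042_CTR_* constants; CTR_SYSFLAG is a hypothetical name. */
#define CTR_KBDINT     0x01  /* keyboard interrupt enable */
#define CTR_AUXINT     0x02  /* aux (mouse) interrupt enable */
#define CTR_SYSFLAG    0x04  /* system flag, set after POST */
#define CTR_IGNKEYLOCK 0x08  /* ignore keylock */
#define CTR_KBDDIS     0x10  /* keyboard port disabled */
#define CTR_AUXDIS     0x20  /* aux port disabled */
#define CTR_XLATE      0x40  /* scancode translation on */

/* Returns 1 if the keyboard interrupt is enabled in this CTR value. */
int ctr_kbd_irq_enabled(uint8_t ctr)
{
    return (ctr & CTR_KBDINT) != 0;
}

/* Writes the names of all set bits, space separated, into buf. */
void ctr_describe(uint8_t ctr, char *buf, size_t len)
{
    static const struct { uint8_t mask; const char *name; } bits[] = {
        { CTR_KBDINT, "KBDINT" },   { CTR_AUXINT, "AUXINT" },
        { CTR_SYSFLAG, "SYSFLAG" }, { CTR_IGNKEYLOCK, "IGNKEYLOCK" },
        { CTR_KBDDIS, "KBDDIS" },   { CTR_AUXDIS, "AUXDIS" },
        { CTR_XLATE, "XLATE" },
    };
    size_t used = 0;

    if (len == 0)
        return;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (!(ctr & bits[i].mask))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;              /* output truncated, stop here */
        used += (size_t)n;
    }
}
```

With these definitions, 0x64 and 0x65 differ only in the KBDINT bit:
ctr_kbd_irq_enabled(0x64) is 0 and ctr_kbd_irq_enabled(0x65) is 1. In
the log, each toggle shows up as a 60 command followed by the 64 and
then the 65 parameter:
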
> [ 11.844083] i8042: [11384] 60 -> i8042 (command)
> [ 11.844174] i8042: [11384] 64 -> i8042 (parameter)
> [ 11.844320] i8042: [11384] 60 -> i8042 (command)
> [ 11.844413] i8042: [11384] 65 -> i8042 (parameter)

And this time there is no data stuck there. But some comes later via the
normal interrupt path (still no idea what the keyboard is trying to tell
us, maybe more keypresses):

> [ 11.849039] i8042: [11389] 3c <- i8042 (interrupt, 0, 1)

And we try to identify the keyboard ....

> [ 11.849059] i8042: [11389] f2 -> i8042 (kbd-data)
> [ 11.859198] i8042: [11399] fa <- i8042 (interrupt, 0, 1)
> [ 12.361490] i8042: [11902] ed -> i8042 (kbd-data)
> ...
> [ 27.516138] i8042: [27455] f2 -> i8042 (kbd-data)
> [ 27.526395] i8042: [27466] fa <- i8042 (interrupt, 0, 1)
> [ 27.531044] i8042: [27471] fa <- i8042 (interrupt, 0, 1)
> [ 27.531080] i8042: [27471] 60 -> i8042 (command)
> [ 27.531183] i8042: [27471] 64 -> i8042 (parameter)
> [ 27.531336] i8042: [27471] 60 -> i8042 (command)
> [ 27.531713] i8042: [27471] 65 -> i8042 (parameter)
> [ 27.536215] i8042: [27476] 1d <- i8042 (interrupt, 0, 1)
> **HERE IT FINALLY RECEIVES THE CORRECT RESPONSE**

And indeed, later the sequence finally succeeds:

> [ 27.536290] i8042: [27476] f2 -> i8042 (kbd-data)
> [ 27.546882] i8042: [27487] fa <- i8042 (interrupt, 0, 1)
> [ 27.546940] i8042: [27487] ab <- i8042 (interrupt, 0, 1)
> [ 27.546997] i8042: [27487] 83 <- i8042 (interrupt, 0, 1)

We get the correct ID and we proceed to RESET_DIS to prevent any
keypresses from messing up our further communication with the keyboard:

> [ 27.547018] i8042: [27487] f5 -> i8042 (kbd-data)
> [ 27.557566] i8042: [27497] fa <- i8042 (interrupt, 0, 1)

We then turn the LEDs off:

> [ 27.557615] i8042: [27497] ed -> i8042 (kbd-data)
> [ 27.568242] i8042: [27508] fa <- i8042 (interrupt, 0, 1)
> [ 27.568294] i8042: [27508] 00 -> i8042 (kbd-data)
> [ 27.578730] i8042: [27518] fa <- i8042 (interrupt, 0, 1)

Set the repeat rate:

> [ 27.578785] i8042: [27518] f3 -> i8042 (kbd-data)
> [ 27.589151] i8042: [27529] fa <- i8042 (interrupt, 0, 1)
> [ 27.589206] i8042: [27529] 00 -> i8042 (kbd-data)
> [ 27.599602] i8042: [27539] fa <- i8042 (interrupt, 0, 1)

And finally enable the keyboard for use:

> [ 27.599676] i8042: [27539] f4 -> i8042 (kbd-data)
> [ 27.609986] i8042: [27550] fa <- i8042 (interrupt, 0, 1)
>
> Any idea what to do about this?

So the problem is not that the driver fails to wait for a GETID answer.
It actually waits for a long, long time.

It's the virtual i8042 keyboard controller implemented in the BIOS that
has an issue: it does not properly deliver interrupts when the keyboard
sends three bytes (fa ab 83) in too quick a succession.

You can try experimenting with the 'noaux', 'nomux', 'dumbkbd' and
'kbdreset' options of i8042, and also the 'reset' option of 'atkbd'.
This will change the init sequence and there is a chance it'll stop
tickling the virtual i8042 controller in the laptop the wrong way.

If that helps, there is a quirk table in i8042 to enable these options
automatically, based on the DMI data of the laptop.

If it doesn't help, then we'd need to find a workaround to recover from
the lost IRQ situation without giving up on keyboard detection. Possibly
by signalling the detection timeout from atkbd.c back to i8042.c to
check for a stuck byte in the queue.

Vojtech

--
Vojtech Pavlik
VP Linux Systems Group, SUSE