Hi,
On 22-05-18 01:58, wereturtle wrote:
By the way,
If I theoretically ran into the issue with the i2c interrupts
driving my CPU cores up to 100% once again in the future, what could I
do to work around it? Would blacklisting the i2c-designware driver
fix that? Would that have negative side effects for other things? Is
there a kernel parameter to make the gpio i2c not send so many
interrupts?
Blacklisting the i2c-designware driver should do the trick.
Regards,
Hans
I just want to make sure a kernel update in the future does not brick
the laptop, as by the time the patch officially arrives in my distro I
would be well outside my return window with Best Buy. (I'm not
discounting Jarkko's ability to make a smooth patch, but this is for a
doomsday scenario.)
Thanks!
On Mon, May 21, 2018 at 12:56 PM, wereturtle <wereturtledev@xxxxxxxxx> wrote:
I would expect 4.18 given that 4.17 is more or less a done deal, but
once the patch is out I expect it to also be cherry-picked as a bug-fix
to the stable releases of older kernels.
Sounds great! Thanks, Hans! And thank you, to all of you!
On Mon, May 21, 2018 at 11:12 AM, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
Hi,
On 21-05-18 19:02, wereturtle wrote:
Hi Hans,
Have you tried turning off the computer and removing the battery,
then wait 5 minutes and put the battery back again?
OK, I had to run to the store to get the right size Torx screwdriver,
but I removed the battery and waited 15 minutes before putting it back
in. Unfortunately, that didn't help.
Likely having the touchpad properly working results in either the touchpad
or the i2c controller firing an interrupt at boot because of state left
over from the previous boot with working touchpad. Since you now lack
a working driver, nothing acks the interrupt and it keeps firing.
Interesting! Thanks for teaching me!
The proper solution here would be to build 4.15 with the fix, or
see if there are some patches to the nvidia driver to make it build
with 4.17, which avoids the need for another kernel build.
Alas, I've had too much difficulty with the latter solutions. 4.17
vanilla (no patch) seems to ack it already. 4.15 would need something
else to ack it in the first place, and I'm not sure where to patch
that. I couldn't find any Nvidia patches for 4.17.
Fortunately, Best Buy let me return the laptop without any fuss. They
were even offering to exchange it. I declined, but I might try again
and live without the touchpad for a time since the laptop just went on
sale this morning.
As such, when, roughly, do you think the official patch will land?
Kernel 4.17 or 4.18? (Supposedly Ubuntu 18.10 will have 4.18, if the
stars align.)
I would expect 4.18 given that 4.17 is more or less a done deal, but
once the patch is out I expect it to also be cherry-picked as a bug-fix
to the stable releases of older kernels.
Jarkko, I did not see an official patch for this yet, I'm not on the
list though, so I don't know if it was not posted at all, or if you did
not Cc me? (not Cc-ing me is fine I'm only sideways involved).
Regards,
Hans
Thanks!
On Sun, May 20, 2018 at 3:13 AM, Hans de Goede <hdegoede@xxxxxxxxxx>
wrote:
Hi,
On 20-05-18 04:34, wereturtle wrote:
Some bad news to follow up the good news.
Installing the patched Kernel for my touchpad had a negative side
effect. While running the patched Kernel, I didn't have any issues.
However, I couldn't get the Nvidia driver to install with this Kernel.
As such, I tried rebooting into my old 4.15 Kernel. Even after
removing the patched Kernel and reinstalling the Nvidia driver several
times for 4.15, my computer became sluggish during browsing, typing,
etc. Games were locking up or else having a huge framerate drop. My
CPU cores were spinning like crazy without even any processes taking
up CPU.
Further investigation revealed an "unexpected IRQ trap at vector 9a"
error message at startup and shutdown in the console. The message
fires constantly. Under /proc/interrupts, it was listing intel-gpio
(i2c) for 9a. It was firing off like crazy. I think that's for my
touchpad?
I tried reinstalling Kubuntu altogether, twice, and it wouldn't stop.
It's like that patch permanently wrote something to my hardware? I
tried installing 4.17 RC5 with Ukuu without the touchpad patch built
in. The intel-gpio interrupts went away, and the computer is snappy
and responsive again. However, rebooting back into 4.15 resulted in
the interrupts returning.
How did this patch end up doing something permanently to my computer,
and what can I do to undo it for Kernel 4.15 so that I can use my
Nvidia drivers again?
Have you tried turning off the computer and removing the battery,
then wait 5 minutes and put the battery back again?
Likely having the touchpad properly working results in either the touchpad
or the i2c controller firing an interrupt at boot because of state left
over from the previous boot with working touchpad. Since you now lack
a working driver, nothing acks the interrupt and it keeps firing.
The proper solution here would be to build 4.15 with the fix, or
see if there are some patches to the nvidia driver to make it build
with 4.17, which avoids the need for another kernel build.
Regards,
Hans
Also, I don't notice this same sluggishness with Windows 10. Windows
is snappy regardless of the Kernel.
On Sat, May 19, 2018 at 12:42 PM, wereturtle <wereturtledev@xxxxxxxxx>
wrote:
Hi everyone!
I set the clk_rate to be 216000000 in my own patched Kernel 4.17 RC5,
and my touchpad now works!
Thank you so much!
On Fri, May 18, 2018 at 12:39 AM, Hans de Goede <hdegoede@xxxxxxxxxx>
wrote:
Hi,
On 18-05-18 09:32, Hans de Goede wrote:
Hi,
On 17-05-18 20:14, Dmitry Torokhov wrote:
On Thu, May 17, 2018 at 2:36 AM, Benjamin Tissoires
<benjamin.tissoires@xxxxxxxxx> wrote:
Scope (_SB.PCI0.I2C1)
{
Device (ETPD)
{
Name (SBFB, ResourceTemplate ()
{
I2cSerialBusV2 (0x004C, ControllerInitiated,
0x00061A80,
AddressingMode7Bit, "\\_SB.PCI0.I2C1",
0x00, ResourceConsumer, _Y34, Exclusive,
)
})
Name (SBFI, ResourceTemplate ()
{
Interrupt (ResourceConsumer, Level, ActiveHigh,
Exclusive, ,, )
{
0x0000005F,
}
})
...
So nothing scary, the interrupt is a plain interrupt, not a GPIO. I
guess the issue lies in i2c-designware and the AMD
implementation...
Also, in dmesg we have:
[ 25.020612] cannonlake-pinctrl INT3450:00: pin 26 cannot be used as IRQ
[ 25.020615] genirq: Setting trigger mode 3 for irq 137 failed (intel_gpio_irq_type+0x0/0x140)
[ 25.023113] intel-lpss 0000:00:15.1: enabling device (0000 -> 0002)
[ 25.023336] idma64 idma64.1: Found Intel integrated DMA 64-bit
[ 25.025326] i2c_hid i2c-ELAN1201:00: i2c-ELAN1201:00 supply vdd not found, using dummy regulator
[ 25.025494] i2c_designware i2c_designware.1: i2c_dw_handle_tx_abort: lost arbitration
[ 25.025652] i2c_designware i2c_designware.1: i2c_dw_handle_tx_abort: lost arbitration
[ 25.025811] i2c_designware i2c_designware.1: i2c_dw_handle_tx_abort: lost arbitration
[ 25.025970] i2c_designware i2c_designware.1: i2c_dw_handle_tx_abort: lost arbitration
[ 25.025972] i2c_hid i2c-ELAN1201:00: hid_descr_cmd failed
0x5F is kind of high for a plain interrupt; I wonder if ACPI table
relies on static gpio->virq mapping that could be different on
Linux... Also I am surprised the IRQ is active-HIGH, normally it is
active low. Might want to try and hack the driver to force it to low
and see what happens...
Yes, the interrupt is definitely suspect. Actually, using plain interrupts
rather than a GpioInt is something which I would only expect to see in
old DSDTs and not in recent ones, because for i2c devices there is
no clear parent interrupt controller and as such no well-defined way to
properly interpret a raw Interrupt number.
What is with the AMD reference btw, the above dmesg snippet looks
to be about an Intel system? I would not expect cannonlake-pinctrl
to be used on an AMD system...
If this indeed is an Intel system one thing worth trying is
changing the clk_freq we use in the i2c-designware driver to
calculate clk high and low counts. There are some reports from
people attaching scopes to the i2c wires that on Kaby Lake at
least the driver is driving the bus at about 1.5 times the
expected rate (so 600 kHz instead of 400 kHz).
A workaround for now would be to edit:
drivers/mfd/intel-lpss-pci.c
And change clk_rate in:
static const struct intel_lpss_platform_info spt_i2c_info = {
.clk_rate = 120000000,
.properties = spt_i2c_properties,
};
From 120000000 to 180000000. People are still working on getting
to the bottom of this but it is worth a shot. The clk_rate
value here is only used to calculate i2c timings and does
not actually program a clock, it only specifies the frequency
the clock is expected to be running at. So changing this should
be safe.
Ok, so I just read the new mails in the threads where this is being
discussed and it has been confirmed by Intel that for all Cannon Lake
devices the correct clk_rate is 216000000. Which likely explains
the i2c errors here. Jarkko (added to the Cc) is working on a patch
for this.
For now if you can build your own kernels you can make the change I
suggested above, but that will also change the clock-rate on other
machines, so that is just for testing on Cannon Lake hardware!
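
To illustrate why the assumed clk_rate matters, here is a minimal sketch in C.
It is not the i2c-designware driver's actual timing code (the real driver
derives separate high and low counts from the I2C timing parameters). It is a
simplified model that assumes the SCL period is just the assumed input clock
divided by the target bus speed. The helper names scl_period_ticks() and
resulting_bus_hz() are hypothetical. The 400 kHz target matches the
0x00061A80 connection speed in the I2cSerialBusV2 entry quoted earlier.

```c
#include <stdint.h>

/*
 * Hypothetical helper (simplified model, not the kernel's formula):
 * number of input-clock ticks per SCL period that would be programmed
 * if the input clock is believed to run at assumed_hz.
 */
static inline uint32_t scl_period_ticks(uint32_t assumed_hz,
                                        uint32_t target_bus_hz)
{
    return assumed_hz / target_bus_hz;
}

/*
 * Hypothetical helper: bus frequency that actually appears on the wire
 * when the hardware clock really runs at actual_hz, but the tick count
 * was computed from assumed_hz. Returns 0 if the tick count is 0.
 */
static inline uint32_t resulting_bus_hz(uint32_t assumed_hz,
                                        uint32_t actual_hz,
                                        uint32_t target_bus_hz)
{
    uint32_t ticks = scl_period_ticks(assumed_hz, target_bus_hz);

    return ticks ? actual_hz / ticks : 0;
}
```

Under this simplified model, assuming 120 MHz while the clock is really
1.5 times faster (180 MHz) puts 600 kHz on a 400 kHz bus, assuming 120 MHz
against a real 216 MHz gives 720 kHz, and assuming 216 MHz against a real
216 MHz gives the intended 400 kHz.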
The way the Interrupt is specified is still suspicious btw, but
we'll cross that bridge when we get there.
Regards,
Hans