Re: Oops in current tree in i2c

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Linus,

On 27-10-18 18:08, Linus Torvalds wrote:
Julian, Jiri,
  On my laptop I'm getting a kernel page fault with the current git
tree, and I'm tentatively blaming commit

   9ee3e06610fd ("HID: i2c-hid: override HID descriptors for certain devices")

but that's simply because it's the only thing that seems to touch this
particular area in this merge window.

The oops looks like this:

   BUG: unable to handle kernel paging request at 000000007a25d598
   PGD 0 P4D 0
   Oops: 0000 [#1] SMP PTI
   CPU: 1 PID: 888 Comm: systemd-udevd Not tainted 4.19.0-07715-g345671ea0f92 #4
   Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.7.0 01/16/2018
   RIP: 0010:strstr+0x19/0x70

where the code disassembly (and the register contents) shows that the
wild pointer is the first argument to "strstr()", which just has a
bogus value that is not a valid kernel pointer (RDI: 000000007a25d598
- which is obviously also the address of the page fault)

The call trace is:

    dmi_matches+0x55/0xc0
    dmi_first_match+0x26/0x40
    i2c_hid_get_dmi_i2c_hid_desc_override+0x16/0x40 [i2c_hid]
    i2c_hid_probe+0x28c/0x760 [i2c_hid]
    i2c_device_probe+0x1e7/0x260
    really_probe+0xf8/0x3e0
    driver_probe_device+0x10f/0x120
    bus_for_each_drv+0x66/0xb0
    __device_attach+0xd9/0x150
    bus_probe_device+0x8a/0xa0
    device_add+0x48e/0x660
    i2c_new_device+0x162/0x350

which is why I suspect that new i2c_hid_get_dmi_hid_report_desc_override() code.

I *think* the problem is that the i2c_hid_dmi_desc_override_table[]
isn't terminated by a NULL entry, and I will test that next.

Yes that likely is the problem. I already had a bug report from one of the
Manjaro maintainers who was cherry picking this into the Manjaro kernel.

So I ran some tests on a laptop of mine which does use i2c-hid but I
failed to reproduce the issue, so we both (me and the Manjaro maintainer)
both assumed something went wrong with the backport.

Both of us seem to have overlooked the missing terminating entry,
as well as other people involved in the patch.

What makes me *very* unhappy about this is that if I'm right, I think
it means that code was literally not tested at all by anybody who
didn't have one of the entries in that list.

That is not true, I've hit one of these unterminated dmi lists
issues before and it depends on what get put in mem directly
after the list by the linker, bugs caused by this do not always
reproduce unfortunately.

And as mentioned I have tested the patch on a machine with an i2c-hid
touchpad, which is not on the list and I did not hit this problem.

Anyways this is fixed now, thank you for catching this.

Regards,

Hans



[Index of Archives]     [Linux Media Devel]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Linux Wireless Networking]     [Linux Omap]

  Powered by Linux