xhci list corruption on sysfs removal

Hello Roger and Mathias,

Running with slub_debug=FZPU and removing an XHCI host controller via
sysfs, I've hit a use-after-free that I've bisected to:

  8c24d6d7b09deee3036ddc4f2b81b53b28c8f877 is the first bad commit
  commit 8c24d6d7b09deee3036ddc4f2b81b53b28c8f877
  Author: Roger Quadros <rogerq@xxxxxx>
  Date:   Mon Sep 21 17:46:14 2015 +0300

      usb: xhci: stop everything on the first call to xhci_stop

      xhci_stop will be called twice, once for the shared hcd
      and again for the primary hcd.

      We stop the XHCI controller in any case so clean up
      everything on the first call else we can timeout
      waiting for pending requests to complete.

      Cc: <stable@xxxxxxxxxxxxxxx>
      Signed-off-by: Roger Quadros <rogerq@xxxxxx>
      Signed-off-by: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>
      Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

I can reproduce the following list_del corruption warning every time, simply
by removing the device:

  % lspci -D | grep -i xhci
  0000:65:14.0 USB controller: Intel Corporation C610/X99 series chipset USB xHCI Host Controller (rev 05)

  % echo 1 > $(find /sys/devices -name '0000:65:14.0')/remove

------------[ cut here ]------------
WARNING: CPU: 22 PID: 13964 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
list_del corruption. prev->next should be ffff881032144350, but was 6b6b6b6b6b6b6b6b
[ ... modules snip ... ]
CPU: 22 PID: 13964 Comm: bash Not tainted 4.4.0-rc5+ #27
Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.1:61 09/10/2015
 0000000000000000 00000000dfa07299 ffff88103091b898 ffffffff8131d770
 ffff88103091b8e0 ffff88103091b8d0 ffffffff8107ef56 ffff88103205d2d0
 ffff8810320c2698 ffff881032144350 0000000000000000 0000000000000204
Call Trace:
 [<ffffffff8131d770>] dump_stack+0x44/0x64
 [<ffffffff8107ef56>] warn_slowpath_common+0x86/0xc0
 [<ffffffff8107efec>] warn_slowpath_fmt+0x5c/0x80
 [<ffffffff811dac6c>] ? __slab_free+0x1bc/0x240
 [<ffffffff81339511>] __list_del_entry+0xa1/0xd0
 [<ffffffff814b3a49>] xhci_urb_dequeue+0xd9/0x380
 [<ffffffff8147fb7d>] unlink1+0x2d/0x110
 [<ffffffff81481d75>] usb_hcd_flush_endpoint+0xf5/0x190
 [<ffffffff81484c79>] usb_disable_endpoint+0x59/0x90
 [<ffffffff81484cf5>] usb_disable_interface+0x45/0x60
 [<ffffffff81487558>] usb_unbind_interface+0x1b8/0x260
 [<ffffffff81444176>] __device_release_driver+0x96/0x130
 [<ffffffff81444233>] device_release_driver+0x23/0x30
 [<ffffffff81442fa1>] bus_remove_device+0x101/0x170
 [<ffffffff8143f3a9>] device_del+0x139/0x260
 [<ffffffff8148bc3f>] ? usb_remove_ep_devs+0x1f/0x30
 [<ffffffff81484db6>] usb_disable_device+0xa6/0x280
 [<ffffffff8147a9f4>] usb_disconnect+0x94/0x270
 [<ffffffff8147ab54>] usb_disconnect+0x1f4/0x270
 [<ffffffff8147fd32>] usb_remove_hcd+0xd2/0x240
 [<ffffffff81491f0f>] usb_hcd_pci_remove+0x6f/0x140
 [<ffffffff814c6e9e>] xhci_pci_remove+0x4e/0x70
 [<ffffffff8135be99>] pci_device_remove+0x39/0xc0
 [<ffffffff81444176>] __device_release_driver+0x96/0x130
 [<ffffffff81444233>] device_release_driver+0x23/0x30
 [<ffffffff813549fc>] pci_stop_bus_device+0x8c/0xa0
 [<ffffffff81354b1a>] pci_stop_and_remove_bus_device_locked+0x1a/0x30
 [<ffffffff8135d9fc>] remove_store+0x7c/0x90
 [<ffffffff8143e5c8>] dev_attr_store+0x18/0x30
 [<ffffffff81275e3a>] sysfs_kf_write+0x3a/0x50
 [<ffffffff812754c0>] kernfs_fop_write+0x120/0x170
 [<ffffffff811f9d67>] __vfs_write+0x37/0x100
 [<ffffffff812ab343>] ? selinux_file_permission+0xc3/0x110
 [<ffffffff812a2e9d>] ? security_file_permission+0x3d/0xc0
 [<ffffffff810c65bf>] ? percpu_down_read+0x1f/0x50
 [<ffffffff811fa442>] vfs_write+0xa2/0x1a0
 [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
 [<ffffffff811fb205>] SyS_write+0x55/0xc0
 [<ffffffff81666dee>] entry_SYSCALL_64_fastpath+0x12/0x71
---[ end trace 02b6650c4e01b29e ]---

I added some instrumentation to xhci_urb_dequeue:

-                       if (!list_empty(&td->td_list))
-                               list_del_init(&td->td_list);
+                       if (!list_empty(&td->td_list)) {
+                               pr_err("%s(%p, %p, ...) list_del_init(%p)\n next=%p(n=%p, p=%p) prev=%p(n=%p, p=%p)\n",
+                                       __func__, hcd, urb, &td->td_list,
+                                       td->td_list.next, td->td_list.next->next, td->td_list.next->prev,
+                                       td->td_list.prev, td->td_list.prev->next, td->td_list.prev->prev);
+                       }

to prove the list corruption complaint is from the td_list:

  xhci_hcd 0000:65:14.0: remove, state 4
  usb usb4: USB disconnect, device number 1
  xhci_hcd 0000:65:14.0: USB bus 4 deregistered
  xhci_hcd 0000:65:14.0: remove, state 1
  usb usb3: USB disconnect, device number 1
  usb 3-1: USB disconnect, device number 2
  xhci_urb_dequeue(ffff8810365b0000, ffff882032987cf0, ...) list_del_init(ffff882032885588) next=ffff881037742b00(n=6b6b6b6b6b6b6b6b, p=6b6b6b6b6b6b6b6b) prev=ffff881037742b00(n=6b6b6b6b6b6b6b6b, p=6b6b6b6b6b6b6b6b)
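
In that output both neighbour pointers of td->td_list hold the same address, and that neighbour's own links read 6b6b6b6b6b6b6b6b, which is the POISON_FREE byte (0x6b) that slub_debug's P flag writes into freed objects. A rough userspace sketch of that situation follows; it is only a model, not the kernel code. The names checked_list_del() and model_td_unlink() are made up for illustration, and the "free" is simulated with memset() so that the stale read stays well-defined outside the kernel.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define POISON_FREE 0x6b	/* byte value from include/linux/poison.h */

struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *e, struct list_head *h)
{
	e->prev = h->prev;
	e->next = h;
	h->prev->next = e;
	h->prev = e;
}

/*
 * Simplified version of the sanity check done by the debug list code:
 * refuse to unlink if a neighbour no longer points back at the entry.
 * Returns 0 after unlinking, -1 if the list looks corrupted.
 */
static int checked_list_del(struct list_head *e)
{
	struct list_head *prev = e->prev, *next = e->next;

	if (prev->next != e) {
		fprintf(stderr,
			"list_del corruption. prev->next should be %p, but was %p\n",
			(void *)e, (void *)prev->next);
		return -1;
	}
	if (next->prev != e) {
		fprintf(stderr,
			"list_del corruption. next->prev should be %p, but was %p\n",
			(void *)e, (void *)next->prev);
		return -1;
	}
	next->prev = prev;
	prev->next = next;
	return 0;
}

/*
 * Hypothetical model: one entry (standing in for a TD) on a list whose
 * head lives in a separately allocated object. If poison_head is set,
 * the head's memory is overwritten with POISON_FREE before the unlink,
 * mimicking an object that was freed while the entry was still queued.
 */
int model_td_unlink(int poison_head)
{
	struct list_head *head = malloc(sizeof(*head));
	struct list_head td;
	int ret;

	if (!head)
		return 1;

	list_init(head);
	list_add_tail(&td, head);	/* now td.next == td.prev == head */

	if (poison_head)
		memset(head, POISON_FREE, sizeof(*head));

	ret = checked_list_del(&td);
	free(head);
	return ret;
}
```

Calling model_td_unlink(1) trips the prev->next check in the same way as the warning above, while model_td_unlink(0) unlinks cleanly. If the model is a fair one, it suggests the object holding the list head was freed before the URB's TD was unlinked, but the sketch by itself does not show which kernel path freed it.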

If I revert 8c24d6d7b09d "usb: xhci: stop everything on the first call
to xhci_stop", the warning goes away.

Let me know if any additional instrumentation or information would help
track down this issue.

Thanks,

-- Joe