use after free of bos pointer in usb_reset_and_verify_device?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Alan, Changbin, Xenia,

I've twice encountered a crash on system reboot in usb_disable_device
that looks to be a bos descriptor use-after-free.

The machine in question is running a 4.5-rc5 kernel with proprietary and
out-of-tree Stratus drivers that facilitate device-removal testing --
these drivers give us the ability to automate/simulate device removal of
PCI devices at any time.  None of these affect any of the USB drivers or
stack directly, they mainly consist of platform-aware features, PCI
hotplug calls and a few fixups in storage/network land to better survive
the testing.

Additional debugging features were enabled: slub_debug=FZPU and the
following build options:

  CONFIG_DEBUG_OBJECTS=y
  CONFIG_DEBUG_OBJECTS_FREE=y
  CONFIG_DEBUG_OBJECTS_TIMERS=y
  CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1

I was able to scrape a dump from the crash, its backtrace looks like:

[ 4095.109442] usb 6-1: device descriptor read/8, error -22
[ 4095.240435] usb 6-1: device descriptor read/8, error -22

PID: 18692  TASK: ffff881034229640  CPU: 9   COMMAND: "kworker/9:0"
 #0 [ffff880a9907f9c8] machine_kexec at ffffffff8105c97b
 #1 [ffff880a9907fa28] __crash_kexec at ffffffff811097f2
 #2 [ffff880a9907faf8] crash_kexec at ffffffff811098df
 #3 [ffff880a9907fb10] oops_end at ffffffff8101ba89
 #4 [ffff880a9907fb38] die at ffffffff8101bfab
 #5 [ffff880a9907fb68] do_general_protection at ffffffff81019289
 #6 [ffff880a9907fb90] general_protection at ffffffff81686948
    [exception RIP: usb_disable_ltm+0x4c]
    RIP: ffffffff8149210c  RSP: ffff880a9907fc48  RFLAGS: 00010202
    RAX: 0000000000000000  RBX: ffff88103756dcd0  RCX: 0000000100300014
    RDX: 6b6b6b6b6b6b6b6b  RSI: ffffea0040c3ac00  RDI: ffff88103756dcd0
    RBP: ffff880a9907fc80   R8: ffff881030eb1110   R9: 0000000000000000
    R10: ffff881030eb2df0  R11: 0000000000000000  R12: 0000000000000000
    R13: ffff88103617c290  R14: 0000000000000001  R15: 0000000000000012
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff880a9907fc48] usb_disable_device at ffffffff8149f0fa
 #8 [ffff880a9907fc88] usb_disconnect at ffffffff81494c74
 #9 [ffff880a9907fcd8] hub_port_connect at ffffffff81496790
#10 [ffff880a9907fd58] hub_event at ffffffff8149767c
#11 [ffff880a9907fe18] process_one_work at ffffffff8109862f
#12 [ffff880a9907fe60] worker_thread at ffffffff81098f15
#13 [ffff880a9907fec8] kthread at ffffffff8109ea88
#14 [ffff880a9907ff50] ret_from_fork at ffffffff81684a0f

The crash occurred when dereferencing RDX, which is set to the
slub_debug poison pattern.  The code in question:

  crash> dis -l usb_disable_ltm+0x4c
  drivers/usb/core/hub.c: 2976
    <usb_disable_ltm+0x4c>:      testb  $0x2,0x3(%rdx)

  drivers/usb/core/hub.c :: usb_disable_ltm :

  2973         struct usb_hcd *hcd = bus_to_hcd(udev->bus);
  2974
  2975         /* Check if the roothub and device supports LTM. */
  2976         if (!usb_device_supports_ltm(hcd->self.root_hub) ||
  2977                         !usb_device_supports_ltm(udev))

  include/linux/usb.h :: usb_device_supports_ltm :

  static inline bool usb_device_supports_ltm(struct usb_device *udev)
  {
          if (udev->speed != USB_SPEED_SUPER || !udev->bos || !udev->bos->ss_cap)
                  return false;
          return udev->bos->ss_cap->bmAttributes & USB_LTM_SUPPORT;
  }

Running through a few of the data structure offsets, it appears that
we're in usb_device_supports_ltm and referencing off udev->bos->ss_cap:

  crash> struct -o usb_ss_cap_descriptor | grep bmAttributes
    [0x3] __u8 bmAttributes;

>From the disassembly of usb_device_supports_ltm, RDI looks untouched, so
it's probably a struct usb_device.  It's kobj name is "6-1", which
matches the usb device that had just reported an error in the kernel
log.

  crash> struct usb_device ffff88103756dcd0 | grep name
      name = 0xffff88102aab8540 "6-1",

Now to determine which invocation of usb_device_supports_ltm is bad.
Chasing hcd->self.root_hub ...

  crash> struct usb_device ffff88103756dcd0 | grep bus
    bus = 0xffff88103617c290,
  crash> struct -o usb_hcd | grep bus
      [0x0] struct usb_bus self;

  crash> struct usb_hcd 0xffff88103617c290 | grep root_hub
      root_hub = 0xffff881034ad0000,
  crash> struct usb_device 0xffff881034ad0000 | grep bos
    bos = 0xffff881012d37a30,
  crash> struct usb_host_bos 0xffff881012d37a30 | grep ss_cap
    ss_cap = 0xffff88101af89d95,
  crash> struct usb_ss_cap_descriptor 0xffff88101af89d95 | grep bmAttributes
    bmAttributes = 0x2,

That looks reasonable.
How about udev ...

  crash> struct usb_device 0xffff88103756dcd0 | grep bos
    bos = 0xffff8810343a61b0,
  crash> struct usb_host_bos 0xffff8810343a61b0 | grep ss_cap
    ss_cap = 0x6b6b6b6b6b6b6b6b,

Not so good.  Crash utility confirms that the bos pointer is not
currently allocated:

  crash> kmem 0xffff8810343a61b0
  CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
  ffff88103f40f800 kmalloc-64                64      80423     87781   2141    16k
    SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
    ffffea0040d0e900  ffff8810343a4000     0     41         11    30
    FREE / [ALLOCATED]
     ffff8810343a61b0

        PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffffea0040d0e980 10343a6000 dead000000000400        0  0 2fffff80000000

This is odd since usb_release_bos_descriptor not only frees bos, but
also sets to NULL.

Before crash, two errors were logged:

  usb 6-1: device descriptor read/8, error -22

which can only happen in hub_port_init.  One of hub_port_init's callers
is usb_reset_and_verify_device, which interestingly enough can reload
udev->bos like so:

  usb_reset_and_verify_device

    bos = udev->bos                                        [A]

    for (i = 0; i < SET_CONFIG_TRIES; ++i)
      ret = hub_port_init
        "usb 6-1: device descriptor read/8, error -22"     [B]
    if (ret < 0)
      goto re_enumerate;

    re_enumerate:
      usb_release_bos_descriptor(udev)
        kfree(dev->bos)                                    [C]
        dev->bos = NULL
      udev->bos = bos                                      [D]
    re_enumerate_no_bos:
      hub_port_logical_disconnect(parent_hub, port1)
      return -ENODEV

A - cache the old udev->bos
B - fail to set a new udev->bos
C - release udev->bos, set udev->bos to NULL
D - set udev->bos to the cached value from A

I'm following a hunch here, but I suspect that reboot might have
stumbled into an issue inside hub_port_init and then
usb_reset_and_verify_device accidentally restored udev->bos to a now
freed pointer value.

There's a bit of git history around usb_reset_and_verify_device and the
bos pointer in particular:

  e3376d6c87ee usbcore: compare and release one bos descriptor in usb_reset_and_verify_device()
    Author: Xenia Ragiadakou <burzalodowa@xxxxxxxxx>
    Suggested-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
    Date:   Thu Aug 29 21:45:48 2013 +0300
  
    (Fixed a memory leak when the bos descriptor changes.  Releases the
    old udev->bos and uses the new one.)
  
  d8f00cd685f5c usb: hub: do not clear BOS field during reset device
    Author: Du, Changbin <changbin.du@xxxxxxxxx>
    Date:   Mon Jan 18 21:02:42 2016 +0800
  
    (In usb_reset_and_verify_device, removed the clearing of udev->bos
    to NULL (as introduced in e3376d6c87ee) and modified the function's
    done: path to check to see if the bos descriptor has changed, and if
    so, release the old udev->bos and use the new one.)

I'm not a USB expert, so I don't know if Changbin's patch can be
extended to include the function's error paths as well, or if
usb_reset_and_verify_device can be fixed simply by jumping to
re_enumerate_no_bos in the case(s) that hub_port_init doesn't allocate a
new one.

Hopefully the analysis is correct and can help towards a patch.  Let me
know if any additional information from the crash or additional tracing
would be helpful.

Thanks,

-- Joe
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux