On Tue, 9 Sep 2014, Joe Lawrence wrote: > Hello linux-usb, > > I've been testing surprise device hotlug removal with RHEL7 on Stratus > hardware (fully redundant PCI branches) and ran into a crashing NULL-ptr > bug during device initialization. The code looks the same upstream, so > I'm reporting it here. ... > Possible order of events and commentary: > > *** USB host controller inserted *** > > [ khubd ] [ hotplug driver ] > ... > hub_probe > hub_configure > > /* setup maxchild */ > > hdev->maxchild = > hub->descriptor->bNbrPorts; > > usb_get_status > ... > usb_start_wait_urb > zzzzzz... > > *** USB host controller is removed *** > > [ khubd ] [ hotplug driver ] > > pci_stop_bus_device > ... > usb_hcd_pci_remove > usb_hcd_irq > ehci_irq > usb_hc_died > usb_set_device_state > recursively_mark_NOTATTACHED > > /* iterate over all maxchild */ > /* assume ports[x] are allocated */ > > for (i = 0; i < udev->maxchild; ++i) { > if (hub->ports[i]->child) > > *** crash on hub->ports[i]-> dereference *** > > recursively_mark_NOTATTACHED(hub->ports[i]->child); > } > > > [ khubd ] > > /* port_dev would have been > allocated here */ > > usb_hub_create_port_device > port_dev = kzalloc(...) > hub->ports[port1 - 1] = port_dev > > > In summary, khubd has initialized the usb_device maxchild to 8 and > provided backing-store for the usb_hub ports[] array. However, before > it gets to fill in pointers for each port[] entry, the device is removed > and the hotplug driver issues a pci_stop_bus_device. This removal > percolates all the way down to ehci_irq and usb_hc_died. When that goes > to mark the device as not attached, it expects that the ports[] pointers > are valid... kaboom. > > I've been testing the following patch with the RHEL7 kernel with no > crashes. I'm not sure if it completely solves the issue at hand, or > simply covers up the crash. Comments welcome. This has been fixed in the mainline kernel by commit d8521afe3586, which is part of a large series involving port-power control for USB-3 ports. You may want to apply just the portion that is relevant to this problem, namely, have hub_configure store the maxchild value in a local variable and don't assign it to hdev->maxchild until the port devices have safely been created. > Regards, > > -- Joe > > -->8-- -->8-- > > From abb49e02e2f56ed1528198dfe242a9dd3041dc79 Mon Sep 17 00:00:00 2001 > From: Joe Lawrence <joe.lawrence@xxxxxxxxxxx> > Date: Fri, 5 Sep 2014 15:02:29 -0400 > Subject: [PATCH] usb: hub: protect recursively_mark_NOTATTACHED and > half-initialized device > > The recursively_mark_NOTATTACHED assumes that hub->ports[] contains valid > pointers. If a device was removed after udev->maxchild was set, but before > hub->ports[] was populated, recursively_mark_NOTATTACHED may try to reach > through an uninitialized pointer. Since the backing memory was kzalloc'd, > verify that the pointer is not-NULL before dereferencing. No, this is not a good fix. The assumption in recursively_mark_NOTATTACHED is valid, and the problem was in the code that violated the assumption. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html