Hi, On 09/28/2014 04:29 PM, Alan Stern wrote: > On Sat, 27 Sep 2014, Hans de Goede wrote: > >> This commit fixes the following oops: >> >> [10238.622067] scsi host3: uas_eh_bus_reset_handler start >> [10240.766164] usb 3-4: reset SuperSpeed USB device number 3 using xhci_hcd >> [10245.779365] usb 3-4: device descriptor read/8, error -110 >> [10245.883331] usb 3-4: reset SuperSpeed USB device number 3 using xhci_hcd >> [10250.897603] usb 3-4: device descriptor read/8, error -110 >> [10251.058200] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040 >> [10251.058244] IP: [<ffffffff815ac6e1>] xhci_check_streams_endpoint+0x91/0x140 >> <snip> >> [10251.059473] Call Trace: >> [10251.059487] [<ffffffff815aca6c>] xhci_calculate_streams_and_bitmask+0xbc/0x130 >> [10251.059520] [<ffffffff815aeb5f>] xhci_alloc_streams+0x10f/0x5a0 >> [10251.059548] [<ffffffff810a4685>] ? check_preempt_curr+0x75/0xa0 >> [10251.059575] [<ffffffff810a46dc>] ? ttwu_do_wakeup+0x2c/0x100 >> [10251.059601] [<ffffffff810a49e6>] ? ttwu_do_activate.constprop.111+0x66/0x70 >> [10251.059635] [<ffffffff815779ab>] usb_alloc_streams+0xab/0xf0 >> [10251.059662] [<ffffffffc0616b48>] uas_configure_endpoints+0x128/0x150 [uas] >> [10251.059694] [<ffffffffc0616bac>] uas_post_reset+0x3c/0xb0 [uas] >> [10251.059722] [<ffffffff815727d9>] usb_reset_device+0x1b9/0x2a0 >> [10251.059749] [<ffffffffc0616f42>] uas_eh_bus_reset_handler+0xb2/0x190 [uas] >> [10251.059781] [<ffffffff81514293>] scsi_try_bus_reset+0x53/0x110 >> [10251.059808] [<ffffffff815163b7>] scsi_eh_bus_reset+0xf7/0x270 >> <snip> >> >> The problem is the following call sequence (simplified): >> >> 1) usb_reset_device >> 2) usb_reset_and_verify_device >> 2) hub_port_init >> 3) hub_port_finish_reset >> 3) xhci_discover_or_reset_device >> This frees xhci->devs[slot_id]->eps[ep_index].ring for all eps but 0 >> 4) usb_get_device_descriptor >> This fails >> 5) hub_port_init fails >> 6) usb_reset_and_verify_device fails, does not restore device config >> 7) uas_post_reset >> 8) xhci_alloc_streams >> NULL deref on the free-ed ring >> >> The real problem here is that xhci_discover_or_reset_device frees the rings >> during a reset, and if usb_reset_and_verify_device then fails to restore the >> device configuration (and thus re-alloc the rings), that we're then left with >> a device which from the usb_device_driver's pov is still functional (its >> disconnect method will get called when khub gets around to it), but in >> reality is not. >> >> This commit adds a check for this condition to xhci_check_args(), which should >> cover all public entry points into the xhci driver. > > Wouldn't it be better to make usbcore check that the device state is > CONFIGURED (or at least, isn't NOTATTACHED) before asking the HCD to > allocate or release any streams? That will not help, since khubd is the one changing the state when usb_reset_and_verify_device fails, and khubd may not yet have a chance to run when this happens. Also this can happen on normal urb submission after a (failed) reset too. Regards, Hans -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html