Am 31.01.19 um 15:56 schrieb Mathias Nyman: > On 31.01.2019 12:42, Jan-Marek Glogowski wrote: >> Am 30.01.19 um 17:56 schrieb Jan-Marek Glogowski: >>> Am 30.01.19 um 15:58 schrieb Mathias Nyman: >>>> On 30.01.2019 14:38, Jan-Marek Glogowski wrote: >>>> >>>> The hub->event bits are set when roothub polling timer function gets data from >>>> hub_status_data(), writes the data to a status urb, and returns the urb. >>>> >>>> Normal operation: >>>> usb_hcd_poll_rh_status() // roothub polling timer function >>>> hcd->driver->hub_status_data(hcd, buffer) -> xhci_hub_status_data() >>>> xhci_hub_status_data() //returns buffer with bits set if a port needs attention >>>> if (buffer && urb) >>>> memcpy(urb->transfer_buffer, buffer) // copy buffer bits to urb >>>> usb_hcd_giveback_urb(urb) -> hub_irq // calls urb complete callback function >>>> hub_irq() >>>> hub->event_bits = urb->transfer_buffer // >>>> mod_timer(usb_hcd_poll_rh_status, future) // reschedule roothub polling timer >>>> >>>> I'm guessing usb_hcd_poll_rh_status() never properly completes a URB in your case. >>>> >>>> This could be possible if roothub is polled before properly resumed, then the URB wouldn't >>>> be available yet, and no hub->event_bits are set. >>>> Because no bits are set the hub starts suspending again after resume, probably before roothub >>>> timer polls roothub again. >>>> suspend fails when it manually checks hcd->driver->hub_status_data() for activity, and >>>> it start resuming again, and continues this in a loop. >>>> >>>> Either roothub polling is stopped at the suspend attempt, or maybe continuously rescheduled >>>> into the future because as a part of suspend attempt/resume loop. >> >> So as you expected the status_urb is always NULL. >> >> I tried to understand more of the code by adding some more debug message. >> >> I looked at usb_hcd_resume_root_hub, which is called when we see the "resume root hub" dmesg. >> This just queues the hcd->wakeup_work, so indeed, if usb_hcd_poll_rh_status is later called, >> the hub might not have resumed yet and polling starts. So my simple approach was to add a >> function to call flush_work(&hcd->wakeup_work) before usb_hcd_poll_rh_status. All variants I >> tested killed the kernel. I also tried to add a udelay to no avail. > > I think the idea is that if usb_hcd_poll_rh_status() finds a port that needs attention, but > the is URB missing, then a HCD_FLAG_POLL_PENDING flag is set. > > This flag is checked in resume when submitting the hub URB, and idea is to react on it: > > hub_activate(HUB_RESUME) > usb_submit_urb(hub->urb) > rh_urb_enqueue() > rh_queue_status() > hcd->status_urb = urb; > if (HCD_POLL_PENDING(hcd) > mod_timer(&hcd->rh_timer, jiffies); > > It tries to trigger usb_hcd_poll_rh_status() immediately by modifying the timer, > but from your logs it seems that it's not fast enough: > > [ 485.301806] usbcore:hub_resume: hub 2-0:1.0: hub_resume > [ 485.301819] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 0 status = 0x2a0 > ... > [ 485.301936] usbcore:hub_event: hub 2-0:1.0: state 7 ports 6 chg 0000 evt 0000 > [ 485.301952] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 0 status = 0xe0002a0 > ... > [ 485.302034] usbcore:hub_suspend: hub 2-0:1.0: hub_suspend > [ 485.302048] usbcore:hcd_bus_suspend: usb usb2: bus auto-suspend, wakeup 1 > [ 485.302067] usbcore:hcd_bus_suspend: usb usb2: suspend raced with wakeup event > [ 485.302071] usbcore:hcd_bus_resume: usb usb2: usb auto-resume > [ 485.309765] usb usb2: usb_hcd_poll_rh_status, length 1, hcd->flags 0x18d, status_urb NULL > > hcd->flags 0x18d means HCD_FLAG_POLL_PENDING was set, but hub_event() workqueue and hub_suspend() > are run before polling the root hub, and I guess the URB gets removed at some point during suspsend. > > This then goes on in a loop. > > You could try running usb_hcd_poll_rh_status() direcly instead of just kicking the timer and see if > it works: > > @@ -822,8 +826,11 @@ static int rh_queue_status (struct usb_hcd *hcd, struct urb *urb) > mod_timer(&hcd->rh_timer, (jiffies/(HZ/4) + 1) * (HZ/4)); > > /* If a status change has already occurred, report it ASAP */ > - else if (HCD_POLL_PENDING(hcd)) > - mod_timer(&hcd->rh_timer, jiffies); > + else if (HCD_POLL_PENDING(hcd)) { > + spin_unlock_irqrestore (&hcd_root_hub_lock, flags); > + usb_hcd_poll_rh_status(hcd); > + return 0; > + } > retval = 0; > done: > spin_unlock_irqrestore (&hcd_root_hub_lock, flags); > > > I didn't check locking or running contexts if it's ok to do this, there's could be a reason > why it wasn't done like this in the first place. > Worth a shot still. Yup - that also works. That still has the first status_urb NULL and one resume -> suspend race. I just kept your dev_err in addition. [18818.679289] xhci_hcd:handle_port_status: xhci_hcd 0000:00:14.0: Port Status Change Event for port 20 [18818.679299] xhci_hcd:handle_port_status: xhci_hcd 0000:00:14.0: resume root hub [18818.679314] xhci_hcd:handle_port_status: xhci_hcd 0000:00:14.0: handle_port_status: starting port polling. [18818.679329] usb usb2: usb_hcd_poll_rh_status, length 1, hcd->flags 0x195, status_urb NULL [18818.679387] usbcore:usb_remote_wakeup: usb usb2: usb wakeup-resume [18818.679395] usbcore:hcd_bus_resume: usb usb2: usb auto-resume [18818.679416] usbcore:hub_resume: hub 2-0:1.0: hub_resume [18818.679428] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 0 status = 0x2a0 [18818.679439] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.679476] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 1 status = 0x2a0 [18818.679485] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.679499] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 2 status = 0x2a0 [18818.679503] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.679518] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x4002c0 [18818.679522] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x4002c0 [18818.679538] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 4 status = 0x2a0 [18818.679546] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.679561] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 5 status = 0x2a0 [18818.679565] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.679586] usb usb2: usb_hcd_poll_rh_status, length 1, hcd->flags 0x1ad, status_urb Exists [18818.679602] usbcore:hub_event: hub 2-0:1.0: state 7 ports 6 chg 0000 evt 0000 [18818.679619] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 0 status = 0xe0002a0 [18818.679645] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 1 status = 0xe0002a0 [18818.679659] usbcore:hub_event: hub 2-0:1.0: state 7 ports 6 chg 0000 evt 0010 [18818.679677] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 2 status = 0xe0002a0 [18818.679695] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 3 status = 0xe4002c0 [18818.679712] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 4 status = 0xe0002a0 [18818.679729] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 5 status = 0xe0002a0 [18818.679744] usbcore:hub_suspend: hub 2-0:1.0: hub_suspend [18818.679762] usbcore:hcd_bus_suspend: usb usb2: bus auto-suspend, wakeup 1 [18818.679784] usbcore:hcd_bus_suspend: usb usb2: suspend raced with wakeup event [18818.679789] usbcore:hcd_bus_resume: usb usb2: usb auto-resume [18818.698146] usbcore:hub_resume: hub 2-0:1.0: hub_resume [18818.698159] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 0 status = 0x2a0 [18818.698164] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.698188] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 1 status = 0x2a0 [18818.698192] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.698206] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 2 status = 0x2a0 [18818.698210] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.698224] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x4002c0 [18818.698229] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x4002c0 [18818.698243] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 4 status = 0x2a0 [18818.698247] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.698263] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 5 status = 0x2a0 [18818.698268] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2a0 [18818.698294] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x4002c0 [18818.698298] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x4002c0 [18818.698310] usbcore:port_event: usb usb2-port4: link state change [18818.698320] xhci_hcd:xhci_clear_port_change_bit: xhci_hcd 0000:00:14.0: clear port link state change, actual port 3 status = 0x2c0 [18818.698330] usbcore:port_event: usb usb2-port4: do warm reset [18818.758189] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x2b0 [18818.758195] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x2b0 [18818.758222] usbcore:hub_port_wait_reset: usb usb2-port4: not warm reset yet, waiting 50ms [18818.762303] xhci_hcd:xhci_hub_status_data: xhci_hcd 0000:00:14.0: xhci_hub_status_data: stopping port polling. [18818.762312] usb usb2: usb_hcd_poll_rh_status, length 0, hcd->flags 0x1a1, status_urb Exists [18818.804070] xhci_hcd:handle_port_status: xhci_hcd 0000:00:14.0: Port Status Change Event for port 20 [18818.804080] xhci_hcd:handle_port_status: xhci_hcd 0000:00:14.0: handle_port_status: starting port polling. [18818.804095] usb usb2: usb_hcd_poll_rh_status, length 1, hcd->flags 0x1a5, status_urb Exists [18818.818339] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x2a1203 [18818.818345] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x310203 [18818.818387] xhci_hcd:xhci_clear_port_change_bit: xhci_hcd 0000:00:14.0: clear port reset change, actual port 3 status = 0xa1203 [18818.818404] xhci_hcd:xhci_clear_port_change_bit: xhci_hcd 0000:00:14.0: clear port warm(BH) reset change, actual port 3 status = 0x21203 [18818.818422] xhci_hcd:xhci_clear_port_change_bit: xhci_hcd 0000:00:14.0: clear port link state change, actual port 3 status = 0x21203 [18818.818435] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x21203 [18818.818439] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x10203 [18818.878337] usbcore:hub_event: hub 2-0:1.0: state 7 ports 6 chg 0000 evt 0010 [18818.878355] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x21203 [18818.878360] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x10203 [18818.878392] xhci_hcd:xhci_clear_port_change_bit: xhci_hcd 0000:00:14.0: clear port connect change, actual port 3 status = 0x1203 [18818.878406] usbcore:hub_port_connect_change: usb usb2-port4: status 0203, change 0001, 10.0 Gb/s [18818.878414] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x1203 [18818.878418] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x203 [18818.914324] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x1203 [18818.914330] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x203 [18818.950175] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x1203 [18818.950181] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x203 [18818.986348] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x1203 [18818.986354] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x203 [18819.010300] xhci_hcd:xhci_hub_status_data: xhci_hcd 0000:00:14.0: xhci_hub_status_data: stopping port polling. [18819.010309] usb usb2: usb_hcd_poll_rh_status, length 0, hcd->flags 0x1a1, status_urb Exists [18819.022316] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x1203 [18819.022322] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x203 [18819.022349] usbcore:hub_port_debounce: usb usb2-port4: debounce total 100ms stable 100ms status 0x203 [18819.022360] xhci_hcd:xhci_ring_cmd_db: xhci_hcd 0000:00:14.0: // Ding dong! [18819.022457] xhci_hcd:xhci_alloc_virt_device: xhci_hcd 0000:00:14.0: Slot 1 output ctx = 0x44ea02000 (dma) [18819.022463] xhci_hcd:xhci_alloc_virt_device: xhci_hcd 0000:00:14.0: Slot 1 input ctx = 0x458568000 (dma) [18819.022475] xhci_hcd:xhci_alloc_virt_device: xhci_hcd 0000:00:14.0: Set slot id 1 dcbaa entry 0000000095bf5995 to 0x44ea02000 [18819.022522] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x1203 [18819.022526] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x203 [18819.022546] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: set port reset, actual port 3 status = 0x1311 [18819.022554] xhci_hcd:handle_port_status: xhci_hcd 0000:00:14.0: Port Status Change Event for port 20 [18819.022559] xhci_hcd:handle_port_status: xhci_hcd 0000:00:14.0: handle_port_status: starting port polling. [18819.022572] usb usb2: usb_hcd_poll_rh_status, length 0, hcd->flags 0x1a5, status_urb Exists [18819.090325] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x201203 [18819.090331] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x100203 [18819.090366] xhci_hcd:xhci_clear_port_change_bit: xhci_hcd 0000:00:14.0: clear port reset change, actual port 3 status = 0x1203 [18819.090382] xhci_hcd:xhci_clear_port_change_bit: xhci_hcd 0000:00:14.0: clear port warm(BH) reset change, actual port 3 status = 0x1203 [18819.090401] xhci_hcd:xhci_clear_port_change_bit: xhci_hcd 0000:00:14.0: clear port link state change, actual port 3 status = 0x1203 [18819.090415] xhci_hcd:xhci_clear_port_change_bit: xhci_hcd 0000:00:14.0: clear port connect change, actual port 3 status = 0x1203 [18819.090430] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 3 status = 0x1203 [18819.090434] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: Get port status returned 0x203 [18819.150399] xhci_hcd:xhci_setup_addressable_virt_dev: xhci_hcd 0000:00:14.0: Set root hub portnum to 20 [18819.150405] xhci_hcd:xhci_setup_addressable_virt_dev: xhci_hcd 0000:00:14.0: Set fake root hub portnum to 4 [18819.150410] xhci_hcd:xhci_setup_addressable_virt_dev: xhci_hcd 0000:00:14.0: udev->tt = (null) [18819.150415] xhci_hcd:xhci_setup_addressable_virt_dev: xhci_hcd 0000:00:14.0: udev->ttport = 0x0 [18819.150422] xhci_hcd:xhci_ring_cmd_db: xhci_hcd 0000:00:14.0: // Ding dong! [18819.150479] xhci_hcd:xhci_dbg_trace: xhci_hcd 0000:00:14.0: Successful setup address command [18819.150488] xhci_hcd:xhci_dbg_trace: xhci_hcd 0000:00:14.0: Op regs DCBAA ptr = 0x00000457c1b000 [18819.150495] xhci_hcd:xhci_dbg_trace: xhci_hcd 0000:00:14.0: Slot ID 1 dcbaa entry @0000000095bf5995 = 0x0000044ea02000 [18819.150500] xhci_hcd:xhci_dbg_trace: xhci_hcd 0000:00:14.0: Output Context DMA address = 0x44ea02000 [18819.150505] xhci_hcd:xhci_dbg_trace: xhci_hcd 0000:00:14.0: Internal device address = 0 [18819.150514] usb 2-4: new SuperSpeed Gen 1 USB device number 2 using xhci_hcd [18819.170820] usbcore:usb_parse_endpoint: usb 2-4: skipped 1 descriptor after endpoint [18819.170826] usbcore:usb_parse_endpoint: usb 2-4: skipped 1 descriptor after endpoint [18819.170858] xhci_hcd:process_ctrl_td: xhci_hcd 0000:00:14.0: Waiting for status stage event [18819.170904] usbcore:usb_get_langid: usb 2-4: default language 0x0409 [18819.170940] xhci_hcd:process_ctrl_td: xhci_hcd 0000:00:14.0: Waiting for status stage event [18819.171018] xhci_hcd:process_ctrl_td: xhci_hcd 0000:00:14.0: Waiting for status stage event [18819.171098] xhci_hcd:process_ctrl_td: xhci_hcd 0000:00:14.0: Waiting for status stage event [18819.171147] usbcore:usb_new_device: usb 2-4: udev 2, busnum 2, minor = 129 [18819.171154] usb 2-4: New USB device found, idVendor=0781, idProduct=5596, bcdDevice= 1.00 [18819.171159] usb 2-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [18819.171163] usb 2-4: Product: Ultra T C [18819.171166] usb 2-4: Manufacturer: SanDisk [18819.171170] usb 2-4: SerialNumber: 4C530001090830111403 [18819.171397] usbcore:usb_probe_device: usb 2-4: usb_probe_device .... So is your patch a general solution? Can we get this as a bug fix into the kernel? Jan-Marek