On 31.01.2019 12:42, Jan-Marek Glogowski wrote:
Am 30.01.19 um 17:56 schrieb Jan-Marek Glogowski:
Am 30.01.19 um 15:58 schrieb Mathias Nyman:
On 30.01.2019 14:38, Jan-Marek Glogowski wrote:
The hub->event bits are set when roothub polling timer function gets data from
hub_status_data(), writes the data to a status urb, and returns the urb.
Normal operation:
usb_hcd_poll_rh_status() // roothub polling timer function
hcd->driver->hub_status_data(hcd, buffer) -> xhci_hub_status_data()
xhci_hub_status_data() //returns buffer with bits set if a port needs attention
if (buffer && urb)
memcpy(urb->transfer_buffer, buffer) // copy buffer bits to urb
usb_hcd_giveback_urb(urb) -> hub_irq // calls urb complete callback function
hub_irq()
hub->event_bits = urb->transfer_buffer //
mod_timer(usb_hcd_poll_rh_status, future) // reschedule roothub polling timer
I'm guessing usb_hcd_poll_rh_status() never properly completes a URB in your case.
This could be possible if roothub is polled before properly resumed, then the URB wouldn't
be available yet, and no hub->event_bits are set.
Because no bits are set the hub starts suspending again after resume, probably before roothub
timer polls roothub again.
suspend fails when it manually checks hcd->driver->hub_status_data() for activity, and
it start resuming again, and continues this in a loop.
Either roothub polling is stopped at the suspend attempt, or maybe continuously rescheduled
into the future because as a part of suspend attempt/resume loop.
So as you expected the status_urb is always NULL.
I tried to understand more of the code by adding some more debug message.
I looked at usb_hcd_resume_root_hub, which is called when we see the "resume root hub" dmesg.
This just queues the hcd->wakeup_work, so indeed, if usb_hcd_poll_rh_status is later called,
the hub might not have resumed yet and polling starts. So my simple approach was to add a
function to call flush_work(&hcd->wakeup_work) before usb_hcd_poll_rh_status. All variants I
tested killed the kernel. I also tried to add a udelay to no avail.
I think the idea is that if usb_hcd_poll_rh_status() finds a port that needs attention, but
the is URB missing, then a HCD_FLAG_POLL_PENDING flag is set.
This flag is checked in resume when submitting the hub URB, and idea is to react on it:
hub_activate(HUB_RESUME)
usb_submit_urb(hub->urb)
rh_urb_enqueue()
rh_queue_status()
hcd->status_urb = urb;
if (HCD_POLL_PENDING(hcd)
mod_timer(&hcd->rh_timer, jiffies);
It tries to trigger usb_hcd_poll_rh_status() immediately by modifying the timer,
but from your logs it seems that it's not fast enough:
[ 485.301806] usbcore:hub_resume: hub 2-0:1.0: hub_resume
[ 485.301819] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: get port status, actual port 0 status = 0x2a0
...
[ 485.301936] usbcore:hub_event: hub 2-0:1.0: state 7 ports 6 chg 0000 evt 0000
[ 485.301952] xhci_hcd:xhci_hub_control: xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 0 status = 0xe0002a0
...
[ 485.302034] usbcore:hub_suspend: hub 2-0:1.0: hub_suspend
[ 485.302048] usbcore:hcd_bus_suspend: usb usb2: bus auto-suspend, wakeup 1
[ 485.302067] usbcore:hcd_bus_suspend: usb usb2: suspend raced with wakeup event
[ 485.302071] usbcore:hcd_bus_resume: usb usb2: usb auto-resume
[ 485.309765] usb usb2: usb_hcd_poll_rh_status, length 1, hcd->flags 0x18d, status_urb NULL
hcd->flags 0x18d means HCD_FLAG_POLL_PENDING was set, but hub_event() workqueue and hub_suspend()
are run before polling the root hub, and I guess the URB gets removed at some point during suspsend.
This then goes on in a loop.
You could try running usb_hcd_poll_rh_status() direcly instead of just kicking the timer and see if
it works:
@@ -822,8 +826,11 @@ static int rh_queue_status (struct usb_hcd *hcd, struct urb *urb)
mod_timer(&hcd->rh_timer, (jiffies/(HZ/4) + 1) * (HZ/4));
/* If a status change has already occurred, report it ASAP */
- else if (HCD_POLL_PENDING(hcd))
- mod_timer(&hcd->rh_timer, jiffies);
+ else if (HCD_POLL_PENDING(hcd)) {
+ spin_unlock_irqrestore (&hcd_root_hub_lock, flags);
+ usb_hcd_poll_rh_status(hcd);
+ return 0;
+ }
retval = 0;
done:
spin_unlock_irqrestore (&hcd_root_hub_lock, flags);
I didn't check locking or running contexts if it's ok to do this, there's could be a reason
why it wasn't done like this in the first place.
Worth a shot still.
-Mathias