On Tue, Sep 13, 2022, Elson Serrao wrote: > > > On 8/25/2022 6:30 PM, Thinh Nguyen wrote: > > On Tue, Aug 23, 2022, Elson Serrao wrote: > > > On 8/22/2022 6:01 PM, Thinh Nguyen wrote: > > > > On Thu, Aug 18, 2022 at 11:17:24AM -0700, Elson Serrao wrote: > > > > > > > > > > > > > > > On 8/16/2022 4:51 PM, Thinh Nguyen wrote: > > > > > > On 8/16/2022, Elson Serrao wrote: > > > > > > > > > > > > > > > > > > > > > On 8/12/2022 5:46 PM, Thinh Nguyen wrote: > > > > > > > > On 8/11/2022, Thinh Nguyen wrote: > > > > > > > > > On 8/11/2022, Thinh Nguyen wrote: > > > > > > > > > > On 8/11/2022, Elson Serrao wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 8/9/2022 6:08 PM, Thinh Nguyen wrote: > > > > > > > > > > > > > > > > > > > > <snip> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > To summarize the points: > > > > > > > > > > > > > > > > > > > > > > > > 1) The host only arms function remote wakeup if the device is > > > > > > > > > > > > capable of > > > > > > > > > > > > remote wakeup (check USB_CONFIG_ATT_WAKEUP in bmAttributes and > > > > > > > > > > > > hardware > > > > > > > > > > > > capability) > > > > > > > > > > > > > > > > > > > > > > > > 2) If the device is in suspend, the device can do remote wakeup > > > > > > > > > > > > (through > > > > > > > > > > > > LFPS handshake) if the function is armed for remote wakeup (through > > > > > > > > > > > > SET_FEATURE(FUNC_SUSPEND)). > > > > > > > > > > > > > > > > > > > > > > > > 3) If the link transitions to U0 after the device triggering a remote > > > > > > > > > > > > wakeup, the device will also send device notification function > > > > > > > > > > > > wake for > > > > > > > > > > > > all the interfaces armed with remote wakeup. > > > > > > > > > > > > > > > > > > > > > > > > 4) If the device is not in suspend, the device can send device > > > > > > > > > > > > notification function wake if it's in U0. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Now, remote wakeup and function wake device notification are 2 > > > > > > > > > > > > separate > > > > > > > > > > > > operations. We have the usb_gadget_ops->wakeup() for remote wakeup. I > > > > > > > > > > > > suggested to maybe add > > > > > > > > > > > > usb_gadget_ops->send_wakeup_notification(gadget, > > > > > > > > > > > > intf_id) for the device notification. What you did was combining both > > > > > > > > > > > > operations in usb_gadget_ops->func_wakeup(). That may only work for > > > > > > > > > > > > point 4) (assuming you fix the U0 check), but not point 3). > > > > > > > > > > > > > > > > > > > > > > Thank you for your feedback and summary. I will rename func_wakeup to > > > > > > > > > > > send_wakeup_notification to better align with the approach. The > > > > > > > > > > > reason I > > > > > > > > > > > have combined remote_wakeup and function wake notification in > > > > > > > > > > > usb_gadget_ops->func_wakeup() is because since the implementation > > > > > > > > > > > is at > > > > > > > > > > > function/composite level it has no knowledge on the link state. So I > > > > > > > > > > > have delegated that task to controller driver to handle the > > > > > > > > > > > notification > > > > > > > > > > > accordingly. That is do a LFPS handshake first if the device is > > > > > > > > > > > suspended and then send notification (explained below). But we can > > > > > > > > > > > definitely separate this by adding an additional flag in the composite > > > > > > > > > > > layer to set the link state based on the gadget suspend callback > > > > > > > > > > > called > > > > > > > > > > > when U3 SUSPEND interrupt is received. Let me know if you feel > > > > > > > > > > > separating the two is a better approach. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The reason I think we need to separate it is because of point 3. As I > > > > > > > > > > note earlier, the spec states that "If remote wake event occurs in > > > > > > > > > > multiple functions, each function shall send a Function Wake > > > > > > > > > > Notification." > > > > > > > > > > > > > > > > > > > > But if there's no remote wake event, and the host brought the device up > > > > > > > > > > instead, then the function suspend state is retained. > > > > > > > > > > > > > > > > > > > > If we separate these 2 operations, the caller can check whether the > > > > > > > > > > operation went through properly. For example, if the wakeup() is > > > > > > > > > > initiated properly, but the function wake device notification didn't go > > > > > > > > > > through. We would only need to resend the device notification rather > > > > > > > > > > than initiate remote wakeup again. > > > > > > > > > > > > > > > > > > If we don't have to send device notification for other interfaces, we > > > > > > > > > can combine the operations here as you did. > > > > > > > > > > > > > > > > > > > > > > > > > I still think it's better to split up the operations. The way you're > > > > > > > > handling it now is not clear. > > > > > > > > > > > > > > > > If the func_awake() returns -EAGAIN, I'd expect that the remote wake did > > > > > > > > not go through and expect user to retry again. But here it does initiate > > > > > > > > remote wake, but it just does not send device notification yet. This is > > > > > > > > confusing. > > > > > > > > > > > > > > > > Also, instead of all the function wake handling coming from the function > > > > > > > > driver, now we depend on the controller driver to call function resume() > > > > > > > > on state change to U0, which will trigger device notification. What > > > > > > > > happen if it doesn't call resume(). There's too many dependencies and it > > > > > > > > seems fragile. > > > > > > > > > > > > > > > > I think all this can be handled in the function driver. You can fix the > > > > > > > > dwc3 wakeup() and poll for U0/ON state rather than RECOVERY state, which > > > > > > > > is what it's supposed to poll. > > > > > > > > > > > > > > For transitioning from U3 to U0, the current upstream implementation is > > > > > > > to poll for U0 state when dwc3_gadget_wakeup() is called and it is a > > > > > > > blocking call. (this is a common API for both HS and SS) > > > > > > > > > > > > > > /* poll until Link State changes to ON */ > > > > > > > retries = 20000; > > > > > > > > > > > > > > while (retries--) { > > > > > > > reg = dwc3_readl(dwc->regs, DWC3_DSTS); > > > > > > > > > > > > > > /* in HS, means ON */ > > > > > > > if (DWC3_DSTS_USBLNKST(reg) == DWC3_LINK_STATE_U0) > > > > > > > break; > > > > > > > } > > > > > > > > > > > > > > In my experiments I found that certain hosts take longer time to drive > > > > > > > HS resume signalling between the remote wakeup and the resume K and this > > > > > > > time varies across hosts. Hence the above polling timer is not generic > > > > > > > across hosts. So how do we converge on a polling timer value to work > > > > > > > across HS/SS and without blocking the system for a long time? > > > > > > > > > > > > Can't we take the upper limit of both base on experiment? And it > > > > > > shouldn't be blocking the whole system. > > > > > > > > > > On the host I was experimenting with, the time it took was around 110ms in > > > > > HS case. That would translate to a retry count of about ~181,000 in the > > > > > above polling loop. Wouldn't that be a very large value for polling? > > > > > And not sure if there are other hosts that take even longer time > > > > > > > > We don't want to poll that many times. We shouldn't depend on the delay > > > > of a register read for polling interval. Can't we just sleep in between > > > > interval at a reasonable interval. > > > > > > > > > > Sleeping is not an option as the upper layers (those beyond > > > function/composite layer) may transmit data with a lock held. > > > > > > > You can use mdelay() if it can't sleep. But if the wait is long, it > > should be allowed to sleep. > > > > > I ran into below BUG when remote wakeup was triggered via a ping() command > > > and attempted sleep while polling > > > > > > [ 88.676789][ T392] BUG: scheduling while atomic > > > [ 88.900112][ T392] Call trace: > > > <snip> > > > [ 88.912760][ T392] __schedule_bug+0x90/0x188 > > > [ 88.917335][ T392] __schedule+0x714/0xb4c > > > [ 88.930568][ T392] schedule+0x110/0x204 > > > [ 88.943620][ T392] schedule_timeout+0x94/0x134 > > > [ 88.948371][ T392] __dwc3_gadget_wakeup+0x1ac/0x558 > > > [ 88.953558][ T392] dwc3_gadget_wakeup+0x3c/0x8c > > > [ 88.958388][ T392] usb_gadget_wakeup+0x54/0x1a8 > > > [ 88.963222][ T392] eth_start_xmit+0x130/0x830 > > > [ 88.967876][ T392] xmit_one+0xf0/0x364 > > > [ 88.971913][ T392] sch_direct_xmit+0x188/0x3e4 > > > [ 88.976663][ T392] __dev_xmit_skb+0x480/0x984 > > > [ 88.981319][ T392] __dev_queue_xmit+0x2d4/0x7d8 > > > [ 88.986151][ T392] neigh_resolve_output+0x208/0x2f0 > > > <snip> > > > > > > The above experiment was done by removing spin_locks if any in the wakeup() > > > path of function/composite/controller drivers. It is running in atomic > > > context due to the lock held by linux networking stack/generic packet > > > scheduler. > > > > > > So below are the only two approaches I can think of for dwc3_gadget_wakeup() > > > API > > > > > > 1.)Polling based approach: We poll until the link comes up. But cannot sleep > > > while polling due to above reasons. > > > > > > 2.)Interrupt based approach (the patches being discussed currently): When a > > > remote wakeup is triggered enable link state interrupts and return right > > > away. The function/composite drivers are later notified about the ON event > > > via resume callback (in a similar way how we notify U3 to composite layer > > > via suspend callback). > > > > > > Please let me know if there is any alternate way that you can think of here. > > > > > > > The main issue we're trying to solve is knowing if the host had woken up > > and the device notification is sent so that the function driver can > > resume(). > > > > If we can say that upon usb_gadget_wakeup() successful completion, the > > link is in U0/ON, then the function driver can send the wake > > notification after and resume(). That is, we're trying to make > > usb_gadget_wakeup() synchronous. Whether it's polling or interrupt > > based, it's implementation detail. > > > > Unfortunately, the API isn't very clear whether usb_gadget_wakeup() may > > sleep or synchronous. > > > > Here are 3 approaches that we can have: > > > > 1) Clarify that usb_gadget_wakeup() is synchronous and the link will be > > in U0/ON upon successful completion, and clarify whether it can sleep. > > Introduce a separate API usb_gadget_send_wake_notification() to send > > wake notification separately. > > > > 2) Create a new API usb_gadget_function_wakeup() that will combine both > > device wakeup and wake notification. The function can sleep, > > synchronous, and both wakeup and wake notification are done after the > > function completes. > > > > 3) Create a new API usb_gadget_function_wakeup() that will combine both > > device wakeup and wake notification. The function is asynchronous. We > > won't know if the wakeup is successfully sent, but we don't care and > > proceed with the function proceed with resume() anyway. > > > > BR, > > Thinh > > Thank you for your suggestions. > For handling function wakeup (applicable to enhanced super-speed) will > implement a separate API usb_gadget_function_wakeup() that combines > device-wakeup and wake-notification like below in dwc3 driver and operates > synchronously. > > dwc3_gadget_func_wakeup() > { > if (link in U3) > Call dwc3_gadget_wakeup() > Poll for U0 > > > If U0 successful, send wake_notification > > } > > Once the function completes both device wake and func wakeup notification > are done. > > For high speed use-cases when usb_gadget_wakeup() is called from function > drivers, considering the possibility of significant delay associated with > remote wakeup, dwc3_gadget_wakeup() should operate asynchronously. > i.e rely on link status change events rather than sleeping/polling. > > Please let me know if there are any concerns. If not will upload new patch > sets with this change and other changes suggested. > That sounds good to me. Thanks, Thinh