On Thu, Mar 07, 2024, Michael Grzeschik wrote: > On Thu, Mar 07, 2024 at 01:57:44AM +0000, Thinh Nguyen wrote: > > On Tue, Feb 27, 2024, Michael Grzeschik wrote: > > > On Thu, Feb 22, 2024 at 01:20:04AM +0000, Thinh Nguyen wrote: > > > > On Thu, Feb 22, 2024, Michael Grzeschik wrote: > > > > > For #2: I found an issue in the handling of the completion of requests in > > > > > the started list. When the interrupt handler is *explicitly* calling > > > > > stop_active_transfer if the overall event of the request was an missed > > > > > event. This event value only represents the value of the request that > > > > > was actually triggering the interrupt. > > > > > > > > > > It also calls ep_cleanup_completed_requests and is iterating over the > > > > > started requests and will call giveback/complete functions of the > > > > > requests with the proper request status. > > > > > > > > > > So this will also catch missed requests in the queue. However, since > > > > > there might be, lets say 5 good requests and one missed request, what > > > > > will happen is, that each complete call for the first good requests will > > > > > enqueue new requests into the started list and will also call the > > > > > updatecmd on that transfer that was already missed until the loop will > > > > > reach the one request with the MISSED status bit set. > > > > > > > > > > So in my opinion the patch from Jeff makes sense when adding the > > > > > following change aswell. With those both changes the underruns and > > > > > broken frames finally disappear. I am still unsure about the complete > > > > > solution about that, since with this the mentioned 5 good requests > > > > > will be cancelled aswell. So this is still a WIP status here. > > > > > > > > > > > > > When the dwc3 driver issues stop_active_transfer(), that means that the > > > > started_list is empty and there is an underrun. > > > > > > At this moment this is only the case when both, pending and started list > > > are empty. Or the interrupt event was EXDEV. > > > > > > The main problem is that the function > > > dwc3_gadget_ep_cleanup_completed_requests(dep, event, status); will > > > issue an complete for each started request, which on the other hand will > > > refill the pending list, and therefor after that refill the > > > stop_active_transfer is currently never hit. > > > > > > > It treats the incoming requests as staled. However, for UVC, they are > > > > still "good". > > > > > > Right, so in that case we can requeue them anyway. But this will have to > > > be done after the stop transfer cmd has finished. > > > > > > > I think you can just check if the started_list is empty before queuing > > > > new requests. If it is, perform stop_active_transfer() to reschedule the > > > > incoming requests. None of the newly queue requests will be released > > > > yet since they are in the pending_list. > > > > > > So that is basically exactly what my patch is doing. However in the case > > > of an underrun it is not safe to call dwc3_gadget_ep_cleanup_completed_requests > > > as jeff stated. So his underlying patch is really fixing an issue here. > > > > What I mean is to actively check for started list on every > > usb_ep_queue() call. Checking during > > dwc3_gadget_ep_cleanup_completed_requests() is already too late. > > I see. > > > > > > > > For UVC, perhaps you can introduce a new flag to usb_request called > > > > "ignore_queue_latency" or something equivalent. The dwc3 is already > > > > partially doing this for UVC. With this new flag, we can rework dwc3 to > > > > clearly separate the expected behavior from the function driver. > > > > > > I don't know why this "extra" flag is even necessary. The code example > > > is already working without that extra flag. > > > > The flag is for controller to determine what kinds of behavior the > > function driver expects. My intention is if this extra flag is not set, > > the dwc3 driver will not attempt to reshcedule isoc request at all (ie. > > no stop_active_transfer()). > > Ok. > > > > > > > Actually I even came up with an better solution. Additionally of checking if > > > one of the requests in the started list was missed, we can activly check if > > > the trb ring did run dry and if dwc3_gadget_endpoint_trbs_complete is > > > going to enqueue in to the empty trb ring. > > > > > > So my whole change looks like that: > > > > > > diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h > > > index efe6caf4d0e87..2c8047dcd1612 100644 > > > --- a/drivers/usb/dwc3/core.h > > > +++ b/drivers/usb/dwc3/core.h > > > @@ -952,6 +952,7 @@ struct dwc3_request { > > > #define DWC3_REQUEST_STATUS_DEQUEUED 3 > > > #define DWC3_REQUEST_STATUS_STALLED 4 > > > #define DWC3_REQUEST_STATUS_COMPLETED 5 > > > +#define DWC3_REQUEST_STATUS_MISSED_ISOC 6 > > > #define DWC3_REQUEST_STATUS_UNKNOWN -1 > > > u8 epnum; > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > > > index 858fe4c299b7a..a31f4d3502bd3 100644 > > > --- a/drivers/usb/dwc3/gadget.c > > > +++ b/drivers/usb/dwc3/gadget.c > > > @@ -2057,6 +2057,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep) > > > req = next_request(&dep->cancelled_list); > > > dwc3_gadget_ep_skip_trbs(dep, req); > > > switch (req->status) { > > > + case 0: > > > + dwc3_gadget_giveback(dep, req, 0); > > > + break; > > > case DWC3_REQUEST_STATUS_DISCONNECTED: > > > dwc3_gadget_giveback(dep, req, -ESHUTDOWN); > > > break; > > > @@ -2066,6 +2069,9 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep) > > > case DWC3_REQUEST_STATUS_STALLED: > > > dwc3_gadget_giveback(dep, req, -EPIPE); > > > break; > > > + case DWC3_REQUEST_STATUS_MISSED_ISOC: > > > + dwc3_gadget_giveback(dep, req, -EXDEV); > > > + break; > > > default: > > > dev_err(dwc->dev, "request cancelled with wrong reason:%d\n", req->status); > > > dwc3_gadget_giveback(dep, req, -ECONNRESET); > > > @@ -3509,6 +3515,36 @@ static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep, > > > return ret; > > > } > > > +static int dwc3_gadget_ep_check_missed_requests(struct dwc3_ep *dep) > > > +{ > > > + struct dwc3_request *req; > > > + struct dwc3_request *tmp; > > > + int ret = 0; > > > + > > > + list_for_each_entry_safe(req, tmp, &dep->started_list, list) { > > > + struct dwc3_trb *trb; > > > + > > > + trb = req->trb; > > > + switch (DWC3_TRB_SIZE_TRBSTS(trb->size)) { > > > + case DWC3_TRBSTS_MISSED_ISOC: > > > + /* Isoc endpoint only */ > > > + ret = -EXDEV; > > > + break; > > > + case DWC3_TRB_STS_XFER_IN_PROG: > > > + /* Applicable when End Transfer with ForceRM=0 */ > > > + case DWC3_TRBSTS_SETUP_PENDING: > > > + /* Control endpoint only */ > > > + case DWC3_TRBSTS_OK: > > > + default: > > > + ret = 0; > > > + break; > > > + } > > > + } > > > + > > > + return ret; > > > +} > > > + > > > static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep, > > > const struct dwc3_event_depevt *event, int status) > > > { > > > @@ -3565,22 +3601,51 @@ static bool dwc3_gadget_endpoint_trbs_complete(struct dwc3_ep *dep, > > > { > > > struct dwc3 *dwc = dep->dwc; > > > bool no_started_trb = true; > > > + unsigned int transfer_in_flight = 0; > > > + > > > + /* It is possible that the interrupt thread was delayed by > > > + * scheduling in the system, and therefor the HW has already > > > + * run dry. In that case the last trb in the queue is already > > > + * handled by the hw. By checking the HWO bit we know to restart > > > + * the whole transfer. The condition to appear is more likelely > > > + * if not every trb has the IOC bit set and therefor does not > > > + * trigger the interrupt thread fewer. > > > + */ > > > + if (dep->number && usb_endpoint_xfer_isoc(dep->endpoint.desc)) { > > > + struct dwc3_trb *trb; > > > - dwc3_gadget_ep_cleanup_completed_requests(dep, event, status); > > > + trb = dwc3_ep_prev_trb(dep, dep->trb_enqueue); > > > + transfer_in_flight = trb->ctrl & DWC3_TRB_CTRL_HWO; > > > + } > > > - if (dep->flags & DWC3_EP_END_TRANSFER_PENDING) > > > - goto out; > > > + if (status == -EXDEV || !transfer_in_flight) { > > > + struct dwc3_request *tmp; > > > + struct dwc3_request *req; > > > - if (!dep->endpoint.desc) > > > - return no_started_trb; > > > + if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING)) > > > + dwc3_stop_active_transfer(dep, true, true); > > > - if (usb_endpoint_xfer_isoc(dep->endpoint.desc) && > > > - list_empty(&dep->started_list) && > > > - (list_empty(&dep->pending_list) || status == -EXDEV)) > > @[!!here!!] > > > > - dwc3_stop_active_transfer(dep, true, true); > > > - else if (dwc3_gadget_ep_should_continue(dep)) > > > - if (__dwc3_gadget_kick_transfer(dep) == 0) > > > - no_started_trb = false; > > > + list_for_each_entry_safe(req, tmp, &dep->started_list, list) { > > > + dwc3_gadget_move_cancelled_request(req, > > > + (DWC3_TRB_SIZE_TRBSTS(req->trb->size) == DWC3_TRBSTS_MISSED_ISOC) ? > > > + DWC3_REQUEST_STATUS_MISSED_ISOC : 0); > > > + } > > > + } else { > > > + dwc3_gadget_ep_cleanup_completed_requests(dep, event, status); > > > + > > > + if (dep->flags & DWC3_EP_END_TRANSFER_PENDING) > > > + goto out; > > > + > > > + if (!dep->endpoint.desc) > > > + return no_started_trb; > > > + > > > + if (usb_endpoint_xfer_isoc(dep->endpoint.desc) && > > > + list_empty(&dep->started_list) && list_empty(&dep->pending_list)) > > > + dwc3_stop_active_transfer(dep, true, true); > > > + else if (dwc3_gadget_ep_should_continue(dep)) > > > + if (__dwc3_gadget_kick_transfer(dep) == 0) > > > + no_started_trb = false; > > > + } > > > out: > > > /* > > > > > > I will seperate the whole hunk into smaller changes and send an v1 > > > the next days to review. > > > > > I finally send a v1 of my series. > > https://lore.kernel.org/linux-usb/20240307-dwc3-gadget-complete-irq-v1-0-4fe9ac0ba2b7@xxxxxxxxxxxxxx/ > > For the rest of the discussion, I would like to move the conversation to > the newly send series. I saw your pushes. Thanks. I'll review and move the discussion there. > > > No, we should not reschedule for every missed-isoc. We only want to > > target underrun condition. > > As you stated above, with reschedule what you mean is calling > stop_transfer after a missed transfer was seen? > > If so, why is this condition in there already? (@[!!here!!]) > It's only to reschedule if started_list is empty _and_ if there's either no pending request or there's a missed isoc. Not for every missed isoc. BR, Thinh