Hi, John Stultz <john.stultz@xxxxxxxxxx> writes: > On Wed, Jan 22, 2020 at 11:23 PM Felipe Balbi <balbi@xxxxxxxxxx> wrote: >> > From: Anurag Kumar Vulisha <anurag.kumar.vulisha@xxxxxxxxxx> >> > >> > The present code in dwc3_gadget_ep_reclaim_completed_trb() will check >> > for IOC/LST bit in the event->status and returns if IOC/LST bit is >> > set. This logic doesn't work if multiple TRBs are queued per >> > request and the IOC/LST bit is set on the last TRB of that request. >> > Consider an example where a queued request has multiple queued TRBs >> > and IOC/LST bit is set only for the last TRB. In this case, the Core >> > generates XferComplete/XferInProgress events only for the last TRB >> > (since IOC/LST are set only for the last TRB). As per the logic in >> > dwc3_gadget_ep_reclaim_completed_trb() event->status is checked for >> > IOC/LST bit and returns on the first TRB. This makes the remaining >> > TRBs left unhandled. >> > To aviod this, changed the code to check for IOC/LST bits in both >> avoid >> >> > event->status & TRB->ctrl. This patch does the same. >> >> We don't need to check both. It's very likely that checking the TRB is >> enough. > > Sorry, just to clarify, are you suggesting instead of: > - if (event->status & DEPEVT_STATUS_IOC) > + if ((event->status & DEPEVT_STATUS_IOC) && > + (trb->ctrl & DWC3_TRB_CTRL_IOC)) > > > We do something like: > - if (event->status & DEPEVT_STATUS_IOC) > + if (trb->ctrl & DWC3_TRB_CTRL_IOC) > + return 1; > + > + if (trb->ctrl & DWC3_TRB_CTRL_LST) > return 1; > > ? that's correct. In hindsight, I have no idea why I used the event->status here since all other checks are done against the TRB only. >> > At a practical level, this patch resolves USB transfer stalls seen >> > with adb on dwc3 based HiKey960 after functionfs gadget added >> > scatter-gather support around v4.20. >> >> Right, I remember asking for tracepoint data showing this problem >> happening. It's the best way to figure out what's really going on. >> >> Before we accept these two patches, could you collect dwc3 tracepoint >> data and share here? > > Sure. Attached is trace logs and regdumps for hikey960. Thanks > The one gotcha with the logs is that in the working case (with this > patch applied), I booted with the usb-c cable disconnected (as > suggested in the dwc3.rst doc), enabled tracing and plugged in the > device, then ran adb logcat a few times to validate no stalls. > > In the failure case (without this patch), I booted with the usb-c > cable disconnected, enabled tracing and then when I plugged in the > device, it never was detected by adb (it seems perhaps the problem had > already struck?). You never got a Reset Interrupt, so something else is going on. I suggest putting a sniffer and first making sure the host *does* drive reset signalling. Second step would be to look at your phy configuration. Is it going in suspend for any reason? Might want to try our snps,dis_u3_susphy_quirk and snps,dis_u2_susphy_quirk flags. > So I generated the failure2 log by booting with USB-C plugged in, > enabling tracing, and running adb logcat on the host to observe the > stall. Thank you. Here's a quick summary of what's in failure2: There is a series of 24-byte transfers on ep1out and that's the one which shows a problem. We can clearly see that adb is issuing one transfer at a time, only enqueueing transfer n+1 when transfer n is completed and given back, so we see a series of similar blocks: - dwc3_alloc_request - dwc3_ep_queue - dwc3_prepare_trb - dwc3_prepare_trb (for the chained bit) - dwc3_gadget_ep_cmd (update transfer) - dwc3_event (transfer in progress) - dwc3_complete_trb - dwc3_complete_trb (for the chained bit) - dwc3_gadget_giveback - dwc3_free_request So this works for several iterations. Note, however, that the TRB addresses don't really make sense. DWC3 allocates a contiguous block of memory to server as TRB pool, but we see non-consecutive addresses on these TRBs. I'm assuming there's an IOMMU in your system. Anyway, the failing point is here: > adbd-461 [002] d..1 49.855992: dwc3_alloc_request: ep1out: req 000000004e6eaaba length 0/0 zsI ==> 0 > adbd-461 [002] d..2 49.855994: dwc3_ep_queue: ep1out: req 000000004e6eaaba length 0/24 zsI ==> -115 > adbd-461 [002] d..2 49.855996: dwc3_prepare_trb: ep1out: trb 00000000bae39b48 buf 000000009eb0b100 size 24 ctrl 0000001d (HlCS:sc:normal) > adbd-461 [002] d..2 49.855997: dwc3_prepare_trb: ep1out: trb 000000009093a074 buf 0000000217da8000 size 488 ctrl 00000819 (HlcS:sC:normal) > adbd-461 [002] d..2 49.856003: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-498 [000] d..1 53.902752: dwc3_event: event (00006084): ep1out: Transfer In Progress [0] (SIm) > irq/65-dwc3-498 [000] d..1 53.902763: dwc3_complete_trb: ep1out: trb 00000000bae39b48 buf 000000009eb0b100 size 0 ctrl 0000001c (hlCS:sc:normal) > irq/65-dwc3-498 [000] d..1 53.902769: dwc3_complete_trb: ep1out: trb 000000009093a074 buf 0000000217da8000 size 488 ctrl 00000819 (HlcS:sC:normal) > irq/65-dwc3-498 [000] d..1 53.902781: dwc3_gadget_giveback: ep1out: req 000000004e6eaaba length 24/24 zsI ==> 0 > kworker/u16:0-7 [000] .... 53.903020: dwc3_free_request: ep1out: req 000000004e6eaaba length 24/24 zsI ==> 0 > adbd-461 [002] d..1 53.903273: dwc3_alloc_request: ep1out: req 00000000c769beab length 0/0 zsI ==> 0 > adbd-461 [002] d..2 53.903285: dwc3_ep_queue: ep1out: req 00000000c769beab length 0/24 zsI ==> -115 > adbd-461 [002] d..2 53.903292: dwc3_prepare_trb: ep1out: trb 00000000f0ffa827 buf 000000009eb11e80 size 24 ctrl 0000001d (HlCS:sc:normal) > adbd-461 [002] d..2 53.903296: dwc3_prepare_trb: ep1out: trb 00000000d6a9892a buf 0000000217da8000 size 488 ctrl 00000819 (HlcS:sC:normal) > adbd-461 [002] d..2 53.903315: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful Note that this transfer, after started, took 4 seconds to complete, while all others completed within a few ms. There's no real reason for this visible from dwc3 driver itself. What follows, is a transfer that never completed. The only thing I can come up with, is that we starve the TRB ring, by continuously reclaiming a single TRB. We have 255 usable TRBs, so after a few iterations, we would see a stall due to starved TRB ring. There is a way to verify this by tracking trb_enqueue and trb_dequeue, if you're willing to do that, that'll help us prove that this is really the problem and, since current tracepoints doen't really show that information, it may be a good idea to add this information to dwc3_log_trb tracepoint class. Something like below should be enough, could you re-run the test of failure2 with this patch applied? drivers/usb/dwc3/trace.h | 9 +++++++-- modified drivers/usb/dwc3/trace.h @@ -227,6 +227,8 @@ DECLARE_EVENT_CLASS(dwc3_log_trb, __field(u32, size) __field(u32, ctrl) __field(u32, type) + __field(u32, enqueue) + __field(u32, dequeue) ), TP_fast_assign( __assign_str(name, dep->name); @@ -236,9 +238,12 @@ DECLARE_EVENT_CLASS(dwc3_log_trb, __entry->size = trb->size; __entry->ctrl = trb->ctrl; __entry->type = usb_endpoint_type(dep->endpoint.desc); + __entry->enqueue = dep->trb_enqueue + __entry->dequeue = dep->trb_dequeue ), - TP_printk("%s: trb %p buf %08x%08x size %s%d ctrl %08x (%c%c%c%c:%c%c:%s)", - __get_str(name), __entry->trb, __entry->bph, __entry->bpl, + TP_printk("%s: trb %p (E%d:D%d) buf %08x%08x size %s%d ctrl %08x (%c%c%c%c:%c%c:%s)", + __get_str(name), __entry->trb, __entry->enqueue, + __entry->dequeue, __entry->bph, __entry->bpl, ({char *s; int pcm = ((__entry->size >> 24) & 3) + 1; switch (__entry->type) { > Anyway, all three sets of logs are included. Let me know if you need > me to try anything else. Thanks for doing this -- balbi
Attachment:
signature.asc
Description: PGP signature