On Thu, Apr 16, 2020 at 1:19 AM Felipe Balbi <balbi@xxxxxxxxxx> wrote: > John Stultz <john.stultz@xxxxxxxxxx> writes: > > Hey Felipe, > > Last week or so, a change[1] in AOSP to adbd seemingly uncovered > > another issue with dwc3 gadget scatter-gather support on HiKey960. > > > > Interestingly it doesn't seem to affect the Dragonboard 845c, which > > uses the same dwc3 driver and has had its own issues in the past. > > > > The behavior is the same as we saw last time around with both devices. > > After booting the device, running "adb logcat -d" (or really any adb > > command that transfers more than a trivial amount of data) on the host > > will result in the adb output seeming to stall. Any further adb > > invocations to the device will hang indefinitely. > > > > I've captured trace events for before the change (works), after the > > change (broken), and after the change with the sg_enabled flag turned > > off (which works around the problem). > > > > Let me know if there is anything else useful for me to share. > > First the obvious questions: Which kernel version is this? Apologies. v5.7-rc1, though the same behavior was seen with v5.6 and v5.4 kernels. > What does > "before" and "after" refer to in our traces? Before and after the userland changes to adb were made. > What are first working and > first failing versions? Can you run git bisect? It's not directly a kernel regression. But it seems like an uncovered issue by changes in userland. > One thing I noticed is that we're missing a giveback on ep1out. Here's a > working case: > > UsbFfs-worker-580 [002] d..1 66.704886: dwc3_alloc_request: ep1out: req 0000000011c55648 length 0/0 zsI ==> 0 > UsbFfs-worker-580 [002] d..2 66.704889: dwc3_ep_queue: ep1out: req 0000000011c55648 length 0/16384 zsI ==> -115 > UsbFfs-worker-580 [002] d..2 66.704892: dwc3_prepare_trb: ep1out: trb 000000003559c11c (E27:D7) buf 000000008843b000 size 16384 ctrl 00000819 (HlcS:sC:normal) > UsbFfs-worker-580 [002] d..2 66.704897: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-522 [000] d..1 66.705053: dwc3_event: event (00006084): ep1out: Transfer In Progress [0] (SIm) > irq/65-dwc3-522 [000] d..1 66.705054: dwc3_complete_trb: ep1out: trb 000000008c350fb3 (E27:D8) buf 0000000089d6b000 size 16360 ctrl 00000818 (hlcS:sC:normal) > irq/65-dwc3-522 [000] d..1 66.705058: dwc3_gadget_giveback: ep1out: req 0000000001b9ed3f length 24/16384 zsI ==> 0 > kworker/u16:2-260 [001] .... 66.705097: dwc3_free_request: ep1out: req 0000000001b9ed3f length 24/16384 zsI ==> 0 > > and the failure point: > > UsbFfs-worker-580 [002] d..1 66.705129: dwc3_alloc_request: ep1out: req 0000000067a34de4 length 0/0 zsI ==> 0 > UsbFfs-worker-580 [002] d..2 66.705131: dwc3_ep_queue: ep1out: req 0000000067a34de4 length 0/16384 zsI ==> -115 > UsbFfs-worker-580 [002] d..2 66.705134: dwc3_prepare_trb: ep1out: trb 00000000f3db4076 (E28:D8) buf 000000008843f000 size 16384 ctrl 00000819 (HlcS:sC:normal) > UsbFfs-worker-580 [002] d..2 66.705141: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-522 [000] d..1 66.705309: dwc3_event: event (00006084): ep1out: Transfer In Progress [0] (SIm) > irq/65-dwc3-522 [000] d..1 66.705310: dwc3_complete_trb: ep1out: trb 0000000092deef41 (E28:D9) buf 00000000ba8f1000 size 4072 ctrl 0000001c (hlCS:sc:normal) > irq/65-dwc3-522 [000] d..1 66.705318: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-522 [000] d..1 66.705323: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-522 [000] d..1 66.705329: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-522 [000] d..1 66.705334: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-522 [000] d..1 66.705339: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-522 [000] d..1 66.705344: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-522 [000] d..1 66.705349: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > irq/65-dwc3-522 [000] d..1 66.705354: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful > > One interesting thing is that TRB addresses are "odd". I can't find a > proper lifetime for these TRBs. Do you have IOMMU enabled? Can you run > without it? For example, nowhere in the log can I find the place where > trb 0000000092deef41 was first enqueue. I'm assuming the log to be > ordered, which means that trb is the same as 00000000f3db4076. But why > are the addresses different? > > Another weird thing is that even though we CHN bit being set in > 0000000092deef41, we don't see where the second trb (the one its chained > to) was prepared. It seems like it was *never* prepared, what gives? I'll try to take a deeper look at these points and get back to you. I really appreciate your eyes on this! Thanks for the feedback! -john