John Stultz <john.stultz@xxxxxxxxxx> writes: > On Wed, Apr 22, 2020 at 11:32 AM John Stultz <john.stultz@xxxxxxxxxx> wrote: >> On Tue, Apr 21, 2020 at 11:00 PM Felipe Balbi <balbi@xxxxxxxxxx> wrote: >> > John Stultz <john.stultz@xxxxxxxxxx> writes: >> > > that's not an issue as we only have one sg entry. But if its set on a >> > > trb in a request with multiple sgs, that's where it seems to be >> > > causing the issue. >> > >> > The issue is completing with HWO set, which should never happen. Can you >> > collect tracepoints with linus/master of this particular situation? >> >> Will do, though again, its a little tough as often we hit the stall >> state pretty quickly at bootup, before I can even try to enable >> tracing, so it may take a few tries. >> >> > >> One interesting thing is that TRB addresses are "odd". I can't find a >> > >> proper lifetime for these TRBs. Do you have IOMMU enabled? Can you run >> > >> without it? For example, nowhere in the log can I find the place where >> > >> trb 0000000092deef41 was first enqueue. I'm assuming the log to be >> > >> ordered, which means that trb is the same as 00000000f3db4076. But why >> > >> are the addresses different? >> > >> >> > >> Another weird thing is that even though we CHN bit being set in >> > >> 0000000092deef41, we don't see where the second trb (the one its chained >> > >> to) was prepared. It seems like it was *never* prepared, what gives? >> > > >> > > I suspect these bits were due to the tracing happening after some >> > > minor amount of initial adb traffic began at bootup? So the trace >> > > isn't capturing all the events. >> > >> > No, no. That's more likely to be IOMMU mucking up the addresses. ADB is >> > very sequential in its behavior and USB transfers requests in >> > order. Please run linus/master without IOMMU. >> >> So I didn't have any IOMMU support enabled in the config I was testing >> with. I went through and added IOMMU options in my config with >> linus/master as well and that didn't seem to change the behavior >> either. >> >> I'll get back to you with further trace logs. > > Ok. Attached are trace logs. Two bad cases, which are linus/master w/ > all IOMMU configs disabled. > > Then I have a good case where I've removed the > if (trb->ctrl & DWC3_TRB_CTRL_HWO) > break; > logic in dwc3_gadget_ep_reclaim_trb_sg(). Okay, here's the failing case: UsbFfs-worker-574 [001] d..2 261.788895: dwc3_ep_queue: ep1out: req 000000006efef4fb length 0/16384 zsI ==> -115 UsbFfs-worker-574 [001] d..2 261.788922: dwc3_prepare_trb: ep1out: trb 00000000fa0a991a (E11:D4) buf 00000000ab3d1000 size 4096 ctrl 0000001d (HlCS:sc:normal) UsbFfs-worker-574 [001] d..2 261.788927: dwc3_prepare_trb: ep1out: trb 000000007f6b91bb (E12:D4) buf 00000000a9534000 size 4096 ctrl 0000001d (HlCS:sc:normal) UsbFfs-worker-574 [001] d..2 261.788930: dwc3_prepare_trb: ep1out: trb 00000000a64ab194 (E13:D4) buf 00000000d6447000 size 4096 ctrl 0000001d (HlCS:sc:normal) UsbFfs-worker-574 [001] d..2 261.788934: dwc3_prepare_trb: ep1out: trb 00000000571f893b (E14:D4) buf 00000000d666a000 size 4096 ctrl 00000819 (HlcS:sC:normal) UsbFfs-worker-574 [001] d..2 261.788962: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful irq/65-dwc3-515 [000] d..1 261.821056: dwc3_event: event (00006084): ep1out: Transfer In Progress [0] (SIm) irq/65-dwc3-515 [000] d..1 261.821057: dwc3_complete_trb: ep1out: trb 00000000fa0a991a (E22:D11) buf 00000000ab3d1000 size 4072 ctrl 0000001c (hlCS:sc:normal) So, ffs enqueued a request for 16kiB but only 24 bytes were sent by the host. This is a short-packet case. It seems like that logic is, indeed, incorrect. I added it for a reason, though I can't be sure based on the commit log alone. I think I was trying to cover the case where we didn't prepare all TRBs for an SG-list because they didn't fit, but I don't think that's the right way to achieve what I meant :-p Could you prepare a patch and Cc stable? -- balbi
Attachment:
signature.asc
Description: PGP signature