Hi Felipe, On 4/21/2017 10:26 AM, John Youn wrote: > On 04/18/2017 05:47 AM, Felipe Balbi wrote: >> >> Hi, >> >> Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx> writes: >>>> (Thinh, for whatever I didn't receive your email via the list, replying >>>> to myself) >>> >>> Could it be because of the attached file? >>> >>>> >>>> Felipe Balbi <balbi@xxxxxxxxxx> writes: >>>>>> Hmm, can you apply this patch and provide output when the failure >>>>>> happens? I suggest setting up a 100MiB trace buffer. You don't need to >>>>>> enable any tracers nor any event. trace_printk() is always enabled. >>>>> >>>>> Sure. >>>>> >>>>> Attached is a 7z with the full kernel trace with the failure. >>>> >>>> Thank you :-) >>>> >>>>> More often than not, we will see something like below, where the TH >>>>> coming in back to back with 0 event count on the second TH. >>>>> >>>>> 2990.496588: dwc3_interrupt: TH >>>>> 2990.496588: dwc3_interrupt: evt->flags 00000000 >>>>> 2990.496590: dwc3_readl: addr ffffc90001b2840c value 00000004 >>>>> 2990.496590: dwc3_interrupt: GEVNTCOUNT 00000004 >>>>> 2990.496590: dwc3_interrupt: 4 events pending >>>>> 2990.496591: dwc3_readl: addr ffffc90001b28408 value 00001000 >>>>> 2990.496591: dwc3_interrupt: GEVNTSIZ 00001000 >>>>> 2990.496592: dwc3_writel: addr ffffc90001b28408 value 80001000 >>>>> 2990.496592: dwc3_interrupt: copying events to cache >>>>> 2990.496592: dwc3_writel: addr ffffc90001b2840c value 00000004 >>>>> 2990.496614: dwc3_interrupt: TH >>>>> 2990.496614: dwc3_interrupt: evt->flags 00000001 >>>>> 2990.496615: dwc3_readl: addr ffffc90001b2840c value 00000000 >>>>> 2990.496615: dwc3_interrupt: GEVNTCOUNT 00000000 >>>>> 2990.496616: dwc3_interrupt: 0 events pending >>>>> 2990.496616: dwc3_thread_interrupt: BH >>>>> 2990.496616: dwc3_event: event (00006084): ep1out: Transfer In-Progress >>>>> >>>>> >>>>> Here is the snippet of the attached trace where the TH is called twice >>>>> with a new event coming in and overwrite the old one: >>>>> >>>>> 2990.601357: dwc3_readl: addr ffffc90001b2883c value 00030007 >>>>> 2990.601357: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [196615].. >>>>> 2990.601360: dwc3_interrupt: TH >>>>> 2990.601360: dwc3_interrupt: evt->flags 00000000 >>>>> 2990.601362: dwc3_readl: addr ffffc90001b2840c value 00000004 >>>>> 2990.601362: dwc3_interrupt: GEVNTCOUNT 00000004 >>>>> 2990.601362: dwc3_interrupt: 4 events pending >>>>> 2990.601364: dwc3_readl: addr ffffc90001b28408 value 00001000 >>>>> 2990.601364: dwc3_interrupt: GEVNTSIZ 00001000 >>>>> 2990.601364: dwc3_writel: addr ffffc90001b28408 value 80001000 >>>>> 2990.601364: dwc3_interrupt: copying events to cache >>>>> 2990.601364: dwc3_writel: addr ffffc90001b2840c value 00000004 >>>>> 2990.601369: dwc3_interrupt: TH >>>>> 2990.601369: dwc3_interrupt: evt->flags 00000001 >>>>> 2990.601369: dwc3_readl: addr ffffc90001b2840c value 00000004 >>>>> 2990.601370: dwc3_interrupt: GEVNTCOUNT 00000004 >>>>> 2990.601370: dwc3_interrupt: 4 events pending >>>>> 2990.601372: dwc3_readl: addr ffffc90001b28408 value 80001000 >>>>> 2990.601372: dwc3_interrupt: GEVNTSIZ 80001000 >>>>> 2990.601372: dwc3_writel: addr ffffc90001b28408 value 80001000 >>>>> 2990.601372: dwc3_interrupt: copying events to cache >>>>> 2990.601372: dwc3_writel: addr ffffc90001b2840c value 00000004 >>>>> 2990.601375: dwc3_thread_interrupt: BH >>>>> 2990.601376: dwc3_event: event (00004086): ep1in: Transfer In-Progress.. >>>>> >>>>> The device hangs at the end of the log. >>> >>> I repeated the test with USB 3.0 IP version 2.90a and USB 3.1 IP version >>> 1.60, and I found the same result where the TH is occasionally called >>> twice back-to-back. >>> >>>> >>>> Right, found this in the logs. So the only thing that I can think of >>>> here, is that we need to flush posted writes. Does this help? >>>> >>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>>> index 79e7a3480d51..6f06738273f2 100644 >>>> --- a/drivers/usb/dwc3/gadget.c >>>> +++ b/drivers/usb/dwc3/gadget.c >>>> @@ -3099,6 +3099,10 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) >>>> >>>> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count); >>>> >>>> + /* flush posted writes */ >>>> + dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); >>>> + dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0)); >>>> + >>>> return IRQ_WAKE_THREAD; >>>> } >>>> >>>> >>> >>> With the change above, I don't see the TH get called twice in in a row. >>> However, I also tested with replacing the code above with a udelay(4), >>> from reading the kernel trace, to take in account the latency of reading >>> the registers. I also don't see TH get called twice in a row. >>> >>> The extra latency from reading the registers above can mask this issue. >> >> okay, so it is a problem with posted writes. Can you send above as a >> formal patch? That's something we can backport to every stable >> kernel. And the real reason here _IS_ caused by posted writes. There's >> no spurious interrupt happening. >> > > Hi Felipe, > > I only mentioned spurious interrupts as something the driver needs to > account for anyways. They are not the cause of this issue. And I don't > think the PCIe posted writes are either becuase of what I said before > about the interrupts being packets and the latency in assertion and > detection. > > But anyways, after thorough testing we don't see this in our system > anymore with the read-back (probably due to the 4 us delay that's > introduced). We can submit a patch for this for stable. > > However I don't think this is a sufficient fix for mainline, and dwc3 > should be able to handle multiple interrupts before BH, whatever the > reason for it. > > How do you want to handle mainline? Either Thinh's original patch, > create a queue for the events, or some other way? We can work on this > fix too if you want. > > Thanks, > John > Do you have any comment regarding John's observation? Let me know so I can proceed with updating the patch or not. Thanks, Thinh -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html