Hi, Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx> writes: >> (Thinh, for whatever I didn't receive your email via the list, replying >> to myself) > > Could it be because of the attached file? > >> >> Felipe Balbi <balbi@xxxxxxxxxx> writes: >>>> Hmm, can you apply this patch and provide output when the failure >>>> happens? I suggest setting up a 100MiB trace buffer. You don't need to >>>> enable any tracers nor any event. trace_printk() is always enabled. >>> >>> Sure. >>> >>> Attached is a 7z with the full kernel trace with the failure. >> >> Thank you :-) >> >>> More often than not, we will see something like below, where the TH >>> coming in back to back with 0 event count on the second TH. >>> >>> 2990.496588: dwc3_interrupt: TH >>> 2990.496588: dwc3_interrupt: evt->flags 00000000 >>> 2990.496590: dwc3_readl: addr ffffc90001b2840c value 00000004 >>> 2990.496590: dwc3_interrupt: GEVNTCOUNT 00000004 >>> 2990.496590: dwc3_interrupt: 4 events pending >>> 2990.496591: dwc3_readl: addr ffffc90001b28408 value 00001000 >>> 2990.496591: dwc3_interrupt: GEVNTSIZ 00001000 >>> 2990.496592: dwc3_writel: addr ffffc90001b28408 value 80001000 >>> 2990.496592: dwc3_interrupt: copying events to cache >>> 2990.496592: dwc3_writel: addr ffffc90001b2840c value 00000004 >>> 2990.496614: dwc3_interrupt: TH >>> 2990.496614: dwc3_interrupt: evt->flags 00000001 >>> 2990.496615: dwc3_readl: addr ffffc90001b2840c value 00000000 >>> 2990.496615: dwc3_interrupt: GEVNTCOUNT 00000000 >>> 2990.496616: dwc3_interrupt: 0 events pending >>> 2990.496616: dwc3_thread_interrupt: BH >>> 2990.496616: dwc3_event: event (00006084): ep1out: Transfer In-Progress >>> >>> >>> Here is the snippet of the attached trace where the TH is called twice >>> with a new event coming in and overwrite the old one: >>> >>> 2990.601357: dwc3_readl: addr ffffc90001b2883c value 00030007 >>> 2990.601357: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [196615].. >>> 2990.601360: dwc3_interrupt: TH >>> 2990.601360: dwc3_interrupt: evt->flags 00000000 >>> 2990.601362: dwc3_readl: addr ffffc90001b2840c value 00000004 >>> 2990.601362: dwc3_interrupt: GEVNTCOUNT 00000004 >>> 2990.601362: dwc3_interrupt: 4 events pending >>> 2990.601364: dwc3_readl: addr ffffc90001b28408 value 00001000 >>> 2990.601364: dwc3_interrupt: GEVNTSIZ 00001000 >>> 2990.601364: dwc3_writel: addr ffffc90001b28408 value 80001000 >>> 2990.601364: dwc3_interrupt: copying events to cache >>> 2990.601364: dwc3_writel: addr ffffc90001b2840c value 00000004 >>> 2990.601369: dwc3_interrupt: TH >>> 2990.601369: dwc3_interrupt: evt->flags 00000001 >>> 2990.601369: dwc3_readl: addr ffffc90001b2840c value 00000004 >>> 2990.601370: dwc3_interrupt: GEVNTCOUNT 00000004 >>> 2990.601370: dwc3_interrupt: 4 events pending >>> 2990.601372: dwc3_readl: addr ffffc90001b28408 value 80001000 >>> 2990.601372: dwc3_interrupt: GEVNTSIZ 80001000 >>> 2990.601372: dwc3_writel: addr ffffc90001b28408 value 80001000 >>> 2990.601372: dwc3_interrupt: copying events to cache >>> 2990.601372: dwc3_writel: addr ffffc90001b2840c value 00000004 >>> 2990.601375: dwc3_thread_interrupt: BH >>> 2990.601376: dwc3_event: event (00004086): ep1in: Transfer In-Progress.. >>> >>> The device hangs at the end of the log. > > I repeated the test with USB 3.0 IP version 2.90a and USB 3.1 IP version > 1.60, and I found the same result where the TH is occasionally called > twice back-to-back. > >> >> Right, found this in the logs. So the only thing that I can think of >> here, is that we need to flush posted writes. Does this help? >> >> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >> index 79e7a3480d51..6f06738273f2 100644 >> --- a/drivers/usb/dwc3/gadget.c >> +++ b/drivers/usb/dwc3/gadget.c >> @@ -3099,6 +3099,10 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) >> >> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count); >> >> + /* flush posted writes */ >> + dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); >> + dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0)); >> + >> return IRQ_WAKE_THREAD; >> } >> >> > > With the change above, I don't see the TH get called twice in in a row. > However, I also tested with replacing the code above with a udelay(4), > from reading the kernel trace, to take in account the latency of reading > the registers. I also don't see TH get called twice in a row. > > The extra latency from reading the registers above can mask this issue. okay, so it is a problem with posted writes. Can you send above as a formal patch? That's something we can backport to every stable kernel. And the real reason here _IS_ caused by posted writes. There's no spurious interrupt happening. -- balbi
Attachment:
signature.asc
Description: PGP signature