Hi, John Youn <John.Youn@xxxxxxxxxxxx> writes: >>> John Youn <John.Youn@xxxxxxxxxxxx> writes: >>>>> Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx> writes: >>>>>> The dwc3 driver can overwite its previous events if its top half IRQ >>>>>> handler gets invoked again before processing the events in the cache. We >>>>> >>>>> interrupts are masked, why would top half get invoked again? Is this, >>>>> perhaps, related to DWC3 3.00a which has the "Interrupt line doesn't >>>>> lower when masked" problem? We've added a lot of code to workaround that >>>>> problem and, apparently, it wasn't enough. >>>> >>>> No, it is not related to that. We verified with PCIe traces. The >>>> interrupt line gets deasserted after we mask it. And we put the >>>> masking as close to the beginning of the top-half as possible. >>>> >>>>> >>>>> In any case, there's no way top half would be invoked again in a >>>>> properly working DWC3. >>>> >>>> Yet we still see it sometimes. Usually it doesn't create a problem, >>> >>> that's fair, but it's not for the reason you're describing :-) There >>> might be another problem going on, because since we masked the interrupt >>> and cleared all events, IRQ shouldn't be raised at all; unless, as I >>> mentioned on the other subthread, the IRQ line is shared. >>> >>>> but if there happens to be a new event there, we get the failure. >>>> >>>> We didn't trace into that part of the kernel so we can't explain why. >>>> But if there is any chance the interrupt line deassertion wasn't >>>> detected in time, whatever part of the kernel that thinks it is still >>>> asserted might just call our top-half again. This could be a totally >>>> wrong assumption, but it doesn't seem too far-fetched. >>> >>> The kernel doesn't detect IRQ line assertion/deassertion. CPU gets an >>> exception when that happens and calls Kernel IRQ handler vector. That >>> will, in turn, figure out which line triggered, call the handler and so >>> on. >> >> We're talking about PCIe though, where interrupt assertion and >> deassertion are packets. So I would imagine the kernel has to do >> something and there could be some latency associated with that. > > Also, another thing is that the device uses legacy, level-triggered, > PCIe interrupts, so for as long as the interrupt is asserted, the TH > is called repeatedly. yes, and that's why we have: > static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) > { > struct dwc3 *dwc = evt->dwc; > u32 amount; > u32 count; > u32 reg; > if (pm_runtime_suspended(dwc->dev)) { > pm_runtime_get(dwc->dev); > disable_irq_nosync(dwc->irq_gadget); > dwc->pending_events = true; > return IRQ_HANDLED; > } > > count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); > count &= DWC3_GEVNTCOUNT_MASK; check how many events are pending in the event buffer. > if (!count) > return IRQ_NONE; > > evt->count = count; > evt->flags |= DWC3_EVENT_PENDING; > > /* Mask interrupt */ > reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0)); > reg |= DWC3_GEVNTSIZ_INTMASK; mask interrupt generation > dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg); > > amount = min(count, evt->length - evt->lpos); > memcpy(evt->cache + evt->lpos, evt->buf + evt->lpos, amount); > > if (amount < count) > memcpy(evt->cache, evt->buf, count - amount); > > dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count); clear ALL events from event buffer. This brings the line down, so we shouldn't re-enter. > return IRQ_WAKE_THREAD; > } > So we mask the interrupt in the TH and a short time later, the > interrupt de-assertion packet is sent on PCIe bus and if that's not > seen right away we may already have another call to TH before the BH > gets scheduled. not sure this can happen. If that's the case, every PCI driver would have all sorts of tricks to cope with this, not only dwc3 :-) Bjorn, is this something that can happen on PCIe? Quick summary of the problem: John and Thinh are experiencing a re-entrant top-half handler even though we have cleared pending IRQ status _and_ masked Interrupts. SNPS is using an FPGA model of the latest DWC3 core under x86. I have never seen this behavior on ARM or any of the x86 devices containing this core (and this includes all the newest x86 cores, see drivers/usb/dwc3/dwc3-pci.c for PCI IDs if you care enough :-) Anyway, from my point of view, this is either a bug in IRQ subsystem which only John and Thinh can reproduce at this moment, or a regression with DWC3 IP Core :-s -- balbi
Attachment:
signature.asc
Description: PGP signature