On 10 April 2017 at 20:43, John Youn <John.Youn@xxxxxxxxxxxx> wrote: > On 04/08/2017 12:38 PM, Janusz Dziedzic wrote: >> 2017-04-08 1:57 GMT+02:00 Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx>: >>> The dwc3 driver can overwite its previous events if its top half IRQ >>> handler gets invoked again before processing the events in the cache. We >>> see this as a hang in the file transfer and the host will attempt to >>> reset the device. This shouldn't be possible in the dwc3 implementation >>> if the HW and the SW are in sync. However, through testing and reading >>> the pcie trace, the top half IRQ handler occasionally still gets invoked >>> one more time after HW interrupt deassertion. We suspect that there is a >>> small detection delay in the SW. >>> >>> Top half IRQ handler deasserts the interrupt line after it gets the >>> event count by writing DWC3_GEVNTSIZ_INTMASK to DWC3_GEVNTSIZ. There's a >>> small window for a new event coming in between reading the event count >>> and the interrupt deassertion. More generally, we will see 0 event >>> count, which should not affect anything. >>> >>> To avoid this issue, we ensure that the events in the cache are >>> processed before checking for the new ones. The bottom half IRQ will >>> unmask the DWC3_GEVNTSIZ_INTMASK once it finishes processing the events. >>> Top half IRQ handler needs to check whether the DWC3_GEVNTSIZ_INTMASK is >>> still set before storing the new events. It also should return >>> IRQ_HANDLED if the mask is still set since IRQ handler was invoked for >>> our usb interrupt. >>> >>> Signed-off-by: Thinh Nguyen <thinhn@xxxxxxxxxxxx> >>> --- >>> drivers/usb/dwc3/gadget.c | 5 ++++- >>> 1 file changed, 4 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>> index 4dc80729ae11..93d98fb7215e 100644 >>> --- a/drivers/usb/dwc3/gadget.c >>> +++ b/drivers/usb/dwc3/gadget.c >>> @@ -3045,6 +3045,10 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) >>> return IRQ_HANDLED; >>> } >>> >>> + reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0)); >>> + if (reg & DWC3_GEVNTSIZ_INTMASK) >>> + return IRQ_HANDLED; >>> + >>> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); >>> count &= DWC3_GEVNTCOUNT_MASK; >>> if (!count) >>> @@ -3054,7 +3058,6 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) >>> evt->flags |= DWC3_EVENT_PENDING; >>> >>> /* Mask interrupt */ >>> - reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0)); >>> reg |= DWC3_GEVNTSIZ_INTMASK; >>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg); >>> >>> -- >>> 2.11.0 >>> >> This seems like a workaround, while when we mask interrupts we should >> not get IRQ before BH will done. Current driver implementation base on >> this. >> We are using 2.60a and didn't see problem you describe. Is it specyfic >> for some HW revision? > > Doesn't seem to be. We can try other HW versions. > >> >> I can imagine we can have the "last" interrupt and you will return >> IRQ_HANDLED. Will HW issue this interrupt once again? If not will we >> loose this IRQ (eg. disconnect event)? >> > > Yes it will re-assert. The interrupt line will remain asserted as long > events remain in the buffer and it is not masked. So when we > eventually unmask, the interrupt will be reasserted again immediately. > >> BTW, other option here could be using (like Baolin propose some time >> ago) dwc->lock in top half while this is held for BH. >> Question how long spin_lock() will be held in top half... >> I am not sure what option is the best. > > That won't help in this case since you can still have two calls top > top-half before a call to bottom. The lock only prevents the two from > stepping on each other, but that should already be the case without > needing the lock. > Really can we have two TH calls before BH? Interesting case :) You suggest we have something like this: dwc3 IRQ kernel irq_mask dwc3 TH mask dwc3 interrupts get evt->count, evt->cache write to hw (count) return WAKE_THREAD kernel irq_unmask a) Before BH start we get another dwc3 IRQ? b) CPU0 starts with BH while we get dwc3 IRQ on CPU1? *) in normal case BH should start and in the end unmask dwc3_interrupts and after that we could get new irq If this could happen then for sure you need dwc->lock protection in TH while base on your description CPU0 can execute BH and CPU1 can handle TH - so you have to protect dwc3_event_buffer. Could you describe this more? How to reproduce? Do you think we introduce this when adding evt->cache in TH? Even without evt->cache we still could overwrite evt->count - so, was that seen before evt->cache? BR Janusz > Regards, > John > -- > To unsubscribe from this list: send the line "unsubscribe linux-usb" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html