Hi Andrea, On 11/28/23 13:00, Andrea della Porta wrote: > Hi Minas, > > On 10:46 Mon 27 Nov , Minas Harutyunyan wrote: >> Hi Andrea, >> >> On 11/27/23 13:04, Andrea della Porta wrote: >>> On 12:35 Wed 22 Nov , Minas Harutyunyan wrote: >>>> Hi Oliver, >>>> >>>> On 11/15/23 18:45, Oliver Neukum wrote: >>>>> dwc2_hc_n_intr() writes back INTMASK as read but evaluates it >>>>> with intmask applied. In stress testing this causes spurious >>>>> interrupts like this: >>>>> >>>>> [Mon Aug 14 10:51:07 2023] dwc2 3f980000.usb: dwc2_hc_chhltd_intr_dma: Channel 7 - ChHltd set, but reason is unknown >>>>> [Mon Aug 14 10:51:07 2023] dwc2 3f980000.usb: hcint 0x00000002, intsts 0x04600001 >>>>> [Mon Aug 14 10:51:08 2023] dwc2 3f980000.usb: dwc2_hc_chhltd_intr_dma: Channel 0 - ChHltd set, but reason is unknown >>>>> [Mon Aug 14 10:51:08 2023] dwc2 3f980000.usb: hcint 0x00000002, intsts 0x04600001 >>>>> [Mon Aug 14 10:51:08 2023] dwc2 3f980000.usb: dwc2_hc_chhltd_intr_dma: Channel 4 - ChHltd set, but reason is unknown >>>>> [Mon Aug 14 10:51:08 2023] dwc2 3f980000.usb: hcint 0x00000002, intsts 0x04600001 >>>>> [Mon Aug 14 10:51:08 2023] dwc2 3f980000.usb: dwc2_update_urb_state_abn(): trimming xfer length >>>>> >>>>> Applying INTMASK prevents this. The issue exists in all versions of the >>>>> driver. >>>> >>>> I'm Ok with this patch, just need some elaboration. >>>> So, channel halted interrupt asserted due to some other interrupt (AHB >>>> Error, Excessive transaction errors, Babble, Stall) which was masked. >>>> Can you check which of masked interrupts is cause of channel halt interrupt? >>>> >>>> Thanks, >>>> Minas >>>> >>> >>> Hi Minas, >>> here's the report from dmesg: >>> >>> [209384.238724] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 >>> [209384.246725] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 >>> [209384.247634] hcint 0x00000003, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 >>> [209384.254722] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 >>> [209384.262725] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 >>> [209384.270724] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 >>> [209384.278336] hcint 0x00000092, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 >>> [209384.278384] hcint 0x00000010, hcintmsk 0x00000436, hcint&hcintmsk 0x00000010 >>> [209384.278720] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 >>> [209384.286723] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 >>> [209384.288486] hcint 0x00000021, hcintmsk 0x00000426, hcint&hcintmsk 0x00000020 >> >>> [209384.288511] hcint 0x00000002, hcintmsk 0x00000406, hcint&hcintmsk 0x00000002 >> According your patch terminology above 'hcint' is a 'hcintraw' and here >> asserted only channel halted interrupt. So, channel halted without any >> reason. Not clear how your patch fix this case? >> >> Couple of questions related to your tests: >> 1. Which DMA mode used here - buffer or descriptor DMA? >> 2. Which type of transfers testing? Or you can add to this print HCCHARx >> also. >> 3. Which version of core you use (GSNPSID)? >> >> Thanks, >> Minas >> > > The augmented log is here below (hcchar[chnum] and others added): > > [330438.623017] --Host Channel Interrupt--, Channel 4 > [330438.623029] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 > [330438.623039] hcchar[4] = 0x015c9810, chan->ep_type=3 > [330438.626183] --Host Channel Interrupt--, Channel 5 > [330438.626198] hcint 0x00000003, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 > [330438.626208] hcchar[5] = 0x01d83200, chan->ep_type=2 > [330438.627659] --Host Channel Interrupt--, Channel 3 > [330438.627674] hcint 0x00000021, hcintmsk 0x00000426, hcint&hcintmsk 0x00000020 > [330438.627686] hcchar[3] = 0x01d8d200, chan->ep_type=2 > [330438.627703] --Host Channel Interrupt--, Channel 3 > [330438.627711] hcint 0x00000002, hcintmsk 0x00000406, hcint&hcintmsk 0x00000002 > [330438.627721] hcchar[3] = 0x01d8d200, chan->ep_type=2 > [330438.627740] dwc2 3f980000.usb: dwc2_hc_chhltd_intr_dma: Channel 3 - ChHltd set, but reason is unknown > [330438.627758] dwc2 3f980000.usb: hcint 0x00000002, intsts 0x04600001 > [330438.627798] --Host Channel Interrupt--, Channel 0 > [330438.627807] hcint 0x00000010, hcintmsk 0x00000436, hcint&hcintmsk 0x00000010 > [330438.627818] hcchar[0] = 0x81d8d200, chan->ep_type=2 > [330438.631017] --Host Channel Interrupt--, Channel 1 > [330438.631025] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 > [330438.631035] hcchar[1] = 0x015c9810, chan->ep_type=3 > [330438.639019] --Host Channel Interrupt--, Channel 7 > [330438.639041] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 > [330438.639053] hcchar[7] = 0x015c9810, chan->ep_type=3 > [330438.647019] --Host Channel Interrupt--, Channel 6 > [330438.647040] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 > [330438.647051] hcchar[6] = 0x015c9810, chan->ep_type=3 > [330438.649015] --Host Channel Interrupt--, Channel 4 > [330438.649020] hcint 0x00000012, hcintmsk 0x00000006, hcint&hcintmsk 0x00000002 > > we're using buffer DMA mode (dma_desc_enable=0), we are atrss testing via ping flooding > through an LTE modem attached to USB which should mostly do bulk dataflow (confirmed > by ep_type==2). From regdump, GSNPSID = 0x4f54280a. > Host channel halt interrupt with "unkown reason" (no any other interrupt seen in hcintraw) mean that channel halted because application disable channel, i.e. on urb_dequeue. Maybe "ChHltd set, but reason is unknown" message is not correct, but it's possible than channel halted interrupt asserted not because of any error condition. Channel halted interrupt asserted by databook in follow cases: "In non Scatter/Gather DMA mode, it indicates the transfer completed abnormally either because of any USB transaction error or in response to disable request by the application or because of a completed transfer." So, it's not spurious interrupt. Does this "spurious" interrupt broke your tests? If not, then doesn't need to apply your patch. Moreover, I can't understand how your patch allow to avoid case with "ChHltd set, but reason is unknown" message. Thanks, Minas > Many thanks, > Andrea > >>>>> >>>>> Signed-off-by: Oliver Neukum <oneukum@xxxxxxxx> >>>>> Tested-by: Ivan Ivanov <ivan.ivanov@xxxxxxxx> >>>>> Tested-by: Andrea della Porta <andrea.porta@xxxxxxxx> >>>>> --- >>>>> drivers/usb/dwc2/hcd_intr.c | 15 +++++++-------- >>>>> 1 file changed, 7 insertions(+), 8 deletions(-) >>>>> >>>>> diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c >>>>> index 0144ca8350c3..5c7538d498dd 100644 >>>>> --- a/drivers/usb/dwc2/hcd_intr.c >>>>> +++ b/drivers/usb/dwc2/hcd_intr.c >>>>> @@ -2015,15 +2015,17 @@ static void dwc2_hc_n_intr(struct dwc2_hsotg *hsotg, int chnum) >>>>> { >>>>> struct dwc2_qtd *qtd; >>>>> struct dwc2_host_chan *chan; >>>>> - u32 hcint, hcintmsk; >>>>> + u32 hcint, hcintraw, hcintmsk; >>>>> >>>>> chan = hsotg->hc_ptr_array[chnum]; >>>>> >>>>> - hcint = dwc2_readl(hsotg, HCINT(chnum)); >>>>> + hcintraw = dwc2_readl(hsotg, HCINT(chnum)); >>>>> hcintmsk = dwc2_readl(hsotg, HCINTMSK(chnum)); >>>>> + hcint = hcintraw & hcintmsk; >>>>> + dwc2_writel(hsotg, hcint, HCINT(chnum)); >>>>> + >>>>> if (!chan) { >>>>> dev_err(hsotg->dev, "## hc_ptr_array for channel is NULL ##\n"); >>>>> - dwc2_writel(hsotg, hcint, HCINT(chnum)); >>>>> return; >>>>> } >>>>> >>>>> @@ -2032,11 +2034,9 @@ static void dwc2_hc_n_intr(struct dwc2_hsotg *hsotg, int chnum) >>>>> chnum); >>>>> dev_vdbg(hsotg->dev, >>>>> " hcint 0x%08x, hcintmsk 0x%08x, hcint&hcintmsk 0x%08x\n", >>>>> - hcint, hcintmsk, hcint & hcintmsk); >>>>> + hcintraw, hcintmsk, hcint); >>>>> } >>>>> >>>>> - dwc2_writel(hsotg, hcint, HCINT(chnum)); >>>>> - >>>>> /* >>>>> * If we got an interrupt after someone called >>>>> * dwc2_hcd_endpoint_disable() we don't want to crash below >>>>> @@ -2046,8 +2046,7 @@ static void dwc2_hc_n_intr(struct dwc2_hsotg *hsotg, int chnum) >>>>> return; >>>>> } >>>>> >>>>> - chan->hcint = hcint; >>>>> - hcint &= hcintmsk; >>>>> + chan->hcint = hcintraw; >>>>> >>>>> /* >>>>> * If the channel was halted due to a dequeue, the qtd list might