On 19-09-23 16:46:56, Mathias Nyman wrote: > On 23.9.2019 14.19, Mathias Nyman wrote: > > On 19.9.2019 16.59, Suwan Kim wrote: > > > On Thu, Sep 19, 2019 at 05:54:25PM +0800, Peter Chen wrote: > > > > > On 17.9.2019 12.55, Peter Chen wrote: > > > > > > > > > > > > > > > > I met "event ring full error" like below, this error is met when > > > > > > > > I do iozone test on UAS storage at v4.19.35 kernel, but not meet > > > > > > > > this error at linux-next tree (08/24). The same host and test > > > > > > > > UAS storage device are used. This issue is due to xhci_handle_event > > > > > > > > does not return 0 long time, maybe the xHC speed is fast enough > > > > > > > > at that time. If I force the xhci_handle_event only run 100 times > > > > > > > > before update ERST dequene pointer, it will not occur this error. > > > > > > > > I did not see any changes for xhci_handle_event at the latest code, > > > > > > > > so in theory, it should have this issue too. Do you think if we need > > > > > > > > to improve xhci_handle_event to avoid event ring? > > > > > > > > > > > > > The root cause is UAS protocol is very fast > > > > > > protocol, the > > > > > > other threads at non-CPU0 will add TRBs during we are handling event, so if > > > > > > hardware (xHC) has always consumed TD the non-CPU0s are adding, > > > > > > the ERST dequene pointer never get change to update, then this > > > > > > "event ring full" error will occur. > > > > > > > > > > > > The one reason why v4.19 has this issue is the max request length is larger > > > > > > than the latest kernel. At v4.19, it is 512KB, At the latest kernel, > > > > > > it is 256 KB. > > > > > > see /sys/block/sda/queue/max_sectors_kb. > > > > > > When I change max_sectors_kb as smaller value, the test will be more smooth. > > > > > > Besides, At v4.19, the UAS completion handler seems take more time > > > > > > compares to the latest kernel. > > > > > > > > > > > > I suggest adding threshold flag for event ring when it closes to full > > > > > > since we can't > > > > > > avoid USB3 use cases when the throughput is high, but the system is a > > > > > > little slow. > > > > > > Do you agree? > > > > > > > > > > I agree that it makes sense to force a ERDP write after handling some amount > > > > > of events, it can solve some event ring full issues, but not the fact that > > > > > we spend a lot of time in the interrupt handler. > > > > > > > > Ok, I will proposal one patch to fix event ring full issue. > > > > Great > > > > > > > > > > > > > > > > Your logs show that you have TDs containing up to 128 TRBs. > > > > > When a TD like that finishes the driver will increase the sw dequeue pointer of the > > > > > transfer ring one by one until we reach the end of the TD. > > > > > > > > > > This means we call inc_deq() function 128 times in interrupt context, and each time > > > > > do a few comparisons. According to traces this takes ~120us. There might be some > > > > > tracing overhead but this could anyway be done in a saner way. > > > > > > > > > > I'll look into this > > > > > > > > > > > > > Since we use hard irq for xHCI, for high performance protocol, it may hard to > > > > reduce interrupt context time since we have lots of request handling, > > > > cache operation, > > > > and completion are interrupt context. > > > > I'm working on one improvement at the moment, it would be great if you could test > > it out once i get it done. > > Got something done on top of 5.3. > It's in my tree in the irqoff_optimization branch > > git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git irqoff_optimization > > Does it help at all in your case? > I tested your patch, I am afraid it doesn't help on my case. At my case, the time is consumed at DMA unmap operation and UAS completion, but not xHCI internal code. I have run UAS iozone and iperf tests, it doesn't show error on top of below three patches. usb: host: xhci: update event ring dequeue pointer on purpose usb: host: xhci: Support running urb giveback in tasklet context xhci: remove extra loop in interrupt context -- Thanks, Peter Chen