Hi,

Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> wrote on Thu, Sep 26, 2019 at 4:19 PM:
>
> On 26.9.2019 8.45, Felipe Balbi wrote:
> >
> > Hi,
> >
> > David Laight <David.Laight@xxxxxxxxxx> writes:
> >> From: Mathias Nyman
> >>> Sent: 25 September 2019 15:48
> >>>
> >>> On 24.9.2019 17.45, alex zheng wrote:
> >>>> Hi Mathias,
> >> ...
> >>> Logs show your transfer ring has four segments, but hardware fails to
> >>> jump from the last segment back to first)
> >>>
> >>> Last TRB (LINK TRB) of each segment points to the next segment,
> >>> last segments link trb points back to first segment.
> >>>
> >>> In your case:
> >>> 0x1d117000 -> 0x1eb09000 -> 0x1eb0a000 -> 0x1dbda000 -> (back to 0x1d117000)
> >>>
> >>> For some reason your hardware doesn't treat the last TRB at the last segment
> >>> as a LINK TRB, instead it just issues a transfer event for it, and continues to
> >>> the next address instead of jumping back to first segment:
> >>
> >> That could be a cache coherency (or flushing (etc)) issue.
>
> The Link TRB is written very early, right after the ring segment is allocated,
> and before any other TRBs. 255 other TRBs were written and handled by hw
> on this segment after this, so not very likely a flushing/cache coherency issue.
>

I added a flush_cache_all() after every queue_trb() call, but it made no
difference, so this does not seem to be a flushing/cache coherency issue.
The flush was added like this:

        inc_enq(xhci, ring, more_trbs_coming);
+       flush_cache_all();

> >
> > XHCI has a HW-configurable maximum number of segments in a ring. AFAICT,
> > xhci driver doesn't take that into consideration today. Perhaps the HW
> > in question doesn't like more than 3 segments.
> >
> > Mathias, what was the register to check this? Do you remember?
> >
>
> I only recall a limit for the event ring in the HSCPARAMS2 register(ERST MAX),
> not for transfer rings.
>
> Other things to look at would be
>
> - check that Toggle Cycle bit is correct for last segments link TRB (incomplete logs)

I captured another error log; the more complete log is in the attached file
(transfer_error_0926.cap). In that log, the erroneous link TRB is:

0x1d00dff0: TRB 000000001d068000 status 'Invalid' len 0 slot 0 ep 0 type 'Link' flags e:c

and the last segment's link TRB is:

0x1eb0aff0: TRB 000000001d00d000 status 'Invalid' len 0 slot 0 ep 0 type 'Link' flags e:C

> - some old xHCI hardware needed the Chain bit set in link TRB for some isoc rings

The xHCI version is 1.1:

[    6.888570] c1 46 (kworker/u8:1) xhci-hcd xhci-hcd.0.auto: HCIVERSION: 0x110

> - was ring recently expanded?, usually rings start with only two segments

The extra segments are added after the raw data test has run for a while,
especially once the RNDIS test (iperf3) starts running.

Other info:

1. This issue seems to happen only when the raw bulk data test and the RNDIS
   test (on another pair of endpoints) run at the same time, and it happens
   more often if we queue TRBs more quickly.
2. The raw bulk data test case is a libusb test that uses ep4 (in) and
   ep3 (out) to transfer raw bulk data, and I use iperf3 (TCP) to test USB RNDIS.
3. For readability, the attached log file only shows the ep4 (in)
   enqueue/dequeue log.
4. More test results:
   1) run just one raw bulk data test --> (always fine)
   2) run the raw bulk data test + the RNDIS test at the same time --> (transfer error in 10 minutes)
   3) run two raw bulk data tests at the same time (with two pairs of endpoints) --> (transfer error in 10 minutes)
5. I tried modifying DWC3 hardware registers such as the TX/RX FIFO sizes and
   GTXTHRCFG/GRXTHRCFG, but that did not help either.
6. Related interface info:

8801 I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host
8802 E:  Ad=82(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
8803 I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host -----> used in rndis test
8804 E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
8805 E:  Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
8809 I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=43 Prot=01 Driver=(none) -----> used in raw bulk test
8810 E:  Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
8811 E:  Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
8820 I:* If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=43 Prot=01 Driver=(none) ----> used in double raw bulk test
8821 E:  Ad=06(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
8822 E:  Ad=88(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms

It seems that there is some conflict when multiple endpoints are active at the
same time on our SoC. Is there anything else I can try?

>
> Mathias
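
For anyone following the thread, the ring layout Mathias describes (each
segment's last TRB is a link TRB pointing at the next segment, and the last
segment's link TRB points back to the first) can be written down as a small
consistency check. This is only a rough sketch and not the driver's code: the
names sketch_link, sketch_link_trb_addr and sketch_check_ring are hypothetical,
the segment size assumes 256 TRBs of 16 bytes each, and it assumes the Toggle
Cycle flag is expected only on the last segment's link TRB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 255 normal TRBs plus 1 link TRB per segment, 16 bytes each. */
#define SKETCH_TRBS_PER_SEG 256u
#define SKETCH_TRB_SIZE     16u
#define SKETCH_SEG_BYTES    (SKETCH_TRBS_PER_SEG * SKETCH_TRB_SIZE)

/* Hypothetical, simplified view of one segment's link TRB. */
struct sketch_link {
    uint64_t next_seg;     /* address the link TRB points at */
    bool     toggle_cycle; /* Toggle Cycle flag of the link TRB */
};

/* Address of the link TRB, i.e. the last TRB slot of a segment. */
uint64_t sketch_link_trb_addr(uint64_t seg_base)
{
    return seg_base + SKETCH_SEG_BYTES - SKETCH_TRB_SIZE;
}

/*
 * Check that every link TRB points at the next segment, that the last one
 * points back to the first, and that only the last one has Toggle Cycle set.
 * Returns the index of the first inconsistent segment, or -1 if none.
 */
int sketch_check_ring(const uint64_t *seg, const struct sketch_link *link,
                      size_t nsegs)
{
    assert(nsegs > 0);
    for (size_t i = 0; i < nsegs; i++) {
        bool last = (i == nsegs - 1);
        uint64_t expected = seg[(i + 1) % nsegs]; /* wraps to first segment */

        if (link[i].next_seg != expected)
            return (int)i;
        if (link[i].toggle_cycle != last)
            return (int)i;
    }
    return -1;
}
```

Fed with the four segment addresses quoted above (0x1d117000, 0x1eb09000,
0x1eb0a000, 0x1dbda000), a check like this passes only if the fourth link TRB
points back to 0x1d117000 and carries the Toggle Cycle flag. It only checks
what software wrote into memory; it cannot show whether the controller
actually honours the link TRB.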