On 04.06.2018 18:28, Sudip Mukherjee wrote:
On Thu, May 24, 2018 at 04:35:34PM +0300, Mathias Nyman wrote:
Logs show two rings having the same TRB segment DMA address, which will completely mess up the transfer:
While allocating rings, the enqueue pointers for the two rings are the same:
461.859315: xhci_ring_alloc: ISOC efa4e580: enq 0x0000000033386000(0x0000000033386000) deq 0x0000000033386000(0x0000000033386000) segs 2 stream 0 ...
461.859320: xhci_ring_alloc: ISOC f0ce1f00: enq 0x0000000033386000(0x0000000033386000) deq 0x0000000033386000(0x0000000033386000) segs 2 stream 0 ...
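A minimal user-space sketch of how the two xhci_ring_alloc lines above could be compared mechanically. It assumes the trace line layout shown here (the first hex value after "enq "); the helper names parse_enq_addr() and rings_share_enq() are made up for illustration and are not part of the xhci driver or any tracing tool:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the first hex address that follows " enq " in one
 * xhci_ring_alloc trace line. Returns true and fills *out on success.
 * (Hypothetical helper; assumes the line layout quoted in this mail.)
 */
static bool parse_enq_addr(const char *line, uint64_t *out)
{
	const char *p;
	unsigned long long v;

	assert(line != NULL && out != NULL);

	p = strstr(line, " enq 0x");
	if (!p)
		return false;
	/* Skip " enq " (5 characters); %llx accepts the "0x" prefix. */
	if (sscanf(p + 5, "%llx", &v) != 1)
		return false;
	*out = (uint64_t)v;
	return true;
}

/* True if both lines parse and report the same enqueue address. */
static bool rings_share_enq(const char *a, const char *b)
{
	uint64_t ea, eb;

	return parse_enq_addr(a, &ea) && parse_enq_addr(b, &eb) && ea == eb;
}
```

Feeding it the two lines above (rings efa4e580 and f0ce1f00) reports a match, since both carry 0x0000000033386000.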
So something goes really wrong when allocating or setting up the rings in one of these functions:
To verify and rule out dma_pool_zalloc(), could you apply the attached patch and reproduce with new logs?
I spoke too soon in yesterday's mail. We were able to reproduce it
in the automated tests. The log and the trace are at:
https://drive.google.com/open?id=1h-3r-1lfjg8oblBGkzdRIq8z3ZNgGZx-
Could you please have a look at it?
Odd and unlikely, but to me this looks like an issue in allocating DMA memory
from the pool using dma_pool_zalloc().
Adding people with DMA knowledge to cc; maybe someone knows what is going on.
Here's the story:
Sudip sees USB issues on an Intel Atom-based board with a 4.14.2 kernel.
All tracing points to dma_pool_zalloc() returning the same DMA address block on
consecutive calls.
In the failing case the dma_pool_zalloc() calls are 3 to 6 us apart.
<...>-26362 [002] .... 1186.756739: xhci_ring_mem_detail: MATTU xhci_segment_alloc dma @ 0x000000002d92b000
<...>-26362 [002] .... 1186.756745: xhci_ring_mem_detail: MATTU xhci_segment_alloc dma @ 0x000000002d92b000
<...>-26362 [002] .... 1186.756748: xhci_ring_mem_detail: MATTU xhci_segment_alloc dma @ 0x000000002d92b000
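A minimal user-space sketch for flagging such repeats in a larger log. It assumes the custom "xhci_segment_alloc dma @" message format shown in the lines above; parse_segment_dma(), count_repeated_dma() and MAX_SEEN are hypothetical names for illustration only. Frees are not traced, so a repeated address is only a candidate for the bug: an address handed out again after a legitimate free would be counted too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_SEEN 1024	/* arbitrary cap on distinct addresses tracked */

/* Parse the address after "xhci_segment_alloc dma @ "; 0 on success. */
static int parse_segment_dma(const char *line, uint64_t *out)
{
	static const char key[] = "xhci_segment_alloc dma @ ";
	const char *p;
	unsigned long long v;

	assert(line != NULL && out != NULL);

	p = strstr(line, key);
	if (!p || sscanf(p + sizeof(key) - 1, "%llx", &v) != 1)
		return -1;
	*out = (uint64_t)v;
	return 0;
}

/* Count lines whose DMA address already appeared in an earlier line. */
static size_t count_repeated_dma(const char *const *lines, size_t n)
{
	uint64_t seen[MAX_SEEN];
	size_t nseen = 0, repeats = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t addr;
		size_t j;

		if (parse_segment_dma(lines[i], &addr) != 0)
			continue;	/* not a segment allocation line */

		for (j = 0; j < nseen; j++)
			if (seen[j] == addr)
				break;

		if (j < nseen)
			repeats++;		/* address seen before */
		else if (nseen < MAX_SEEN)
			seen[nseen++] = addr;	/* remember new address */
	}
	return repeats;
}
```

For the three lines above it counts two repeats, because the second and third allocations return the address of the first.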
dma_pool_zalloc() is called from xhci_segment_alloc() in drivers/usb/host/xhci-mem.c
see:
https://elixir.bootlin.com/linux/v4.14.2/source/drivers/usb/host/xhci-mem.c#L52
The prints above are custom traces added right after dma_pool_zalloc():
@@ -44,10 +44,15 @@ static struct xhci_segment *xhci_segment_alloc(struct xhci_hcd *xhci,
return NULL;
}
+ xhci_dbg_trace(xhci, trace_xhci_ring_mem_detail,
+ "MATTU xhci_segment_alloc dma @ %pad", &dma);
+
Any idea what's going on?
dma_pool_alloc() has a comment saying that it drops &pool->lock if it needs to allocate
a page. Could that be related?
Thanks
-Mathias