On Wed, Dec 04, 2024 at 05:45:43PM +0000, Russell King (Oracle) wrote:
> On Wed, Dec 04, 2024 at 05:02:19PM +0000, Jon Hunter wrote:
> > Hi Russell,
> >
> > On 04/12/2024 16:39, Russell King (Oracle) wrote:
> > > On Wed, Dec 04, 2024 at 04:58:34PM +0100, Thierry Reding wrote:
> > > > This doesn't match the location from earlier, but at least there's
> > > > something afoot here that needs fixing. I suppose this could simply be
> > > > hiding any subsequent errors, so once this is fixed we might see other
> > > > similar issues.
> > >
> > > Well, having a quick look at this, the first thing which stands out is:
> > >
> > > In stmmac_tx_clean(), we have:
> > >
> > > 	if (likely(tx_q->tx_skbuff_dma[entry].buf &&
> > > 		   tx_q->tx_skbuff_dma[entry].buf_type != STMMAC_TXBUF_T_XDP_TX)) {
> > > 		if (tx_q->tx_skbuff_dma[entry].map_as_page)
> > > 			dma_unmap_page(priv->device,
> > > 				       tx_q->tx_skbuff_dma[entry].buf,
> > > 				       tx_q->tx_skbuff_dma[entry].len,
> > > 				       DMA_TO_DEVICE);
> > > 		else
> > > 			dma_unmap_single(priv->device,
> > > 					 tx_q->tx_skbuff_dma[entry].buf,
> > > 					 tx_q->tx_skbuff_dma[entry].len,
> > > 					 DMA_TO_DEVICE);
> > > 		tx_q->tx_skbuff_dma[entry].buf = 0;
> > > 		tx_q->tx_skbuff_dma[entry].len = 0;
> > > 		tx_q->tx_skbuff_dma[entry].map_as_page = false;
> > > 	}
> > >
> > > So, tx_skbuff_dma[entry].buf is expected to point appropriately to the
> > > DMA region.
> > >
> > > Now if we look at stmmac_tso_xmit():
> > >
> > > 	des = dma_map_single(priv->device, skb->data, skb_headlen(skb),
> > > 			     DMA_TO_DEVICE);
> > > 	if (dma_mapping_error(priv->device, des))
> > > 		goto dma_map_err;
> > >
> > > 	if (priv->dma_cap.addr64 <= 32) {
> > > 		...
> > > 	} else {
> > > 		...
> > > 		des += proto_hdr_len;
> > > 		...
> > > 	}
> > >
> > > 	tx_q->tx_skbuff_dma[tx_q->cur_tx].buf = des;
> > > 	tx_q->tx_skbuff_dma[tx_q->cur_tx].len = skb_headlen(skb);
> > > 	tx_q->tx_skbuff_dma[tx_q->cur_tx].map_as_page = false;
> > > 	tx_q->tx_skbuff_dma[tx_q->cur_tx].buf_type = STMMAC_TXBUF_T_SKB;
> > >
> > > This will result in stmmac_tx_clean() calling dma_unmap_single() using
> > > "des" and "skb_headlen(skb)" as the buffer start and length.
> > >
> > > One of the requirements of the DMA mapping API is that the DMA handle
> > > returned by the map operation will be passed into the unmap function.
> > > Not something that was offset. The length will also be the same.
> > >
> > > We can clearly see above that there is a case where the DMA handle has
> > > been offset by proto_hdr_len, and when this is so, the value that is
> > > passed into the unmap operation no longer matches this requirement.
> > >
> > > So, a question to the reporter - what is the value of
> > > priv->dma_cap.addr64 in your failing case? You should see the value
> > > in the "Using %d/%d bits DMA host/device width" kernel message.
> >
> > It is ...
> >
> > dwc-eth-dwmac 2490000.ethernet: Using 40/40 bits DMA host/device width
>
> So yes, "des" is being offset, which will upset the unmap operation.
> Please try the following patch, thanks:
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 9b262cdad60b..c81ea8cdfe6e 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -4192,8 +4192,8 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
>  	struct stmmac_txq_stats *txq_stats;
>  	struct stmmac_tx_queue *tx_q;
>  	u32 pay_len, mss, queue;
> +	dma_addr_t tso_des, des;
>  	u8 proto_hdr_len, hdr;
> -	dma_addr_t des;
>  	bool set_ic;
>  	int i;
>  
> @@ -4289,14 +4289,15 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
>  
>  		/* If needed take extra descriptors to fill the remaining payload */
>  		tmp_pay_len = pay_len - TSO_MAX_BUFF_SIZE;
> +		tso_des = des;
>  	} else {
>  		stmmac_set_desc_addr(priv, first, des);
>  		tmp_pay_len = pay_len;
> -		des += proto_hdr_len;
> +		tso_des = des + proto_hdr_len;
>  		pay_len = 0;
>  	}
>  
> -	stmmac_tso_allocator(priv, des, tmp_pay_len, (nfrags == 0), queue);
> +	stmmac_tso_allocator(priv, tso_des, tmp_pay_len, (nfrags == 0), queue);
>  
> 	/* In case two or more DMA transmit descriptors are allocated for this
> 	 * non-paged SKB data, the DMA buffer address should be saved to

I see, that makes sense. Looks like this has been broken for a few years
(since commit 34c15202896d ("net: stmmac: Fix the problem of tso_xmit"))
and Furong's patch ended up exposing it.

Anyway, this seems to fix it for me. I can usually trigger the issue
within one or two iperf runs; with your patch I haven't seen it break
after a dozen or so runs. It may be good to have Jon's test results as
well, but it looks good so far.

Thanks!
Thierry