<snip>
> Minor nit: please use the reverse x-mas tree order

Sure, will fix it in next version.

<snip>
> You can avoid some duplicate code with:
> 	goto drop_skb;
>
> and adding at the bottom of this function:
>
> drop_skb:
> 	dev_kfree_skb_any(skb);
> 	return NETDEV_TX_OK;
>

Agree, will switch to it in next version.

<snip>
> You forgot to add the netif_txq_maybe_stop() call, as suggested by Jakub
> in the previous revision.

I was under the impression that these are multi-queue specific APIs, so I
skipped them. But I revisited the implementation and it is clear now; I have
modified the driver to use these lockless queue stopping/waking helpers:

+	txq = skb_get_tx_queue(lp->ndev, skb);
+	netdev_tx_sent_queue(txq, skb->len);
+	netif_txq_maybe_stop(txq, CIRC_SPACE(lp->tx_ring_head, lp->tx_ring_tail,
+			     TX_BD_NUM_MAX), MAX_SKB_FRAGS + 1, 2 * MAX_SKB_FRAGS);

However, in the netperf benchmark (TCP TX) I am seeing a dip in performance
(~35-40 Mbps) when switching to these stop/wake helpers. Is this expected,
considering the extra logic for the dynamic queue limits (BQL) accounting and
these helpers? Also, during the throughput benchmarking the queue was never
actually stopped or woken up. (A rough sketch of the completion-side pairing
is at the end of this mail.)

Throughput (10^6 bits/sec):
915.55   (v8 - without using lockless queue stop/wake helpers)

====== Switch to lockless queue stop/wake helpers

# netperf -H 192.168.10.20 -t TCP_STREAM
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.20 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

131072  16384  16384    10.02    876.94

>
> > +
> > +	skbuf_dma->skb = skb;
> > +	skbuf_dma->sg_len = sg_len;
> > +	dma_tx_desc->callback_param = lp;
> > +	dma_tx_desc->callback_result = axienet_dma_tx_cb;
> > +	dmaengine_submit(dma_tx_desc);
> > +	dma_async_issue_pending(lp->tx_chan);
> > +
> > +	return NETDEV_TX_OK;
> > +
> > +xmit_error_unmap_sg:
> > +	dma_unmap_sg(lp->dev, skbuf_dma->sgl, sg_len, DMA_TO_DEVICE);
>
> If you need to drop the skb (as I suspect), you can reuse the drop_skb label
> here:
>
> drop_skb:
> 	dev_kfree_skb_any(skb);
>
> > +	return NETDEV_TX_OK;

Yes, will make this change; it will also fix the skb leak.

> > +}
> > +
> >  /**
> >   * axienet_tx_poll - Invoked once a transmit is completed by the
> >   * Axi DMA Tx channel.
> > @@ -893,6 +1028,42 @@ axienet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> >  	return NETDEV_TX_OK;
> >  }
> >
> > +/**
> > + * axienet_dma_rx_cb - DMA engine callback for RX channel.
> > + * @data: Pointer to the skbuf_dma_descriptor structure.
> > + * @result: error reporting through dmaengine_result.
> > + * This function is called by dmaengine driver for RX channel to notify
> > + * that the packet is received.
> > + */
> > +static void axienet_dma_rx_cb(void *data, const struct dmaengine_result *result)
> > +{
> > +	struct axienet_local *lp = data;
> > +	struct skbuf_dma_descriptor *skbuf_dma;
> > +	size_t meta_len, meta_max_len, rx_len;
> > +	struct sk_buff *skb;
> > +	u32 *app_metadata;
>
> Minor nit: please respect the reverse x-mas tree order

Yes, will fix it in next version.
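For reference, the reordered (longest-to-shortest) declarations would look
like the below; only the ordering changes, nothing functional:

	struct skbuf_dma_descriptor *skbuf_dma;
	size_t meta_len, meta_max_len, rx_len;
	struct axienet_local *lp = data;
	struct sk_buff *skb;
	u32 *app_metadata;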
>
> > +
> > +	skbuf_dma = axienet_get_rx_desc(lp, lp->rx_ring_tail++);
> > +	skb = skbuf_dma->skb;
> > +	app_metadata = dmaengine_desc_get_metadata_ptr(skbuf_dma->desc, &meta_len,
> > +						       &meta_max_len);
> > +	dma_unmap_single(lp->dev, skbuf_dma->dma_address, lp->max_frm_size,
> > +			 DMA_FROM_DEVICE);
> > +	/* TODO: Derive app word index programmatically */
> > +	rx_len = (app_metadata[LEN_APP] & 0xFFFF);
> > +	skb_put(skb, rx_len);
> > +	skb->protocol = eth_type_trans(skb, lp->ndev);
> > +	skb->ip_summed = CHECKSUM_NONE;
> > +
> > +	__netif_rx(skb);
>
> It's a pity you can't leverage NAPI here.
>
> I think that could be doable as a follow-up, but I'm unsure if that would fit
> the DMA engine model: in this callback you could cache the ready dma index
> (a single range should suffice) and schedule the napi instance. The actual
> dma processing will be done in napi poll.
>
> Another possible follow-up could be introducing a "bulk" RX callback in the
> DMA engine, to mitigate the indirect call overhead on a burst of RX DMA
> completion - assuming the DMA engine actually generates such burst.

Agree, these are possible follow-ups; I will start working on them once this
baseline dmaengine support series is done.

>
> > +	u64_stats_update_begin(&lp->rx_stat_sync);
> > +	u64_stats_add(&lp->rx_packets, 1);
> > +	u64_stats_add(&lp->rx_bytes, rx_len);
> > +	u64_stats_update_end(&lp->rx_stat_sync);
> > +	axienet_rx_submit_desc(lp->ndev);
> > +	dma_async_issue_pending(lp->rx_chan);
> > +}
> > +
> >  /**
> >   * axienet_rx_poll - Triggered by RX ISR to complete the BD processing.
> >   * @napi: Pointer to NAPI structure.
> > @@ -1126,6 +1297,150 @@ static irqreturn_t axienet_eth_irq(int irq, void *_ndev)
> >
> >  static void axienet_dma_err_handler(struct work_struct *work);
> >
> > +/**
> > + * axienet_rx_submit_desc - Submit the rx descriptors to dmaengine.
> > + * allocate skbuff, map the scatterlist and obtain a descriptor
> > + * and then add the callback information and submit descriptor.
> > + *
> > + * @ndev:	net_device pointer
> > + *
> > + *Return: 0, on success.
> > + *	   non-zero error value on failure
> > + */
> > +static int axienet_rx_submit_desc(struct net_device *ndev)
> > +{
> > +	struct dma_async_tx_descriptor *dma_rx_desc = NULL;
> > +	struct axienet_local *lp = netdev_priv(ndev);
> > +	struct skbuf_dma_descriptor *skbuf_dma;
> > +	struct sk_buff *skb;
> > +	dma_addr_t addr;
> > +	int ret;
> > +
> > +	skbuf_dma = axienet_get_rx_desc(lp, lp->rx_ring_head);
> > +	if (!skbuf_dma)
> > +		return -ENOMEM;
>
> Minor nit: here a newline would make the core more readable

Accepted, will add it in the next version.

>
> > +	lp->rx_ring_head++;
> > +	skb = netdev_alloc_skb(ndev, lp->max_frm_size);
> > +	if (!skb)
> > +		return -ENOMEM;
>
> Another possible follow-up: usually the skb header is allocated just before
> sending it to the network stack (e.g. just before the __netif_rx() call) to
> be cache friendly. Here you could allocate just the data part and later use
> e.g. build_skb_around()

Sure, will explore it.
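Just to confirm my understanding, something along these lines (rough, untested
sketch; buffer sizing/headroom handling is only illustrative, and skbuf_dma->buf
would be a new field replacing skbuf_dma->skb in the submit path):

	/* submit path: DMA-map a bare data buffer, no skb yet */
	unsigned int truesize = SKB_DATA_ALIGN(lp->max_frm_size) +
				SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
	void *buf = netdev_alloc_frag(truesize);

	if (!buf)
		return -ENOMEM;
	addr = dma_map_single(lp->dev, buf, lp->max_frm_size, DMA_FROM_DEVICE);

	/* completion callback: build the skb around the filled buffer,
	 * right before handing it to the stack
	 */
	skb = build_skb(skbuf_dma->buf, truesize);
	if (unlikely(!skb)) {
		skb_free_frag(skbuf_dma->buf);
		return;
	}
	skb_put(skb, rx_len);
	skb->protocol = eth_type_trans(skb, lp->ndev);
	__netif_rx(skb);

build_skb_around() would follow the same idea, just with the struct sk_buff
allocated separately from the data buffer.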
>
> > +
> > +	sg_init_table(skbuf_dma->sgl, 1);
> > +	addr = dma_map_single(lp->dev, skb->data, lp->max_frm_size, DMA_FROM_DEVICE);
> > +	if (unlikely(dma_mapping_error(lp->dev, addr))) {
> > +		if (net_ratelimit())
> > +			netdev_err(ndev, "DMA mapping error\n");
> > +		ret = -ENOMEM;
> > +		goto rx_submit_err_free_skb;
> > +	}
> > +	sg_dma_address(skbuf_dma->sgl) = addr;
> > +	sg_dma_len(skbuf_dma->sgl) = lp->max_frm_size;
> > +	dma_rx_desc = dmaengine_prep_slave_sg(lp->rx_chan, skbuf_dma->sgl,
> > +					      1, DMA_DEV_TO_MEM,
> > +					      DMA_PREP_INTERRUPT);
> > +	if (!dma_rx_desc) {
> > +		ret = -EINVAL;
> > +		goto rx_submit_err_unmap_skb;
> > +	}
> > +
> > +	skbuf_dma->skb = skb;
> > +	skbuf_dma->dma_address = sg_dma_address(skbuf_dma->sgl);
> > +	skbuf_dma->desc = dma_rx_desc;
> > +	dma_rx_desc->callback_param = lp;
> > +	dma_rx_desc->callback_result = axienet_dma_rx_cb;
> > +	dmaengine_submit(dma_rx_desc);
> > +
> > +	return 0;
> > +
> > +rx_submit_err_unmap_skb:
> > +	dma_unmap_single(lp->dev, addr, lp->max_frm_size, DMA_FROM_DEVICE);
> > +rx_submit_err_free_skb:
> > +	dev_kfree_skb(skb);
> > +	return ret;
>
> It looks like the error code is ignored by the caller. Possibly you can change
> this to a 'void' function.

Will make it void in next version.

Thanks,
Radhey
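PS: the completion-side pairing referenced above would be along these lines
(rough, untested sketch; it assumes a TX descriptor helper analogous to
axienet_get_rx_desc(), and netif_txq_completed_wake() already performs the BQL
completion accounting, so no separate netdev_tx_completed_queue() call is
added):

	/* TX completion callback: capture txq/len before freeing the skb,
	 * then unmap and consume it, then wake the queue if needed
	 */
	skbuf_dma = axienet_get_tx_desc(lp, lp->tx_ring_tail++);
	txq = skb_get_tx_queue(lp->ndev, skbuf_dma->skb);
	len = skbuf_dma->skb->len;
	dma_unmap_sg(lp->dev, skbuf_dma->sgl, skbuf_dma->sg_len, DMA_TO_DEVICE);
	dev_consume_skb_any(skbuf_dma->skb);
	netif_txq_completed_wake(txq, 1, len,
				 CIRC_SPACE(lp->tx_ring_head, lp->tx_ring_tail,
					    TX_BD_NUM_MAX),
				 2 * MAX_SKB_FRAGS);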