On Fri, Sep 29, 2023 at 04:11:15PM +0000, Haiyang Zhang wrote:

...

> > > @@ -209,19 +281,6 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> > >  	pkg.wqe_req.client_data_unit = 0;
> > >
> > >  	pkg.wqe_req.num_sge = 1 + skb_shinfo(skb)->nr_frags;
> > > -	WARN_ON_ONCE(pkg.wqe_req.num_sge > MAX_TX_WQE_SGL_ENTRIES);
> > > -
> > > -	if (pkg.wqe_req.num_sge <= ARRAY_SIZE(pkg.sgl_array)) {
> > > -		pkg.wqe_req.sgl = pkg.sgl_array;
> > > -	} else {
> > > -		pkg.sgl_ptr = kmalloc_array(pkg.wqe_req.num_sge,
> > > -					    sizeof(struct gdma_sge),
> > > -					    GFP_ATOMIC);
> > > -		if (!pkg.sgl_ptr)
> > > -			goto tx_drop_count;
> > > -
> > > -		pkg.wqe_req.sgl = pkg.sgl_ptr;
> > > -	}
> >
> > It is unclear to me why this logic has moved from here to further
> > down in this function. Is it to avoid some cases where
> > allocation has to be unwound on error (when mana_fix_skb_head() fails)?
> > If so, this feels more like an optimisation than a fix.
>
> mana_fix_skb_head() may add one more sge (success case) so the sgl
> allocation should be done later. Otherwise, we need to free / re-allocate
> the array later.

Understood, thanks for the clarification.
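
For reference, the ordering constraint can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the driver code:
struct tx_pkg, fix_head(), pick_sgl(), prepare_tx() and INLINE_SGE are
hypothetical stand-ins, and calloc() takes the place of kmalloc_array().
The point is that the inline-vs-heap decision is made only once the entry
count is final.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define INLINE_SGE 5	/* hypothetical capacity of the on-stack array */

struct sge {
	unsigned long addr;
	unsigned int len;
};

struct tx_pkg {
	struct sge inline_sgl[INLINE_SGE];
	struct sge *heap_sgl;	/* NULL unless a heap array was needed */
	struct sge *sgl;	/* points at whichever array is in use */
	unsigned int num_sge;
};

/*
 * Hypothetical stand-in for the head fix-up step: if the linear area
 * holds more than the headers, it is split and needs one more entry.
 */
static int fix_head(unsigned int head_len, unsigned int hdr_len,
		    unsigned int *num_sge)
{
	if (head_len < hdr_len)
		return -1;	/* headers not fully in the linear area */
	if (head_len > hdr_len)
		(*num_sge)++;	/* success case: one extra entry */
	return 0;
}

/* Choose the array only after num_sge can no longer change. */
static int pick_sgl(struct tx_pkg *pkg)
{
	assert(pkg->num_sge > 0);

	pkg->heap_sgl = NULL;
	if (pkg->num_sge <= INLINE_SGE) {
		pkg->sgl = pkg->inline_sgl;
		return 0;
	}

	pkg->heap_sgl = calloc(pkg->num_sge, sizeof(*pkg->heap_sgl));
	if (!pkg->heap_sgl)
		return -1;

	pkg->sgl = pkg->heap_sgl;
	return 0;
}

static int prepare_tx(struct tx_pkg *pkg, unsigned int nr_frags,
		      unsigned int head_len, unsigned int hdr_len, int is_gso)
{
	pkg->num_sge = 1 + nr_frags;

	/* The count may grow here, so nothing is allocated yet. */
	if (is_gso && fix_head(head_len, hdr_len, &pkg->num_sge))
		return -1;

	return pick_sgl(pkg);
}
```

With four fragments and a linear area larger than the headers, the count
goes from 5 to 6 and no longer fits the inline array. Had the array been
chosen before the fix-up, it would have needed to be freed and re-allocated.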

> > >  	if (skb->protocol == htons(ETH_P_IP))
> > >  		ipv4 = true;
> > > @@ -229,6 +288,23 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> > >  		ipv6 = true;
> > >
> > >  	if (skb_is_gso(skb)) {
> > > +		gso_hs = mana_get_gso_hs(skb);
> > > +
> > > +		if (mana_fix_skb_head(ndev, skb, gso_hs,
> > > +				      &pkg.wqe_req.num_sge))
> > > +			goto tx_drop_count;
> > > +
> > > +		if (skb->encapsulation) {
> > > +			u64_stats_update_begin(&tx_stats->syncp);
> > > +			tx_stats->tso_inner_packets++;
> > > +			tx_stats->tso_inner_bytes += skb->len - gso_hs;
> > > +			u64_stats_update_end(&tx_stats->syncp);
> > > +		} else {
> > > +			u64_stats_update_begin(&tx_stats->syncp);
> > > +			tx_stats->tso_packets++;
> > > +			tx_stats->tso_bytes += skb->len - gso_hs;
> > > +			u64_stats_update_end(&tx_stats->syncp);
> > > +		}
> >
> > nit: I wonder if this could be slightly more succinctly written as:
> >
> > 	u64_stats_update_begin(&tx_stats->syncp);
> > 	if (skb->encapsulation) {
> > 		tx_stats->tso_inner_packets++;
> > 		tx_stats->tso_inner_bytes += skb->len - gso_hs;
> > 	} else {
> > 		tx_stats->tso_packets++;
> > 		tx_stats->tso_bytes += skb->len - gso_hs;
> > 	}
> > 	u64_stats_update_end(&tx_stats->syncp);
> >
> Yes it can be written this way:)
>
> > Also, it is unclear to me why the stats logic is moved here from
> > further down in the same block. It feels more like a clean-up than a fix
> > (as, btw, is my suggestion immediately above).
>
> Since we need to calculate the gso_hs and fix head earlier than the stats and
> some other work, I move it immediately after skb_is_gso(skb).
> The gso_hs calculation was part of the tx_stats block, so the tx_stats is moved
> together to remain close to the gso_hs calculation to keep readability.

I agree it is nice the way you have it.
I was mainly thinking that the diffstat could be made smaller,
which might be beneficial to a fix. But I have no strong feelings on that.

> > > +
> > >  	pkg.tx_oob.s_oob.is_outer_ipv4 = ipv4;
> > >  	pkg.tx_oob.s_oob.is_outer_ipv6 = ipv6;
> > >

...