Hi Paul, Thanks for your feedback. On 2024-06-13 13:29:30 +0100, Paul Barker wrote: > On 13/06/2024 11:41, Niklas Söderlund wrote: > > Add initial support for Renesas Ethernet-TSN End-station device of R-Car > > V4H. The Ethernet End-station can connect to an Ethernet network using a > > 10 Mbps, 100 Mbps, or 1 Gbps full-duplex link via MII/GMII/RMII/RGMII. > > Depending on the connected PHY. > > > > The driver supports Rx checksum and offload and hardware timestamps. > > > > While full power management and suspend/resume is not yet supported the > > driver enables runtime PM in order to enable the module clock. While > > explicit clock management using clk_enable() would suffice for the > > supported SoC, the module could be reused on SoCs where the module is > > part of a power domain. > > > > Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@xxxxxxxxxxxx> > > --- > > * Changes since v7 > > - Properly handle Rx when netpoll is called with 0 budget. Previously > > one Rx packet would be processed in this case. > > - Use napi_alloc_skb() instead of netdev_alloc_skb() in Rx. > > - Use device implemented 64 bit stats instead of using stats member in > > struct net_device. > > - Implement ndo_hwtstamp_{g,s}et instead of acting on SIOC{G,S}HWTSTAMP > > in do_ioctl(). > > - Fix incorrect error sign when using PTR_ERR(). > > - Add missing cosmetic space when creating struct of_device_id. > > - Rename label error_alloc to error_free. > > > > * Changes since v6 > > - Fix warning added when removing delays depending on phy-mode logic. > > > > * Changes since v5 > > - Removed delays depending on phy-mode logic. This is only needed on > > some SoCs (V4H multiple boards), and currently we can't test on them. > > As this have been a hot topic in review drop it for now so we can > > support V4H single boards which we can test and lets work on the delay > > on-top when we can test it. > > > > * Changes since v4 > > - Enable GPOUT_RDM and GPOUT_TDM delays depending on phy-mode. > > > > * Changes since v3 > > - Add description to commit message why PM operations are used instead > > of explicit management of the clock. > > > > * Changes since v2 > > - Drop extra call to ether_setup(), already called by > > alloc_etherdev_mqs(). > > - Remove dependency on MII. > > - Improve error paths when requestion IRQs. > > - Correct the interpretation of which phy-mode to apply delays for, and > > mask the phy-mode passed to the PHY if MAC adds delays. > > > > * Changes since v1 > > - Remove C45 read/write functions, the phy-core can do that for us. > > - Rework the driver to register the MDIO bus at probe time instead of at > > ndo_open(). > > - Rework rtsn_rx() to take advantages of the improved design proposed > > and upstreamed for R-Car RAVB driver Rx code-path. > > - Use napi_complete_done() instead of napi_complete(). > > - Update commit message to list that the driver supports Rx CSUM > > offload. > > - Drop unneeded __iowmb() left from development as well as a stray ;. > > - Consider all RGMII modes. > > - Move the phy_stop() call to mirror where phy_start() is called. > > - Forward IOCTLS from rtsn_do_ioctl() to PHY using ndo_eth_ioctl() > > - Align variable names to match core for IOCTLS functions. > > - Make sure DMA mask and PTP is registered before registering the ndev. > > - Document why the RTSN driver needs to be able to apply delays > > - Add checks for which phy-modes the MAC driver can apply delays > > - Use snprintf() instead of sprintf() > > > > * Changes since RFC > > - Fix issues in MDIO communication. > > - Use a dedicated OF node for the MDIO bus. > > --- > > MAINTAINERS | 8 + > > drivers/net/ethernet/renesas/Kconfig | 10 + > > drivers/net/ethernet/renesas/Makefile | 2 + > > drivers/net/ethernet/renesas/rtsn.c | 1381 +++++++++++++++++++++++++ > > drivers/net/ethernet/renesas/rtsn.h | 464 +++++++++ > > 5 files changed, 1865 insertions(+) > > create mode 100644 drivers/net/ethernet/renesas/rtsn.c > > create mode 100644 drivers/net/ethernet/renesas/rtsn.h > > [snip] > > > +static int rtsn_rx(struct net_device *ndev, int budget) > > +{ > > + struct rtsn_private *priv = netdev_priv(ndev); > > + unsigned int ndescriptors; > > + unsigned int rx_packets; > > + unsigned int i; > > + bool get_ts; > > + > > + get_ts = priv->ptp_priv->tstamp_rx_ctrl & > > + RCAR_GEN4_RXTSTAMP_TYPE_V2_L2_EVENT; > > + > > + ndescriptors = priv->dirty_rx + priv->num_rx_ring - priv->cur_rx; > > + rx_packets = 0; > > + for (i = 0; i < ndescriptors; i++) { > > + const unsigned int entry = priv->cur_rx % priv->num_rx_ring; > > + struct rtsn_ext_ts_desc *desc = &priv->rx_ring[entry]; > > + struct sk_buff *skb; > > + dma_addr_t dma_addr; > > + u16 pkt_len; > > + > > + /* Stop processing descriptors if budget is consumed. */ > > + if (rx_packets >= budget) > > + break; > > + > > + /* Stop processing descriptors on first empty. */ > > + if ((desc->die_dt & DT_MASK) == DT_FEMPTY) > > + break; > > + > > + dma_rmb(); > > + pkt_len = le16_to_cpu(desc->info_ds) & RX_DS; > > + > > + skb = priv->rx_skb[entry]; > > + priv->rx_skb[entry] = NULL; > > + dma_addr = le32_to_cpu(desc->dptr); > > + dma_unmap_single(ndev->dev.parent, dma_addr, PKT_BUF_SZ, > > + DMA_FROM_DEVICE); > > + > > + /* Get timestamp if enabled. */ > > + if (get_ts) { > > + struct skb_shared_hwtstamps *shhwtstamps; > > + struct timespec64 ts; > > + > > + shhwtstamps = skb_hwtstamps(skb); > > + memset(shhwtstamps, 0, sizeof(*shhwtstamps)); > > + > > + ts.tv_sec = (u64)le32_to_cpu(desc->ts_sec); > > + ts.tv_nsec = le32_to_cpu(desc->ts_nsec & cpu_to_le32(0x3fffffff)); > > + > > + shhwtstamps->hwtstamp = timespec64_to_ktime(ts); > > + } > > + > > + skb_put(skb, pkt_len); > > + skb->protocol = eth_type_trans(skb, ndev); > > + netif_receive_skb(skb); > > + > > + /* Update statistics. */ > > + priv->stats.rx_packets++; > > I think it's better to do `priv->stats.rx_packets += rx_packets` once at > the end of this function instead of repeatedly incrementing another > variable on the hot path. We could do that, but I opted for this as it makes more sens to update rx_packets and rx_bytes in the same location IMHO. > > > + priv->stats.rx_bytes += pkt_len; > > + > > + /* Update counters. */ > > + priv->cur_rx++; > > + rx_packets++; > > + } > > + > > + /* Refill the RX ring buffers */ > > + for (; priv->cur_rx - priv->dirty_rx > 0; priv->dirty_rx++) { > > + const unsigned int entry = priv->dirty_rx % priv->num_rx_ring; > > + struct rtsn_ext_ts_desc *desc = &priv->rx_ring[entry]; > > + struct sk_buff *skb; > > + dma_addr_t dma_addr; > > + > > + desc->info_ds = cpu_to_le16(PKT_BUF_SZ); > > + > > + if (!priv->rx_skb[entry]) { > > + skb = napi_alloc_skb(&priv->napi, > > + PKT_BUF_SZ + RTSN_ALIGN - 1); > > + if (!skb) > > + break; > > + skb_reserve(skb, NET_IP_ALIGN); > > + dma_addr = dma_map_single(ndev->dev.parent, skb->data, > > + le16_to_cpu(desc->info_ds), > > + DMA_FROM_DEVICE); > > + if (dma_mapping_error(ndev->dev.parent, dma_addr)) > > + desc->info_ds = cpu_to_le16(0); > > + desc->dptr = cpu_to_le32(dma_addr); > > + skb_checksum_none_assert(skb); > > + priv->rx_skb[entry] = skb; > > + } > > + > > + dma_wmb(); > > + desc->die_dt = DT_FEMPTY | D_DIE; > > + } > > + > > + priv->rx_ring[priv->num_rx_ring].die_dt = DT_LINK; > > + > > + return rx_packets; > > +} > > [snip] > > > +static netdev_tx_t rtsn_start_xmit(struct sk_buff *skb, struct net_device *ndev) > > +{ > > + struct rtsn_private *priv = netdev_priv(ndev); > > + struct rtsn_ext_desc *desc; > > + int ret = NETDEV_TX_OK; > > + unsigned long flags; > > + dma_addr_t dma_addr; > > + int entry; > > + > > + spin_lock_irqsave(&priv->lock, flags); > > + > > + if (priv->cur_tx - priv->dirty_tx > priv->num_tx_ring) { > > + netif_stop_subqueue(ndev, 0); > > + ret = NETDEV_TX_BUSY; > > + goto out; > > + } > > + > > + if (skb_put_padto(skb, ETH_ZLEN)) > > + goto out; > > + > > + dma_addr = dma_map_single(ndev->dev.parent, skb->data, skb->len, > > + DMA_TO_DEVICE); > > + if (dma_mapping_error(ndev->dev.parent, dma_addr)) { > > + dev_kfree_skb_any(skb); > > + goto out; > > + } > > + > > + entry = priv->cur_tx % priv->num_tx_ring; > > + priv->tx_skb[entry] = skb; > > + desc = &priv->tx_ring[entry]; > > + desc->dptr = cpu_to_le32(dma_addr); > > + desc->info_ds = cpu_to_le16(skb->len); > > Should we check against the maximum TX frame size supported by the > hardware here? > > Whatever we do here, we should also do in the ravb driver as that makes > a similar cpu_to_le16() call to fill the DS field with no check that the > HW actually supports transmitting a frame of the given size. Compared to RAVB the RTSN driver do not support splitting a packet over multiple descriptors, so the max frame size adhering to the MTU will always fit using a single descriptor. > > > + desc->info1 = cpu_to_le64(skb->len); > > + > > + if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) { > > + skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS; > > + priv->ts_tag++; > > + desc->info_ds |= cpu_to_le16(TXC); > > + desc->info = priv->ts_tag; > > + } > > + > > + skb_tx_timestamp(skb); > > + dma_wmb(); > > + > > + desc->die_dt = DT_FSINGLE | D_DIE; > > + priv->cur_tx++; > > + > > + /* Start xmit */ > > + rtsn_write(priv, TRCR0, BIT(TX_CHAIN_IDX)); > > +out: > > + spin_unlock_irqrestore(&priv->lock, flags); > > + return ret; > > +} > > Thanks, > > -- > Paul Barker -- Kind Regards, Niklas Söderlund