Re: [net-next,v8] net: ethernet: rtsn: Add support for Renesas Ethernet-TSN

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 13/06/2024 11:41, Niklas Söderlund wrote:
> Add initial support for Renesas Ethernet-TSN End-station device of R-Car
> V4H. The Ethernet End-station can connect to an Ethernet network using a
> 10 Mbps, 100 Mbps, or 1 Gbps full-duplex link via MII/GMII/RMII/RGMII.
> Depending on the connected PHY.
> 
> The driver supports Rx checksum and offload and hardware timestamps.
> 
> While full power management and suspend/resume is not yet supported the
> driver enables runtime PM in order to enable the module clock. While
> explicit clock management using clk_enable() would suffice for the
> supported SoC, the module could be reused on SoCs where the module is
> part of a power domain.
> 
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@xxxxxxxxxxxx>
> ---
> * Changes since v7
> - Properly handle Rx when netpoll is called with 0 budget. Previously
>    one Rx packet would be processed in this case.
> - Use napi_alloc_skb() instead of netdev_alloc_skb() in Rx.
> - Use device implemented 64 bit stats instead of using stats member in
>    struct net_device.
> - Implement ndo_hwtstamp_{g,s}et instead of acting on SIOC{G,S}HWTSTAMP
>    in do_ioctl().
> - Fix incorrect error sign when using PTR_ERR().
> - Add missing cosmetic space when creating struct of_device_id.
> - Rename label error_alloc to error_free.
> 
> * Changes since v6
> - Fix warning added when removing delays depending on phy-mode logic.
> 
> * Changes since v5
> - Removed delays depending on phy-mode logic. This is only needed on
>    some SoCs (V4H multiple boards), and currently we can't test on them.
>    As this have been a hot topic in review drop it for now so we can
>    support V4H single boards which we can test and lets work on the delay
>    on-top when we can test it.
> 
> * Changes since v4
> - Enable GPOUT_RDM and GPOUT_TDM delays depending on phy-mode.
> 
> * Changes since v3
> - Add description to commit message why PM operations are used instead
>    of explicit management of the clock.
> 
> * Changes since v2
> - Drop extra call to ether_setup(), already called by
>    alloc_etherdev_mqs().
> - Remove dependency on MII.
> - Improve error paths when requestion IRQs.
> - Correct the interpretation of which  phy-mode to apply delays for, and
>    mask the phy-mode passed to the PHY if MAC adds delays.
> 
> * Changes since v1
> - Remove C45 read/write functions, the phy-core can do that for us.
> - Rework the driver to register the MDIO bus at probe time instead of at
>    ndo_open().
> - Rework rtsn_rx() to take advantages of the improved design proposed
>    and upstreamed for R-Car RAVB driver Rx code-path.
> - Use napi_complete_done() instead of napi_complete().
> - Update commit message to list that the driver supports Rx CSUM
>    offload.
> - Drop unneeded __iowmb() left from development as well as a stray ;.
> - Consider all RGMII modes.
> - Move the phy_stop() call to mirror where phy_start() is called.
> - Forward IOCTLS from rtsn_do_ioctl() to PHY using ndo_eth_ioctl()
> - Align variable names to match core for IOCTLS functions.
> - Make sure DMA mask and PTP is registered before registering the ndev.
> - Document why the RTSN driver needs to be able to apply delays
> - Add checks for which phy-modes the MAC driver can apply delays
> - Use snprintf() instead of sprintf()
> 
> * Changes since RFC
> - Fix issues in MDIO communication.
> - Use a dedicated OF node for the MDIO bus.
> ---
>   MAINTAINERS                           |    8 +
>   drivers/net/ethernet/renesas/Kconfig  |   10 +
>   drivers/net/ethernet/renesas/Makefile |    2 +
>   drivers/net/ethernet/renesas/rtsn.c   | 1381 +++++++++++++++++++++++++
>   drivers/net/ethernet/renesas/rtsn.h   |  464 +++++++++
>   5 files changed, 1865 insertions(+)
>   create mode 100644 drivers/net/ethernet/renesas/rtsn.c
>   create mode 100644 drivers/net/ethernet/renesas/rtsn.h

[snip]

> +static int rtsn_rx(struct net_device *ndev, int budget)
> +{
> +	struct rtsn_private *priv = netdev_priv(ndev);
> +	unsigned int ndescriptors;
> +	unsigned int rx_packets;
> +	unsigned int i;
> +	bool get_ts;
> +
> +	get_ts = priv->ptp_priv->tstamp_rx_ctrl &
> +		RCAR_GEN4_RXTSTAMP_TYPE_V2_L2_EVENT;
> +
> +	ndescriptors = priv->dirty_rx + priv->num_rx_ring - priv->cur_rx;
> +	rx_packets = 0;
> +	for (i = 0; i < ndescriptors; i++) {
> +		const unsigned int entry = priv->cur_rx % priv->num_rx_ring;
> +		struct rtsn_ext_ts_desc *desc = &priv->rx_ring[entry];
> +		struct sk_buff *skb;
> +		dma_addr_t dma_addr;
> +		u16 pkt_len;
> +
> +		/* Stop processing descriptors if budget is consumed. */
> +		if (rx_packets >= budget)
> +			break;
> +
> +		/* Stop processing descriptors on first empty. */
> +		if ((desc->die_dt & DT_MASK) == DT_FEMPTY)
> +			break;
> +
> +		dma_rmb();
> +		pkt_len = le16_to_cpu(desc->info_ds) & RX_DS;
> +
> +		skb = priv->rx_skb[entry];
> +		priv->rx_skb[entry] = NULL;
> +		dma_addr = le32_to_cpu(desc->dptr);
> +		dma_unmap_single(ndev->dev.parent, dma_addr, PKT_BUF_SZ,
> +				 DMA_FROM_DEVICE);
> +
> +		/* Get timestamp if enabled. */
> +		if (get_ts) {
> +			struct skb_shared_hwtstamps *shhwtstamps;
> +			struct timespec64 ts;
> +
> +			shhwtstamps = skb_hwtstamps(skb);
> +			memset(shhwtstamps, 0, sizeof(*shhwtstamps));
> +
> +			ts.tv_sec = (u64)le32_to_cpu(desc->ts_sec);
> +			ts.tv_nsec = le32_to_cpu(desc->ts_nsec & cpu_to_le32(0x3fffffff));
> +
> +			shhwtstamps->hwtstamp = timespec64_to_ktime(ts);
> +		}
> +
> +		skb_put(skb, pkt_len);
> +		skb->protocol = eth_type_trans(skb, ndev);
> +		netif_receive_skb(skb);
> +
> +		/* Update statistics. */
> +		priv->stats.rx_packets++;

I think it's better to do `priv->stats.rx_packets += rx_packets` once at
the end of this function instead of repeatedly incrementing another
variable on the hot path.

> +		priv->stats.rx_bytes += pkt_len;
> +
> +		/* Update counters. */
> +		priv->cur_rx++;
> +		rx_packets++;
> +	}
> +
> +	/* Refill the RX ring buffers */
> +	for (; priv->cur_rx - priv->dirty_rx > 0; priv->dirty_rx++) {
> +		const unsigned int entry = priv->dirty_rx % priv->num_rx_ring;
> +		struct rtsn_ext_ts_desc *desc = &priv->rx_ring[entry];
> +		struct sk_buff *skb;
> +		dma_addr_t dma_addr;
> +
> +		desc->info_ds = cpu_to_le16(PKT_BUF_SZ);
> +
> +		if (!priv->rx_skb[entry]) {
> +			skb = napi_alloc_skb(&priv->napi,
> +					     PKT_BUF_SZ + RTSN_ALIGN - 1);
> +			if (!skb)
> +				break;
> +			skb_reserve(skb, NET_IP_ALIGN);
> +			dma_addr = dma_map_single(ndev->dev.parent, skb->data,
> +						  le16_to_cpu(desc->info_ds),
> +						  DMA_FROM_DEVICE);
> +			if (dma_mapping_error(ndev->dev.parent, dma_addr))
> +				desc->info_ds = cpu_to_le16(0);
> +			desc->dptr = cpu_to_le32(dma_addr);
> +			skb_checksum_none_assert(skb);
> +			priv->rx_skb[entry] = skb;
> +		}
> +
> +		dma_wmb();
> +		desc->die_dt = DT_FEMPTY | D_DIE;
> +	}
> +
> +	priv->rx_ring[priv->num_rx_ring].die_dt = DT_LINK;
> +
> +	return rx_packets;
> +}

[snip]

> +static netdev_tx_t rtsn_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> +{
> +	struct rtsn_private *priv = netdev_priv(ndev);
> +	struct rtsn_ext_desc *desc;
> +	int ret = NETDEV_TX_OK;
> +	unsigned long flags;
> +	dma_addr_t dma_addr;
> +	int entry;
> +
> +	spin_lock_irqsave(&priv->lock, flags);
> +
> +	if (priv->cur_tx - priv->dirty_tx > priv->num_tx_ring) {
> +		netif_stop_subqueue(ndev, 0);
> +		ret = NETDEV_TX_BUSY;
> +		goto out;
> +	}
> +
> +	if (skb_put_padto(skb, ETH_ZLEN))
> +		goto out;
> +
> +	dma_addr = dma_map_single(ndev->dev.parent, skb->data, skb->len,
> +				  DMA_TO_DEVICE);
> +	if (dma_mapping_error(ndev->dev.parent, dma_addr)) {
> +		dev_kfree_skb_any(skb);
> +		goto out;
> +	}
> +
> +	entry = priv->cur_tx % priv->num_tx_ring;
> +	priv->tx_skb[entry] = skb;
> +	desc = &priv->tx_ring[entry];
> +	desc->dptr = cpu_to_le32(dma_addr);
> +	desc->info_ds = cpu_to_le16(skb->len);

Should we check against the maximum TX frame size supported by the
hardware here?

Whatever we do here, we should also do in the ravb driver as that makes
a similar cpu_to_le16() call to fill the DS field with no check that the
HW actually supports transmitting a frame of the given size.

> +	desc->info1 = cpu_to_le64(skb->len);
> +
> +	if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {
> +		skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
> +		priv->ts_tag++;
> +		desc->info_ds |= cpu_to_le16(TXC);
> +		desc->info = priv->ts_tag;
> +	}
> +
> +	skb_tx_timestamp(skb);
> +	dma_wmb();
> +
> +	desc->die_dt = DT_FSINGLE | D_DIE;
> +	priv->cur_tx++;
> +
> +	/* Start xmit */
> +	rtsn_write(priv, TRCR0, BIT(TX_CHAIN_IDX));
> +out:
> +	spin_unlock_irqrestore(&priv->lock, flags);
> +	return ret;
> +}

Thanks,

-- 
Paul Barker

Attachment: OpenPGP_0x27F4B3459F002257.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Samsung SOC]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux