Hi Sergei, On Sun, Jan 27, 2019 at 08:37:33PM +0300, Sergei Shtylyov wrote: > Add support for the RX checksum offload. This is enabled by default and > may be disabled and re-enabled using 'ethtool': > > # ethtool -K eth0 rx {on|off} > > Some Ether MACs provide a simple checksumming scheme which appears to be > completely compatible with CHECKSUM_COMPLETE: sum of all packet data after > the L2 header is appended to packet data; this may be trivially read by > the driver and used to update the skb accordingly. The same checksumming > scheme is implemented in the EtherAVB MACs and now supported by tha 'ravb' > driver. > > In terms of performance, throughput is close to gigabit line rate with the > RX checksum offload both enabled and disabled. The 'perf' output, however, > appears to indicate that significantly less time is spent in do_csum() -- > this is as expected. Nice. FYI, this seems similar to what I observed for RAVB, perhaps on H3 I don't exactly recall. On E3, which has less CPU power, I recently observed that with rx-csum enabled I can achieve gigabit line rate, but with rx-csum disabled throughput is significantly lower. I.e. on that system throughput is CPU bound with 1500 byte packets unless rx-csum enabled. Next point: 2da64300fbc ("ravb: expand rx descriptor data to accommodate hw checksum") is fresh in my mind and I wonder if mdp->rx_buf_sz needs to grow to ensure that there is always enough space for the csum. In particular, have you tested this with MTU-size frames with VLANs. (My test is to run iperf3 over a VLAN netdev, netperf over a VLAN netdev would likely work just as well.) > > Test results with RX checksum offload enabled: > > ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4 > TCP MAERTS TEST to 192.168.2.4 > Recv Send Send > Socket Socket Message Elapsed > Size Size Size Time Throughput > bytes bytes bytes secs. 10^6bits/sec > > 131072 16384 16384 10.01 933.93 > [ perf record: Woken up 8 times to write data ] > [ perf record: Captured and wrote 1.955 MB perf.data (41940 samples) ] > ~/netperf-2.2pl4# perf report > Samples: 41K of event 'cycles:ppp', Event count (approx.): 9915302763 > Overhead Command Shared Object Symbol > 9.44% netperf [kernel.kallsyms] [k] __arch_copy_to_user > 7.75% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq > 6.31% swapper [kernel.kallsyms] [k] default_idle_call > 5.89% swapper [kernel.kallsyms] [k] arch_cpu_idle > 4.37% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit > 4.02% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq > 2.52% netperf [kernel.kallsyms] [k] preempt_count_sub > 1.81% netperf [kernel.kallsyms] [k] tcp_recvmsg > 1.80% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irqres > 1.78% netperf [kernel.kallsyms] [k] preempt_count_add > 1.36% netperf [kernel.kallsyms] [k] __tcp_transmit_skb > 1.20% netperf [kernel.kallsyms] [k] __local_bh_enable_ip > 1.10% netperf [kernel.kallsyms] [k] sh_eth_start_xmit > > Test results with RX checksum offload disabled: > > ~/netperf-2.2pl4# perf record -a ./netperf -t TCP_MAERTS -H 192.168.2.4 > TCP MAERTS TEST to 192.168.2.4 > Recv Send Send > Socket Socket Message Elapsed > Size Size Size Time Throughput > bytes bytes bytes secs. 10^6bits/sec > 131072 16384 16384 10.01 932.04 > [ perf record: Woken up 14 times to write data ] > [ perf record: Captured and wrote 3.642 MB perf.data (78817 samples) ] > ~/netperf-2.2pl4# perf report > Samples: 78K of event 'cycles:ppp', Event count (approx.): 18091442796 > Overhead Command Shared Object Symbol > 7.00% swapper [kernel.kallsyms] [k] do_csum > 3.94% swapper [kernel.kallsyms] [k] sh_eth_poll > 3.83% ksoftirqd/0 [kernel.kallsyms] [k] do_csum > 3.23% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irq > 2.87% netperf [kernel.kallsyms] [k] __arch_copy_to_user > 2.86% swapper [kernel.kallsyms] [k] arch_cpu_idle > 2.13% swapper [kernel.kallsyms] [k] default_idle_call > 2.12% ksoftirqd/0 [kernel.kallsyms] [k] sh_eth_poll > 2.02% swapper [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore > 1.84% swapper [kernel.kallsyms] [k] __softirqentry_text_start > 1.64% swapper [kernel.kallsyms] [k] tick_nohz_idle_exit > 1.53% netperf [kernel.kallsyms] [k] _raw_spin_unlock_irq > 1.32% netperf [kernel.kallsyms] [k] preempt_count_sub > 1.27% swapper [kernel.kallsyms] [k] __pi___inval_dcache_area > 1.22% swapper [kernel.kallsyms] [k] check_preemption_disabled > 1.01% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore > > The above results collected on the R-Car V3H Starter Kit board. > > Based on the commit 4d86d3818627 ("ravb: RX checksum offload")... > > Signed-off-by: Sergei Shtylyov <sergei.shtylyov@xxxxxxxxxxxxxxxxxx> > > --- > drivers/net/ethernet/renesas/sh_eth.c | 60 ++++++++++++++++++++++++++++++++-- > drivers/net/ethernet/renesas/sh_eth.h | 1 > 2 files changed, 59 insertions(+), 2 deletions(-) > > Index: net-next/drivers/net/ethernet/renesas/sh_eth.c > =================================================================== > --- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c > +++ net-next/drivers/net/ethernet/renesas/sh_eth.c > @@ -1532,8 +1532,9 @@ static int sh_eth_dev_init(struct net_de > mdp->irq_enabled = true; > sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR); > > - /* PAUSE Prohibition */ > + /* EMAC Mode: PAUSE prohibition; Duplex; RX Checksum; TX; RX */ > sh_eth_write(ndev, ECMR_ZPF | (mdp->duplex ? ECMR_DM : 0) | > + (ndev->features & NETIF_F_RXCSUM ? ECMR_RCSC : 0) | > ECMR_TE | ECMR_RE, ECMR); > > if (mdp->cd->set_rate) > @@ -1592,6 +1593,19 @@ static void sh_eth_dev_exit(struct net_d > update_mac_address(ndev); > } > > +static void sh_eth_rx_csum(struct sk_buff *skb) > +{ > + u8 *hw_csum; > + > + /* The hardware checksum is 2 bytes appended to packet data */ > + if (unlikely(skb->len < sizeof(__sum16))) > + return; > + hw_csum = skb_tail_pointer(skb) - sizeof(__sum16); > + skb->csum = csum_unfold((__force __sum16)get_unaligned_le16(hw_csum)); > + skb->ip_summed = CHECKSUM_COMPLETE; > + skb_trim(skb, skb->len - sizeof(__sum16)); > +} > + > /* Packet receive function */ > static int sh_eth_rx(struct net_device *ndev, u32 intr_status, int *quota) > { > @@ -1666,6 +1680,8 @@ static int sh_eth_rx(struct net_device * > DMA_FROM_DEVICE); > skb_put(skb, pkt_len); > skb->protocol = eth_type_trans(skb, ndev); > + if (ndev->features & NETIF_F_RXCSUM) > + sh_eth_rx_csum(skb); > netif_receive_skb(skb); > ndev->stats.rx_packets++; > ndev->stats.rx_bytes += pkt_len; > @@ -2921,6 +2937,39 @@ static void sh_eth_set_rx_mode(struct ne > spin_unlock_irqrestore(&mdp->lock, flags); > } > > +static void sh_eth_set_rx_csum(struct net_device *ndev, bool enable) > +{ > + struct sh_eth_private *mdp = netdev_priv(ndev); > + unsigned long flags; > + > + spin_lock_irqsave(&mdp->lock, flags); > + > + /* Disable TX and RX */ > + sh_eth_rcv_snd_disable(ndev); > + > + /* Modify RX Checksum setting */ > + sh_eth_modify(ndev, ECMR, ECMR_RCSC, enable ? ECMR_RCSC : 0); > + > + /* Enable TX and RX */ > + sh_eth_rcv_snd_enable(ndev); > + > + spin_unlock_irqrestore(&mdp->lock, flags); > +} > + > +static int sh_eth_set_features(struct net_device *ndev, > + netdev_features_t features) > +{ > + netdev_features_t changed = ndev->features ^ features; > + struct sh_eth_private *mdp = netdev_priv(ndev); > + > + if (changed & NETIF_F_RXCSUM && mdp->cd->rx_csum) > + sh_eth_set_rx_csum(ndev, features & NETIF_F_RXCSUM); > + > + ndev->features = features; > + > + return 0; > +} > + > static int sh_eth_get_vtag_index(struct sh_eth_private *mdp) > { > if (!mdp->port) > @@ -3102,6 +3151,7 @@ static const struct net_device_ops sh_et > .ndo_change_mtu = sh_eth_change_mtu, > .ndo_validate_addr = eth_validate_addr, > .ndo_set_mac_address = eth_mac_addr, > + .ndo_set_features = sh_eth_set_features, > }; > > static const struct net_device_ops sh_eth_netdev_ops_tsu = { > @@ -3117,6 +3167,7 @@ static const struct net_device_ops sh_et > .ndo_change_mtu = sh_eth_change_mtu, > .ndo_validate_addr = eth_validate_addr, > .ndo_set_mac_address = eth_mac_addr, > + .ndo_set_features = sh_eth_set_features, > }; > > #ifdef CONFIG_OF > @@ -3245,6 +3296,11 @@ static int sh_eth_drv_probe(struct platf > ndev->max_mtu = 2000 - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN); > ndev->min_mtu = ETH_MIN_MTU; > > + if (mdp->cd->rx_csum) { > + ndev->features = NETIF_F_RXCSUM; > + ndev->hw_features = NETIF_F_RXCSUM; > + } > + > /* set function */ > if (mdp->cd->tsu) > ndev->netdev_ops = &sh_eth_netdev_ops_tsu; > @@ -3294,7 +3350,7 @@ static int sh_eth_drv_probe(struct platf > goto out_release; > } > mdp->port = port; > - ndev->features = NETIF_F_HW_VLAN_CTAG_FILTER; > + ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER; > > /* Need to init only the first port of the two sharing a TSU */ > if (port == 0) { > Index: net-next/drivers/net/ethernet/renesas/sh_eth.h > =================================================================== > --- net-next.orig/drivers/net/ethernet/renesas/sh_eth.h > +++ net-next/drivers/net/ethernet/renesas/sh_eth.h > @@ -500,6 +500,7 @@ struct sh_eth_cpu_data { > unsigned no_xdfar:1; /* E-DMAC DOES NOT have RDFAR/TDFAR */ > unsigned xdfar_rw:1; /* E-DMAC has writeable RDFAR/TDFAR */ > unsigned csmr:1; /* E-DMAC has CSMR */ > + unsigned rx_csum:1; /* EtherC has ECMR.RCSC */ > unsigned select_mii:1; /* EtherC has RMII_MII (MII select register) */ > unsigned rmiimode:1; /* EtherC has RMIIMODE register */ > unsigned rtrate:1; /* EtherC has RTRATE register */ >