Hey, On Thu, 24 Jul 2014 09:22:19 +0300 Julian Anastasov <ja@xxxxxx> wrote: > > Hello, > > On Thu, 24 Jul 2014, Julian Anastasov wrote: > > > There is skb_checksum_help() call in > > dev_hard_start_xmit(), it is used when TCP/UDP offload is > > not supported. After checking the code I see only the > > CHECKSUM_PARTIAL + Enabled TCP/UDP CSum as a problem because > > the drivers use the skb->encapsulation flag to know where the > > TCP/UDP header resides. As tunnels prepend tunnel header > > they should set skb->encapsulation=1 together with > > calling skb_reset_inner_headers() before the header is > > inserted. > > > > Following is a patch for the net tree that needs > > testing because I don't have setup to fully test it. > > I hope you can try it on your setup for the 3 tests. > > It should have effect only on your 2nd test. Do you > > have problems with tests 1 and 3? > > I just saw that changing skb->transport_header > before skb_reset_inner_headers() and iptunnel_handle_offloads() > can cause problems. Not sure if skb->transport_header > is used later when skb->encapsulation = 1, the strange thing > is that ip6_tnl_xmit2() changes it before skb_reset_inner_headers(), > this can lead to wrong skb->inner_transport_header. > > [PATCH net] ipvs: properly declare tunnel encapsulation > > The tunneling method should properly use tunnel encapsulation. > Fixes problem with CHECKSUM_PARTIAL packets when TCP/UDP csum > offload is supported and skb->encapsulation is not set to 1. > > Signed-off-by: Julian Anastasov <ja@xxxxxx> > --- > net/netfilter/ipvs/ip_vs_xmit.c | 22 +++++++++++++++++----- > 1 file changed, 17 insertions(+), 5 deletions(-) > > diff --git a/net/netfilter/ipvs/ip_vs_xmit.c > b/net/netfilter/ipvs/ip_vs_xmit.c index 73ba1cc..e093580 100644 > --- a/net/netfilter/ipvs/ip_vs_xmit.c > +++ b/net/netfilter/ipvs/ip_vs_xmit.c > @@ -38,6 +38,7 @@ > #include <net/route.h> /* for ip_route_output */ > #include <net/ipv6.h> > #include <net/ip6_route.h> > +#include <net/ip_tunnels.h> > #include <net/addrconf.h> > #include <linux/icmpv6.h> > #include <linux/netfilter.h> > @@ -483,10 +484,8 @@ static inline int > ip_vs_tunnel_xmit_prepare(struct sk_buff *skb, skb->ipvs_property = 1; > if (unlikely(cp->flags & IP_VS_CONN_F_NFCT)) > ret = ip_vs_confirm_conntrack(skb); > - if (ret == NF_ACCEPT) { > + if (ret == NF_ACCEPT) > nf_reset(skb); > - skb_forward_csum(skb); > - } > return ret; > } > > @@ -862,14 +861,19 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, struct > ip_vs_conn *cp, old_iph = ip_hdr(skb); > } > > - skb->transport_header = skb->network_header; > - > /* fix old IP header checksum */ > ip_send_check(old_iph); > > + skb = iptunnel_handle_offloads(skb, false, SKB_GSO_IPIP); > + if (IS_ERR(skb)) > + goto tx_error_unlock; > + > + skb->transport_header = skb->network_header; > + > skb_push(skb, sizeof(struct iphdr)); > skb_reset_network_header(skb); > memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt)); > + skb_clear_hash(skb); > > /* > * Push down and install the IPIP header. > @@ -901,6 +905,8 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, struct > ip_vs_conn *cp, > tx_error: > kfree_skb(skb); > + > +tx_error_unlock: > rcu_read_unlock(); > LeaveFunction(10); > return NF_STOLEN; > @@ -953,6 +959,12 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct > ip_vs_conn *cp, old_iph = ipv6_hdr(skb); > } > > + if (!skb->encapsulation) { > + skb_reset_inner_headers(skb); > + skb->encapsulation = 1; > + } > + skb_forward_csum(skb); > + > skb->transport_header = skb->network_header; > > skb_push(skb, sizeof(struct ipv6hdr)); So I've found a patch that addresses the problem in what I suspect is the right way. diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c index 6f70bdd..ea7ef5e 100644 --- a/net/netfilter/ipvs/ip_vs_xmit.c +++ b/net/netfilter/ipvs/ip_vs_xmit.c @@ -486,6 +486,7 @@ static inline int ip_vs_tunnel_xmit_prepare(struct sk_buff *skb, if (ret == NF_ACCEPT) { nf_reset(skb); skb_forward_csum(skb); + skb->encapsulation = 1; } return ret; } Empirically, this solves my problem :) I believe it solves the problem more generally as well. Digging deeper into the v6 case (FB <3's v6) ip_vs_tunnel_xmit_v6 -> ip6_local_out -> __ip6_local_out -> dst_output -> dst_output_sk -> skb_dst(skb)->output (= ip6_output) -> ip6_finish_output -> ip6->finish_output2 -> neigh_direct_output -> dev_queue_xmit -> __dev_queue_xmit -> dev_hard_start And then we run into this: /* If encapsulation offload request, verify we are testing * hardware encapsulation features instead of standard * features for the netdev */ if (skb->encapsulation) features &= dev->hw_enc_features; This essentially strips out NETIF_F_ALL_CSUM, so we end up invoking skb_checksum_help below. /* If packet is not checksummed and device does not * support checksumming for this protocol, complete * checksumming here. */ if (skb->ip_summed == CHECKSUM_PARTIAL) { if (skb->encapsulation) skb_set_inner_transport_header(skb, skb_checksum_start_offset(skb)); else skb_set_transport_header(skb, skb_checksum_start_offset(skb)); if (!(features & NETIF_F_ALL_CSUM) && skb_checksum_help(skb)) goto out_kfree_skb; } Does this seem like a reasonable approach to you, Julian? -- To unsubscribe from this list: send the line "unsubscribe lvs-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html