Re: [PATCHv2 net-next] ip_gre: remove CRC flag from dev features in gre_gso_segment

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 13, 2021 at 10:11 AM Alexander Duyck
<alexander.duyck@xxxxxxxxx> wrote:
>
> On Mon, Jan 11, 2021 at 9:14 PM Xin Long <lucien.xin@xxxxxxxxx> wrote:
> >
> > On Tue, Jan 12, 2021 at 12:48 AM Alexander Duyck
> > <alexander.duyck@xxxxxxxxx> wrote:
> > >
> > > On Mon, Jan 11, 2021 at 5:22 AM Xin Long <lucien.xin@xxxxxxxxx> wrote:
> > > >
> > > > This patch is to let it always do CRC checksum in sctp_gso_segment()
> > > > by removing CRC flag from the dev features in gre_gso_segment() for
> > > > SCTP over GRE, just as it does in Commit 527beb8ef9c0 ("udp: support
> > > > sctp over udp in skb_udp_tunnel_segment") for SCTP over UDP.
> > > >
> > > > It could set csum/csum_start in GSO CB properly in sctp_gso_segment()
> > > > after that commit, so it would do checksum with gso_make_checksum()
> > > > in gre_gso_segment(), and Commit 622e32b7d4a6 ("net: gre: recompute
> > > > gre csum for sctp over gre tunnels") can be reverted now.
> > > >
> > > > Note that the current HWs like igb NIC can only handle the SCTP CRC
> > > > when it's in the outer packet, not in the inner packet like in this
> > > > case, so here it removes CRC flag from the dev features even when
> > > > need_csum is false.
> > >
> > > So the limitation in igb is not the hardware but the driver
> > > configuration. When I had coded things up I put in a limitation on the
> > > igb_tx_csum code that it would have to validate that the protocol we
> > > are requesting an SCTP CRC offload since it is a different calculation
> > > than a 1's complement checksum. Since igb doesn't support tunnels we
> > > limited that check to the outer headers.
> > Ah.. I see, thanks.
> > >
> > > We could probably enable this for tunnels as long as the tunnel isn't
> > > requesting an outer checksum offload from the driver.
> > I think in igb_tx_csum(), by checking skb->csum_not_inet would be enough
> > to validate that is a SCTP request:
> > -               if (((first->protocol == htons(ETH_P_IP)) &&
> > -                    (ip_hdr(skb)->protocol == IPPROTO_SCTP)) ||
> > -                   ((first->protocol == htons(ETH_P_IPV6)) &&
> > -                    igb_ipv6_csum_is_sctp(skb))) {
> > +               if (skb->csum_not_inet) {
> >                         type_tucmd = E1000_ADVTXD_TUCMD_L4T_SCTP;
> >                         break;
> >                 }
> >
>
> So if I may ask. Why go with something like csum_not_inet instead of
> specifying something like crc32_csum? I'm just wondering if there are
> any other non-1's complement checksums that we are dealing with?
I don't think there is, here is the thread of that patch:

https://lore.kernel.org/netdev/CALx6S36rem=OuN_At6qYA=se5cpuYM1N2R8efoaszvo8b8Tz5A@xxxxxxxxxxxxxx/

I'm writing GRE checksum, and trying to change csum_not_inet:1 to
csum_type:2, by doing the below, no bit hole occurs:

-       __u8                    csum_not_inet:1;
-       __u8                    dst_pending_confirm:1;
+       __u8                    csum_type:2;
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
        __u8                    ndisc_nodetype:2;
 #endif
+       __u8                    dst_pending_confirm:1;

and in skb_csum_hwoffload_help():
 int skb_csum_hwoffload_help(struct sk_buff *skb,
                            const netdev_features_t features)
 {
-       if (unlikely(skb->csum_not_inet))
-               return !!(features & NETIF_F_SCTP_CRC) ? 0 :
-                       skb_crc32c_csum_help(skb);
+       if (likely(!skb->csum_type))
+               return !!(features & NETIF_F_CSUM_MASK) ? 0 :
skb_checksum_help(skb);
+
+       if (skb->csum_type == CSUM_T_GENERIC) {
+               return !!(features & NETIF_F_HW_CSUM) ? 0 :
skb_checksum_help(skb);
+       } else if (skb->csum_type == CSUM_T_SCTP_CRC) {
+               return !!(features & NETIF_F_SCTP_CRC) ? 0 :
skb_crc32c_csum_help(skb);
+       } else {
+               pr_warn("Wrong csum type: %d\n", skb->csum_type);
+               return 1;
+       }

then the driver fix will be:
        case offsetof(struct sctphdr, checksum):
                /* validate that this is actually an SCTP request */
-               if (((first->protocol == htons(ETH_P_IP)) &&
-                    (ip_hdr(skb)->protocol == IPPROTO_SCTP)) ||
-                   ((first->protocol == htons(ETH_P_IPV6)) &&
-                    igb_ipv6_csum_is_sctp(skb))) {
+               if (skb->csum_type == CSUM_T_SCTP_CRC) {
                        type_tucmd = E1000_ADVTXD_TUCMD_L4T_SCTP;
                        break;
                }

then the gre csum set will be:
+                               skb->csum_type = CSUM_T_GENERIC;
+                               skb->ip_summed = CHECKSUM_PARTIAL;
+                               skb->csum_start =
skb_transport_header(skb) - skb->head;
+                               skb->csum_offset = sizeof(*greh);

>
> One thing we might want to do to make eventual backporting for this
> easier would be to add an accessor inline function. Maybe something
> like a skb_csum_is_crc32() so that for older kernels the function
> could just be defined to return false since the csum_not_inet may not
> exist.
>
> > Otherwise, we will need to parse the packet a little bit, as it does in
> > hns3_get_l4_protocol().
>
> Agreed. If the csum_not_inet means it is a crc32 checksum then we
> could just look at the offsets and as long as they are correct for
> sctp we could just go forward with what we have.
>
> > >
> > > > v1->v2:
> > > >   - improve the changelog.
> > > >   - fix "rev xmas tree" in varibles declaration.
> > > >
> > > > Signed-off-by: Xin Long <lucien.xin@xxxxxxxxx>
> > > > ---
> > > >  net/ipv4/gre_offload.c | 15 ++++-----------
> > > >  1 file changed, 4 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
> > > > index e0a2465..a681306 100644
> > > > --- a/net/ipv4/gre_offload.c
> > > > +++ b/net/ipv4/gre_offload.c
> > > > @@ -15,10 +15,10 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
> > > >                                        netdev_features_t features)
> > > >  {
> > > >         int tnl_hlen = skb_inner_mac_header(skb) - skb_transport_header(skb);
> > > > -       bool need_csum, need_recompute_csum, gso_partial;
> > > >         struct sk_buff *segs = ERR_PTR(-EINVAL);
> > > >         u16 mac_offset = skb->mac_header;
> > > >         __be16 protocol = skb->protocol;
> > > > +       bool need_csum, gso_partial;
> > > >         u16 mac_len = skb->mac_len;
> > > >         int gre_offset, outer_hlen;
> > > >
> > > > @@ -41,10 +41,11 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
> > > >         skb->protocol = skb->inner_protocol;
> > > >
> > > >         need_csum = !!(skb_shinfo(skb)->gso_type & SKB_GSO_GRE_CSUM);
> > > > -       need_recompute_csum = skb->csum_not_inet;
> > > >         skb->encap_hdr_csum = need_csum;
> > > >
> > > >         features &= skb->dev->hw_enc_features;
> > > > +       /* CRC checksum can't be handled by HW when SCTP is the inner proto. */
> > > > +       features &= ~NETIF_F_SCTP_CRC;
> > > >
> > > >         /* segment inner packet. */
> > > >         segs = skb_mac_gso_segment(skb, features);
> > >
> > > Do we have NICs that are advertising NETIF_S_SCTP_CRC as part of their
> > > hw_enc_features and then not supporting it? Based on your comment
> > Yes, igb/igbvf/igc/ixgbe/ixgbevf, they have a similar code of SCTP
> > proto validation.
>
> Yeah, it is old code. It was added in 4.6 before tunnels supported
> SCTP_CRC I am guessing. It looks like csum_not_inet wasn't added until
> 4.13. So it would probably be best to fix the drivers since the driver
> code is outdated.
>
> > > above it seems like you are masking this out because hardware is
> > > advertising features it doesn't actually support. I'm just wondering
> > > if that is the case or if this is something where this should be
> > > cleared if need_csum is set since we only support one level of
> > > checksum offload.
> > Since only these drivers only do SCTP proto validation, and "only
> > one level checksum offload" issue only exists when inner packet
> > is SCTP packet, clearing NETIF_F_SCTP_CRC should be enough.
> >
> > But seems to fix the drivers will be better, as hw_enc_features should
> > tell the correct features for inner proto. wdyt?
>
> Yes, it would be better to fix the drivers. However the one limitation
> is that this will only work when we don't have an outer checksum in
> place. If we have an outer checksum then we have to compute the crc32
> checksum and then offload the outer checksum if we can.
>
> > (Just note udp tunneling SCTP doesn't have this issue, as the outer
> >  udp checksum is always required by RFC)
But sctp over Vxlan/Geneve may still use noudpcsum, so need_csum
may still be false in there.

vxlan and geneve is not supporting fraglist, which sctp hw gso requires.
I will add NETIF_F_FRAGLIST flag for udp tunnel device in another patch.

Thanks.

>
> Thanks. I wasn't aware of that.
>
> > >
> > > > @@ -99,15 +100,7 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
> > > >                 }
> > > >
> > > >                 *(pcsum + 1) = 0;
> > > > -               if (need_recompute_csum && !skb_is_gso(skb)) {
> > > > -                       __wsum csum;
> > > > -
> > > > -                       csum = skb_checksum(skb, gre_offset,
> > > > -                                           skb->len - gre_offset, 0);
> > > > -                       *pcsum = csum_fold(csum);
> > > > -               } else {
> > > > -                       *pcsum = gso_make_checksum(skb, 0);
> > > > -               }
> > > > +               *pcsum = gso_make_checksum(skb, 0);
> > > >         } while ((skb = skb->next));
> > > >  out:
> > > >         return segs;
> > > > --
> > > > 2.1.0
> > > >



[Index of Archives]     [Linux Networking Development]     [Linux OMAP]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux