On Tue, Nov 24, 2020 at 6:23 AM Alexander Duyck <alexander.duyck@xxxxxxxxx> wrote: > > On Mon, Nov 23, 2020 at 1:14 AM Xin Long <lucien.xin@xxxxxxxxx> wrote: > > > > On Sat, Nov 21, 2020 at 12:10 AM Alexander Duyck > > <alexander.duyck@xxxxxxxxx> wrote: > > > > > > On Fri, Nov 20, 2020 at 2:23 AM Xin Long <lucien.xin@xxxxxxxxx> wrote: > > > > > > > > On Fri, Nov 20, 2020 at 1:24 AM Alexander Duyck > > > > <alexander.duyck@xxxxxxxxx> wrote: > > > > > > > > > > On Wed, Nov 18, 2020 at 9:53 PM Xin Long <lucien.xin@xxxxxxxxx> wrote: > > > > > > > > > > > > On Thu, Nov 19, 2020 at 4:35 AM Alexander Duyck > > > > > > <alexander.duyck@xxxxxxxxx> wrote: > > > > > > > > > > > > > > On Mon, Nov 16, 2020 at 1:17 AM Xin Long <lucien.xin@xxxxxxxxx> wrote: > > > > > > > > > > <snip> > > > > > > > @@ -27,7 +27,11 @@ static __le32 sctp_gso_make_checksum(struct sk_buff *skb) > > > > > > { > > > > > > skb->ip_summed = CHECKSUM_NONE; > > > > > > skb->csum_not_inet = 0; > > > > > > - gso_reset_checksum(skb, ~0); > > > > > > + /* csum and csum_start in GSO CB may be needed to do the UDP > > > > > > + * checksum when it's a UDP tunneling packet. > > > > > > + */ > > > > > > + SKB_GSO_CB(skb)->csum = (__force __wsum)~0; > > > > > > + SKB_GSO_CB(skb)->csum_start = skb_headroom(skb) + skb->len; > > > > > > return sctp_compute_cksum(skb, skb_transport_offset(skb)); > > > > > > } > > > > > > > > > > > > And yes, this patch has been tested with GRE tunnel checksums enabled. > > > > > > (... icsum ocsum). > > > > > > And yes, it was segmented in the lower NIC level, and we can make it by: > > > > > > > > > > > > # ethtool -K gre1 tx-sctp-segmentation on > > > > > > # ethtool -K veth0 tx-sctp-segmentation off > > > > > > > > > > > > (Note: "tx-checksum-sctp" and "gso" are on for both devices) > > > > > > > > > > > > Thanks. > > > > > > > > > > I would also turn off Tx and Rx checksum offload on your veth device > > > > > in order to make certain you aren't falsely sending data across > > > > > indicating that the checksums are valid when they are not. It might be > > > > > better if you were to run this over an actual NIC as that could then > > > > > provide independent verification as it would be a fixed checksum test. > > > > > > > > > > I'm still not convinced this is working correctly. Basically a crc32c > > > > > is not the same thing as a 1's complement checksum so you should need > > > > > to compute both if you have an SCTP packet tunneled inside a UDP or > > > > > GRE packet with a checksum. I don't see how computing a crc32c should > > > > > automatically give you a 1's complement checksum of ~0. > > > > > > > > On the tx Path [1] below, the sctp crc checksum is calculated in > > > > sctp_gso_make_checksum() [a], where it calls *sctp_compute_cksum()* > > > > to do that, and as for the code in it: > > > > > > > > SKB_GSO_CB(skb)->csum = (__force __wsum)~0; > > > > SKB_GSO_CB(skb)->csum_start = skb_headroom(skb) + skb->len; > > > > > > Okay, so I think I know how this is working, but the number of things > > > relied on is ugly. Normally assuming skb_headroom(skb) + skb->len > > > being valid for this would be a non-starter. I was assuming you > > > weren't doing the 1's compliment checksum because you weren't using > > > __skb_checksum to generate it. > > > > > > If I am not mistaken SCTP GSO uses the GSO_BY_FRAGS and apparently > > > none of the frags are using page fragments within the skb. Am I > > > understanding correctly? One thing that would help to address some of > > > my concerns would be to clear instead of set NETIF_F_SG in > > > sctp_gso_segment since your checksum depends on linear skbs. > > Right, no frag is using page fragments for SCTP GSO. > > NETIF_F_SG is set here, because in skb_segment(): > > > > if (hsize > len || !sg) > > hsize = len; > > > > if (!hsize && i >= nfrags && skb_headlen(list_skb) && > > (skb_headlen(list_skb) == len || sg)) { <------ > > for flag_list > > > > without sg set, it won't go to this 'if' block, which is the process > > of frag_list, see > > I don't think that is processing frag_list, it is processing frags. It > is just updating list_skb as needed, however it is also configured > outside of that block. For SCTP's gso, we expect it going to the branch of matching (skb_headlen(list_skb) == len), as it will reuse the skb->data. > > > commit 89319d3801d1d3ac29c7df1f067038986f267d29 > > Author: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> > > Date: Mon Dec 15 23:26:06 2008 -0800 > > > > net: Add frag_list support to skb_segment > > > > do you think this might be a bug in skb_segment()? > > I would say it is assuming your logic is correct. Basically it should > be able to segment the frame regardless of if the lower device > supports NETIF_F_SG or not. > > > I was also thinking if SCTP GSO could go with the way of UDP's > > with skb_segment_list() instead of GSO_BY_FRAGS things. > > the different is that the head skb does not only include header, > > but may also include userdata/payload with skb_segment_list(). > > The problem is right now the way the checksum is being configured you > would have to keep the payload and data all in one logical data > segment starting at skb->data. We cannot have any data stored in > shinfo->frags, nor shinfo->frag_list. That's right, current SCTP protocol stack don't save tx data into frags or frag_list, and SCTP doesn't support GRO by now. > > > > > > > > is prepared for doing 1's complement checksum (for outer UDP/GRE), and used > > > > in gre_gso_segment() [b], where it calls gso_make_checksum() to do that > > > > when need_csum is set. Note that, here it played a TRICK: > > > > > > > > I set SKB_GSO_CB->csum_start to the end of this packet and > > > > SKB_GSO_CB->csum = ~0 manually, so that in gso_make_checksum() it will do > > > > the 1's complement checksum for outer UDP/GRE by summing all packet bits up. > > > > See gso_make_checksum() (called by gre_gso_segment()): > > > > > > > > unsigned char *csum_start = skb_transport_header(skb); > > > > int plen = (skb->head + SKB_GSO_CB(skb)->csum_start) - csum_start; > > > > /* now plen is from udp header to the END of packet. */ > > > > __wsum partial = SKB_GSO_CB(skb)->csum; > > > > > > > > return csum_fold(csum_partial(csum_start, plen, partial)); > > > > > > > > So yes, here it does compute both if I have an SCTP packet tunnelled inside > > > > a UDP or GRE packet with a checksum. > > > > > > Assuming you have the payload data in the skb->data section. Normally > > > payload is in page frags. That is why I was concerned. You have to > > > have guarantees in place that there will not be page fragments > > > attached to the skb. > > On SCTP TX path, sctp_packet_transmit() will always alloc linear skbs > > and reserve headers for lower-layer protocols. I think this will guarantee it. > > That ends up being my big concern. We need to make certain that is > true for all GRO and GSO cases if we are going to operate on the > assumption that just doing a linear checksum will work in the GSO > code. Otherwise we need to make certain that segmentation will correct > this for us if it cannot be guaranteed. That is why I would be much > more comfortable if we were able to just drop the NETIF_F_SG bit when > doing the segmentation since that would guarantee the results we are > looking for. before doing that, we should have a fix below: @@ -3850,8 +3850,6 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, hsize = skb_headlen(head_skb) - offset; if (hsize < 0) hsize = 0; - if (hsize > len || !sg) - hsize = len; if (!hsize && i >= nfrags && skb_headlen(list_skb) && (skb_headlen(list_skb) == len || sg)) { @@ -3896,6 +3894,9 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, skb_release_head_state(nskb); __skb_push(nskb, doffset); } else { + if (hsize > len || !sg) + hsize = len; + I believe it makes more sense to move this check to the 'else' branch.