On Mon, Nov 23, 2020 at 1:14 AM Xin Long <lucien.xin@xxxxxxxxx> wrote: > > On Sat, Nov 21, 2020 at 12:10 AM Alexander Duyck > <alexander.duyck@xxxxxxxxx> wrote: > > > > On Fri, Nov 20, 2020 at 2:23 AM Xin Long <lucien.xin@xxxxxxxxx> wrote: > > > > > > On Fri, Nov 20, 2020 at 1:24 AM Alexander Duyck > > > <alexander.duyck@xxxxxxxxx> wrote: > > > > > > > > On Wed, Nov 18, 2020 at 9:53 PM Xin Long <lucien.xin@xxxxxxxxx> wrote: > > > > > > > > > > On Thu, Nov 19, 2020 at 4:35 AM Alexander Duyck > > > > > <alexander.duyck@xxxxxxxxx> wrote: > > > > > > > > > > > > On Mon, Nov 16, 2020 at 1:17 AM Xin Long <lucien.xin@xxxxxxxxx> wrote: > > > > > > > <snip> > > > > > @@ -27,7 +27,11 @@ static __le32 sctp_gso_make_checksum(struct sk_buff *skb) > > > > > { > > > > > skb->ip_summed = CHECKSUM_NONE; > > > > > skb->csum_not_inet = 0; > > > > > - gso_reset_checksum(skb, ~0); > > > > > + /* csum and csum_start in GSO CB may be needed to do the UDP > > > > > + * checksum when it's a UDP tunneling packet. > > > > > + */ > > > > > + SKB_GSO_CB(skb)->csum = (__force __wsum)~0; > > > > > + SKB_GSO_CB(skb)->csum_start = skb_headroom(skb) + skb->len; > > > > > return sctp_compute_cksum(skb, skb_transport_offset(skb)); > > > > > } > > > > > > > > > > And yes, this patch has been tested with GRE tunnel checksums enabled. > > > > > (... icsum ocsum). > > > > > And yes, it was segmented in the lower NIC level, and we can make it by: > > > > > > > > > > # ethtool -K gre1 tx-sctp-segmentation on > > > > > # ethtool -K veth0 tx-sctp-segmentation off > > > > > > > > > > (Note: "tx-checksum-sctp" and "gso" are on for both devices) > > > > > > > > > > Thanks. > > > > > > > > I would also turn off Tx and Rx checksum offload on your veth device > > > > in order to make certain you aren't falsely sending data across > > > > indicating that the checksums are valid when they are not. It might be > > > > better if you were to run this over an actual NIC as that could then > > > > provide independent verification as it would be a fixed checksum test. > > > > > > > > I'm still not convinced this is working correctly. Basically a crc32c > > > > is not the same thing as a 1's complement checksum so you should need > > > > to compute both if you have an SCTP packet tunneled inside a UDP or > > > > GRE packet with a checksum. I don't see how computing a crc32c should > > > > automatically give you a 1's complement checksum of ~0. > > > > > > On the tx Path [1] below, the sctp crc checksum is calculated in > > > sctp_gso_make_checksum() [a], where it calls *sctp_compute_cksum()* > > > to do that, and as for the code in it: > > > > > > SKB_GSO_CB(skb)->csum = (__force __wsum)~0; > > > SKB_GSO_CB(skb)->csum_start = skb_headroom(skb) + skb->len; > > > > Okay, so I think I know how this is working, but the number of things > > relied on is ugly. Normally assuming skb_headroom(skb) + skb->len > > being valid for this would be a non-starter. I was assuming you > > weren't doing the 1's compliment checksum because you weren't using > > __skb_checksum to generate it. > > > > If I am not mistaken SCTP GSO uses the GSO_BY_FRAGS and apparently > > none of the frags are using page fragments within the skb. Am I > > understanding correctly? One thing that would help to address some of > > my concerns would be to clear instead of set NETIF_F_SG in > > sctp_gso_segment since your checksum depends on linear skbs. > Right, no frag is using page fragments for SCTP GSO. > NETIF_F_SG is set here, because in skb_segment(): > > if (hsize > len || !sg) > hsize = len; > > if (!hsize && i >= nfrags && skb_headlen(list_skb) && > (skb_headlen(list_skb) == len || sg)) { <------ > for flag_list > > without sg set, it won't go to this 'if' block, which is the process > of frag_list, see I don't think that is processing frag_list, it is processing frags. It is just updating list_skb as needed, however it is also configured outside of that block. > commit 89319d3801d1d3ac29c7df1f067038986f267d29 > Author: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> > Date: Mon Dec 15 23:26:06 2008 -0800 > > net: Add frag_list support to skb_segment > > do you think this might be a bug in skb_segment()? I would say it is assuming your logic is correct. Basically it should be able to segment the frame regardless of if the lower device supports NETIF_F_SG or not. > I was also thinking if SCTP GSO could go with the way of UDP's > with skb_segment_list() instead of GSO_BY_FRAGS things. > the different is that the head skb does not only include header, > but may also include userdata/payload with skb_segment_list(). The problem is right now the way the checksum is being configured you would have to keep the payload and data all in one logical data segment starting at skb->data. We cannot have any data stored in shinfo->frags, nor shinfo->frag_list. > > > > > is prepared for doing 1's complement checksum (for outer UDP/GRE), and used > > > in gre_gso_segment() [b], where it calls gso_make_checksum() to do that > > > when need_csum is set. Note that, here it played a TRICK: > > > > > > I set SKB_GSO_CB->csum_start to the end of this packet and > > > SKB_GSO_CB->csum = ~0 manually, so that in gso_make_checksum() it will do > > > the 1's complement checksum for outer UDP/GRE by summing all packet bits up. > > > See gso_make_checksum() (called by gre_gso_segment()): > > > > > > unsigned char *csum_start = skb_transport_header(skb); > > > int plen = (skb->head + SKB_GSO_CB(skb)->csum_start) - csum_start; > > > /* now plen is from udp header to the END of packet. */ > > > __wsum partial = SKB_GSO_CB(skb)->csum; > > > > > > return csum_fold(csum_partial(csum_start, plen, partial)); > > > > > > So yes, here it does compute both if I have an SCTP packet tunnelled inside > > > a UDP or GRE packet with a checksum. > > > > Assuming you have the payload data in the skb->data section. Normally > > payload is in page frags. That is why I was concerned. You have to > > have guarantees in place that there will not be page fragments > > attached to the skb. > On SCTP TX path, sctp_packet_transmit() will always alloc linear skbs > and reserve headers for lower-layer protocols. I think this will guarantee it. That ends up being my big concern. We need to make certain that is true for all GRO and GSO cases if we are going to operate on the assumption that just doing a linear checksum will work in the GSO code. Otherwise we need to make certain that segmentation will correct this for us if it cannot be guaranteed. That is why I would be much more comfortable if we were able to just drop the NETIF_F_SG bit when doing the segmentation since that would guarantee the results we are looking for.