Re: [RFC PATCH net-next 3/3] sctp: Add GSO support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Feb 1, 2016 at 8:22 AM, Marcelo Ricardo Leitner
<marcelo.leitner@xxxxxxxxx> wrote:
> Em 30-01-2016 02:07, Alexander Duyck escreveu:
>>
>> On Fri, Jan 29, 2016 at 11:42 AM, Marcelo Ricardo Leitner
>> <marcelo.leitner@xxxxxxxxx> wrote:
>>>
>>> On Fri, Jan 29, 2016 at 11:15:54AM -0800, Alexander Duyck wrote:
>>>>
>>>> On Wed, Jan 27, 2016 at 9:06 AM, Marcelo Ricardo Leitner
>
> ...
>
>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>> index
>>>>> 8cba3d852f251c503b193823b71b27aaef3fb3ae..9583284086967c0746de5f553535e25e125714a5
>>>>> 100644
>>>>> --- a/net/core/dev.c
>>>>> +++ b/net/core/dev.c
>>>>> @@ -2680,7 +2680,11 @@ EXPORT_SYMBOL(skb_mac_gso_segment);
>>>>>   static inline bool skb_needs_check(struct sk_buff *skb, bool tx_path)
>>>>>   {
>>>>>          if (tx_path)
>>>>> -               return skb->ip_summed != CHECKSUM_PARTIAL;
>>>>> +               /* FIXME: Why only packets with checksum offloading are
>>>>> +                * supported for GSO?
>>>>> +                */
>>>>> +               return skb->ip_summed != CHECKSUM_PARTIAL &&
>>>>> +                      skb->ip_summed != CHECKSUM_UNNECESSARY;
>>>>>          else
>>>>>                  return skb->ip_summed == CHECKSUM_NONE;
>>>>>   }
>>>>
>>>>
>>>> Tom Herbert just got rid of the use of CHECKSUM_UNNECESSARY in the
>>>> transmit path a little while ago.  Please don't reintroduce it.
>>>
>>>
>>> Can you give me some pointers on that? I cannot find such change.
>>> skb_needs_check() seems to be like that since beginning.
>>
>>
>> Maybe you need to update your kernel.  All this stuff was changed in
>> December and has been this way for a little while now.
>>
>> Commits:
>> 7a6ae71b24905 "net: Elaborate on checksum offload interface description"
>> 253aab0597d9e "fcoe: Use CHECKSUM_PARTIAL to indicate CRC offload"
>> 53692b1de419c "sctp: Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC"
>>
>> The main reason I even noticed it is because of some of the work I did
>> on the Intel NIC offloads.
>
>
> Ok I have those here, but my need here is different. I want to do GSO with
> packets that won't have CRC offloaded, so I shouldn't use CHECKSUM_PARTIAL
> but something else.

CHECKSUM_NONE if you don't want to have any of the CRC or checksums
offloaded.  However as I mentioned before you will want to fake it
then since skb_segment assumes it is doing a 1's compliment checksum
so you will want to pass NET_F_HW_CSUM as a feature flag and then set
CHECKSUM_NONE after the frame has been segmented.

> ...
>
>>>>> diff --git a/net/sctp/offload.c b/net/sctp/offload.c
>>>>> index
>>>>> 7080a6318da7110c1688dd0c5bb240356dbd0cd3..3b96035fa180a4e7195f7b6e7a8be7b97c8f8b26
>>>>> 100644
>>>>> --- a/net/sctp/offload.c
>>>>> +++ b/net/sctp/offload.c
>>>>> @@ -36,8 +36,61 @@
>>>>>   #include <net/sctp/checksum.h>
>>>>>   #include <net/protocol.h>
>>>>>
>>>>> +static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
>>>>> +{
>>>>> +       skb->ip_summed = CHECKSUM_NONE;
>>>>> +       return sctp_compute_cksum(skb, skb_transport_offset(skb));
>>>>> +}
>>>>> +
>>>>
>>>>
>>>> I really despise the naming of this bit here.  SCTP does not use a
>>>> checksum.  It uses a CRC.  Please don't call this a checksum as it
>>>> will just make the code really confusing.   I think the name should be
>>>> something like gso_make_crc32c.
>>>
>>>
>>> Agreed. SCTP code still references it as 'cksum'. I'll change that in
>>> another patch.
>>>
>>>> I think we need to address the CRC issues before we can really get
>>>> into segmentation.  Specifically we need to be able to offload SCTP
>>>> and FCoE in software since they both use the CHECKSUM_PARTIAL value
>>>> and then we can start cleaning up more of this mess and move onto
>>>> segmentation.
>>>
>>>
>>> Hm? The mess on CRC issues here is caused by this patch alone. It's good
>>> as it is today. And a good part of this mess is caused by trying to GSO
>>> without offloading CRC too.
>>>
>>> Or you mean that SCTP and FCoE should stop using CHECKSUM_* at all?
>>
>>
>> Well after Tom's change both SCTP and FCoE use CHECKSUM_PARTIAL.
>> CHECKSUM_PARTIAL is what is used to indicate to the hardware that a
>> checksum offload has been requested so that is what is looked for at
>> the driver level.
>
>
> SCTP was actually already using CHECKSUM_PARTIAL. That patch was just a
> rename in an attempt to make this crc difference more evident. Yet I'll
> continue the rename within sctp code.

Yeah it was FCoE that was doing something different.

>> My concern with all this is that we should probably be looking at
>> coming up with a means of offloading this in software when
>> skb_checksum_help is called.  Right now validate_xmit_skb doesn't have
>> any understanding of what to do with SCTP or FCoE and will try to just
>> compute a checksum for them.
>
>
> My worry is placed a bit earlier than that, I think. Currently I just cannot
> do GSO with packets that doesn't have checksum/crc offloaded too because
> validate_xmit_skb() will complain.

That is probably because you are passing CHECKSUM_PARTIAL instead of
CHECKSUM_NONE.

> As NICs hardly have sctp crc offloading capabilities, I'm thinking it makes
> sense to do GSO even without crc offloaded. After all, it doesn't matter
> much in which stage we are computing the crc as we are computing it anyway.

Agreed.  You will need to support CHECKSUM_PARTIAL being passed to a
device that doesn't support SCTP first.  That way you can start
looking at just always setting CHECKSUM_PARTIAL in the transport layer
which is really needed if you want to do SCO (SCTP Segmentation
Offload) in the first place.  Once you have that you could then start
looking at doing the SCO since from that point on you should already
be in good shape to address those type of issues.  You should probably
use the csum_offset value in the skb in order to flag if this is
possibly SCTP.  As far as I know for now there shouldn't be any other
protocols that are using the same offset, and if needed you can
actually parse the headers to verify if the frame is actually SCTP.

>>>>> +static struct sk_buff *sctp_gso_segment(struct sk_buff *skb,
>>>>> +                                       netdev_features_t features)
>>>>> +{
>>>>> +       struct sk_buff *segs = ERR_PTR(-EINVAL);
>>>>> +       struct sctphdr *sh;
>>>>> +
>>>>> +       sh = sctp_hdr(skb);
>>>>> +       if (!pskb_may_pull(skb, sizeof(*sh)))
>>>>> +               goto out;
>>>>> +
>>>>> +       __skb_pull(skb, sizeof(*sh));
>>>>> +
>>>>> +       if (skb_gso_ok(skb, features | NETIF_F_GSO_ROBUST)) {
>>>>> +               /* Packet is from an untrusted source, reset gso_segs.
>>>>> */
>>>>> +               int type = skb_shinfo(skb)->gso_type;
>>>>> +
>>>>> +               if (unlikely(type &
>>>>> +                            ~(SKB_GSO_SCTP | SKB_GSO_DODGY |
>>>>> +                              0) ||
>>>>> +                            !(type & (SKB_GSO_SCTP))))
>>>>> +                       goto out;
>>>>> +
>>>>> +               /* This should not happen as no NIC has SCTP GSO
>>>>> +                * offloading, it's always via software and thus we
>>>>> +                * won't send a large packet down the stack.
>>>>> +                */
>>>>> +               WARN_ONCE(1, "SCTP segmentation offloading to NICs is
>>>>> not supported.");
>>>>> +               goto out;
>>>>> +       }
>>>>> +
>>>>
>>>>
>>>> So what you are going to end up needing here is some way to tell the
>>>> hardware that you are doing the checksum no matter what.  There is no
>>>> value in you computing a 1's compliment checksum for the payload if
>>>> you aren't going to use it.  What you can probably do is just clear
>>>> the standard checksum flags and then OR in NETIF_F_HW_CSUM if
>>>> NETIF_F_SCTP_CRC is set and that should get skb_segment to skip
>>>> offloading the checksum.
>>>
>>>
>>> Interesting, ok
>>>
>>>> One other bit that will make this more complicated is if we ever get
>>>> around to supporting SCTP in tunnels.  Then we will need to sort out
>>>> how things like remote checksum offload should impact SCTP, and how to
>>>> deal with needing to compute both a CRC and 1's compliment checksum.
>>>> What we would probably need to do is check for encap_hdr_csum and if
>>>> it is set and we are doing SCTP then we would need to clear the
>>>> NETIF_F_HW_CSUM, NETIF_F_IP_CSUM, and NETIF_F_IPV6_CSUM flags.
>>>
>>>
>>> Yup. And that includes on storing pointers to where to store each of it.
>>
>>
>> Actually the pointers bit is easy.  The csum_start and csum_offset
>> values should be set up after you have segmented the skb and should be
>> updated after the skb has been segmented.  If nothing else you can
>> probably take a look at the TCP code tcp_gso_segment and
>> tcp4_gso_segment for inspiration.  Basically you need to make sure
>> that you set the ip_summed, csum_start, and csum_offset values for
>> your first frame before you start segmenting it into multiple frames.
>
>
> Ah yes, ok, that's for now, when not doing crc offloading with some chksum
> offloading (tunnel) too.

Actually that would be regardless of tunnel offloading.  We don't
store the outer checksum offsets.  If we need outer checksum we
restore them after the fact since the inner checksum offsets are
needed as part of the inner header TCP checksum computation.

>>>>> +       segs = skb_segment(skb, features);
>>>>> +       if (IS_ERR(segs))
>>>>> +               goto out;
>>>>> +
>>>>> +       /* All that is left is update SCTP CRC if necessary */
>>>>> +       for (skb = segs; skb; skb = skb->next) {
>>>>> +               if (skb->ip_summed != CHECKSUM_PARTIAL) {
>>>>> +                       sh = sctp_hdr(skb);
>>>>> +                       sh->checksum = sctp_gso_make_checksum(skb);
>>>>> +               }
>>>>> +       }
>>>>> +
>>>>
>>>>
>>>> Okay, so it looks like you are doing the right thing here and leaving
>>>> this as CHECKSUM_PARTIAL.
>>>
>>>
>>> Actually no then. sctp_gso_make_checksum() replaces it:
>>> +static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
>>> +{
>>> +       skb->ip_summed = CHECKSUM_NONE;
>>> +       return sctp_compute_cksum(skb, skb_transport_offset(skb));
>>>
>>> Why again would have to leave it as CHECKSUM_PARTIAL? IP header?
>>
>>
>> My earlier comment is actually incorrect.  This section is pretty much
>> broken since CHECKSUM_PARTIAL only reflects a 1's compliment checksum
>> in the case of skb_segment so whatever the value it is worthless.
>> CHECKSUM_PARTIAL is used to indicate if a given frame needs to be
>> offloaded.  It is meant to let the device know that it still needs to
>> compute a checksum or CRC beginning at csum_start and then storing the
>> new value at csum_offset.  However for skb_segment it is actually
>> referring to a 1's compliment checksum and if it returns CHECKSUM_NONE
>> it means it is stored in skb->csum which would really wreck things for
>> you since that was your skb->csum_start and skb->csum_offset values.
>> I have a patch to change this so that we update a checksum in the
>> SKB_GSO_CB, but I wasn't planning on submitting that until net-next
>> opens.
>
>
> sctp currently ignores skb->csum. It doesn't mess with the crc but computing
> it is at least not optimal, yes.

Actually sctp sets csum_start and csum_offset if it sets
CHECKSUM_PARTIAL.  So it does mess with skb->csum since it is
contained in a union with those two fields.

>> In the case of SCTP you probably don't even need to bother checking
>> the value since it is meaningless as skb_segment doesn't know how to
>> do an SCTP checksum anyway.  To that end for now what you could do is
>> just set NETIF_F_HW_CSUM.  This way skb_segment won't go and try to
>> compute a 1's compliment checksum on the payload since there is no
>> actual need for it.
>
>
> Nice, ok.
>
>> One other bit you will need to do is to check the value of SCTP_CRC
>> outside of skb_segment.  You might look at how
>> __skb_udp_tunnel_segment does this to populate its own offload_csum
>> boolean value, though you would want to use features, not
>> skb->dev->features as that is a bit of a workaround since features is
>> stripped by hw_enc_features in some paths if I recall correctly.
>
>>
>>
>> Once the frames are segmented and if you don't support the offload you
>> could then call gso_make_crc32c() or whatever you want to name it to
>> perform the CRC calculation and populate the field.  One question by
>
>
> Hmmm.. does it mean that we can use CHECKSUM_PARTIAL then even if CRC
> offloading is not possible then? Because the packet will not be offloaded in
> the end, yes, but this solves my questions above. Then while doing GSO, it
> re-evaluates if it can offload crc or not?

If you compute the CRC you set CHECKSUM_NONE, if you want the device
to do it on transmit you should set CHECKSUM_PARTIAL.

>> the way.  Don't you need to initialize the checksum value to 0 before
>> you compute it?  I think you might have missed that step when you were
>> setting this up.
>
>
> It's fine :) sctp_compute_cksum will replace it with zeroes, calculate, and
> put back the old value, which then we overwrite with the new one at
> sctp_gso_segment.

Right but there are scenarios where this will be offloaded isn't
there?  You would probably be better off setting the CRC to 0 before
you start segmentation and then that way you can either just set
csum_offset, csum_start and ip_summed if the lower device supports
SCTP CRC offload, otherwise you can just compute it without the need
to write the 0 into the header.
--
To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Networking Development]     [Linux OMAP]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux