1. Describe your problem:

Since commit 69fec325a64383667b8a35df5d48d6ce52fb2782, SCTP behaves
differently with respect to GSO offload.

    commit 69fec325a64383667b8a35df5d48d6ce52fb2782
    Author: Xin Long <lucien.xin@xxxxxxxxx>
    Date:   Sun Nov 18 16:14:47 2018 +0800

        Revert "sctp: remove sctp_transport_pmtu_check"

        This reverts commit 22d7be267eaa8114dcc28d66c1c347f667d7878a.

        The dst's mtu in transport can be updated by a non sctp place like
        in xfrm where the MTU information didn't get synced between asoc,
        transport and dst, so it is still needed to do the pmtu check
        in sctp_packet_config.

        Acked-by: Neil Horman <nhorman@xxxxxxxxxxxxx>
        Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>

The SCTP GSO behaviour differs when spp_pathmtu is intentionally
configured below the MTU of the bound interface (and route) and PMTUD is
disabled.

2. Topology:

Sender <- (MTU 4098) -> router <- (MTU 1500) -> Receiver

3. To reproduce the case:

   1. configure spp_flags with SPP_PMTUD_DISABLE set and SPP_PMTUD_ENABLE
      cleared
   2. configure spp_pathmtu to 1392
   3. send a 2000-byte message

4.
Log snippets comparing the behavior:

Behavior before commit 69fec325a64383667b8a35df5d48d6ce52fb2782:

The IP packet gets 2 chunks bundled, to be offloaded by the kernel:

2004-01-01T00:06:38.231152+00:00 fct [debug] kernel: [ 420.781878] sctp: ***sctp_transmit_packet***
2004-01-01T00:06:38.231159+00:00 fct [debug] kernel: [ 420.781887] sctp: *** Chunk:80000000842d7100[DATA] TSN 0xf065bb80, length:1360, chunk->skb->len:1360, rtt_in_progress:1
2004-01-01T00:06:38.231166+00:00 fct [debug] kernel: [ 420.781895] sctp: *** Chunk:80000000842d7800[DATA] TSN 0xf065bb81, length:672, chunk->skb->len:672, rtt_in_progress:0
2004-01-01T00:06:38.231174+00:00 fct [debug] kernel: [ 420.781901] sctp: ***sctp_transmit_packet*** (head)skb->len:2044

Frame is offloaded:

root@fct:~ >tcpdump -i rio0m3 sctp -vvNep
tcpdump: listening on rio0m3, link-type EN10MB (Ethernet), capture size 262144 bytes
    fct.35341 > 192.168.253.16.12345: sctp
        1) [INIT] [init tag: 2900307017] [rwnd: 1048576] [OS: 10] [MIS: 65535] [init TSN: 4033198976]
00:06:37.693554 0e:00:00:03:10:11 (oui Unknown) > 0e:00:00:03:20:11 (oui Unknown), ethertype IPv4 (0x0800), length 312: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 292)
    192.168.253.16.12345 > fct.35341: sctp
        1) [INIT ACK] [init tag: 2511511675] [rwnd: 1048576] [OS: 10] [MIS: 10] [init TSN: 4053520386]
00:06:37.693709 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 278: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 264)
    fct.35341 > 192.168.253.16.12345: sctp
        1) [COOKIE ECHO]
00:06:37.693841 0e:00:00:03:10:11 (oui Unknown) > 0e:00:00:03:20:11 (oui Unknown), ethertype IPv4 (0x0800), length 56: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 36)
    192.168.253.16.12345 > fct.35341: sctp
        1) [COOKIE ACK]
00:06:38.227178 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 1406: (tos 0x2,ECT(0), ttl 64, id 31722, offset 0, flags [none], proto SCTP (132), length 1392)
    fct.35341 > 192.168.253.16.12345: sctp
        1) [DATA] (B) [TSN: 4033198976] [SID: 0] [SSEQ 0] [PPID SBc-AP] [Payload:
        0x0000: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        0x0010: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        ...
        0x0520: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        0x0530: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA]
00:06:38.227194 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 718: (tos 0x2,ECT(0), ttl 64, id 31723, offset 0, flags [none], proto SCTP (132), length 704)
    fct.35341 > 192.168.253.16.12345: sctp
        1) [DATA] (E) [TSN: 4033198977] [SID: 0] [SSEQ 0] [PPID SBc-AP] [Payload:
        0x0000: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        0x0010: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        ...
        0x0270: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        0x0280: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA]
00:06:38.227307 0e:00:00:03:10:11 (oui Unknown) > 0e:00:00:03:20:11 (oui Unknown), ethertype IPv4 (0x0800), length 64: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 48)
    192.168.253.16.12345 > fct.35341: sctp
        1) [SACK] [cum ack 4033198976] [a_rwnd 1047232] [#gap acks 0] [#dup tsns 0]
00:06:38.429054 0e:00:00:03:10:11 (oui Unknown) > 0e:00:00:03:20:11 (oui Unknown), ethertype IPv4 (0x0800), length 64: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 48)
    192.168.253.16.12345 > fct.35341: sctp
        1) [SACK] [cum ack 4033198977] [a_rwnd 1048576] [#gap acks 0] [#dup tsns 0]
00:06:39.301336 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 54: (tos 0x2,ECT(0), ttl 64, id 31758, offset 0, flags [none], proto SCTP (132), length 40)
    fct.35341 > 192.168.253.16.12345: sctp
        1) [SHUTDOWN]
00:06:39.301463 0e:00:00:03:10:11 (oui Unknown) > 0e:00:00:03:20:11 (oui Unknown), ethertype IPv4 (0x0800), length 56: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 36)
    192.168.253.16.12345 > fct.35341: sctp
        1) [SHUTDOWN ACK]
00:06:39.301602 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 50: (tos 0x2,ECT(0), ttl 64, id 31759, offset 0, flags [none], proto SCTP (132), length 36)
    fct.35341 > 192.168.253.16.12345: sctp
        1) [SHUTDOWN COMPLETE]
^C
11 packets captured
11 packets received by filter
0 packets dropped by kernel

Behavior after commit 69fec325a64383667b8a35df5d48d6ce52fb2782:

Same as above, the IP packet gets 2 chunks bundled, to be offloaded by the kernel:

2004-01-01T00:31:37.478791+00:00 fct [debug] kernel: [ 1920.446450] sctp: sctp_packet_transmit: packet:0000000092f0f056
2004-01-01T00:31:37.478797+00:00 fct [debug] kernel: [ 1920.446460] sctp: *** Chunk:00000000bdbc0d5e[DATA] TSN 0x5927a1ff, length:1360, chunk->skb->len:1360, rtt_in_progress:1
2004-01-01T00:31:37.478804+00:00 fct [debug] kernel: [ 1920.446466] sctp: *** Chunk:0000000003d69060[DATA] TSN 0x5927a200, length:672, chunk->skb->len:672, rtt_in_progress:0
2004-01-01T00:31:37.478811+00:00 fct [debug] kernel: [ 1920.446483] sctp: ***sctp_transmit_packet*** skb->len:2044

Frame is not offloaded:

root@fct:~ >tcpdump -i rio0m3 sctp -vvNep
tcpdump: listening on rio0m3, link-type EN10MB (Ethernet), capture size 262144 bytes
00:31:36.755633 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 82: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 68)
    fct.56496 > 192.168.253.16.12345: sctp
        1) [INIT] [init tag: 2640168972] [rwnd: 1048576] [OS: 10] [MIS: 65535] [init TSN: 1495769599]
00:31:36.755852 0e:00:00:03:10:11 (oui Unknown) > 0e:00:00:03:20:11 (oui Unknown), ethertype IPv4 (0x0800), length 312: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 292)
    192.168.253.16.12345 > fct.56496: sctp
        1) [INIT ACK] [init tag: 321187571] [rwnd: 1048576] [OS: 10] [MIS: 10] [init TSN: 4058089694]
00:31:36.757278 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 278: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 264)
    fct.56496 > 192.168.253.16.12345: sctp
        1) [COOKIE ECHO]
00:31:36.757418 0e:00:00:03:10:11 (oui Unknown) > 0e:00:00:03:20:11 (oui Unknown), ethertype IPv4 (0x0800), length 56: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 36)
    192.168.253.16.12345 > fct.56496: sctp
        1) [COOKIE ACK]
00:31:37.477246 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 2078: (tos 0x2,ECT(0), ttl 64, id 47270, offset 0, flags [none], proto SCTP (132), length 2064)
    fct.56496 > 192.168.253.16.12345: sctp
        1) [DATA] (B) [TSN: 1495769599] [SID: 0] [SSEQ 0] [PPID SBc-AP] [Payload:
        0x0000: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        0x0010: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        ...
        0x0520: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        0x0530: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA]
        2) [DATA] (E) [TSN: 1495769600] [SID: 0] [SSEQ 0] [PPID SBc-AP] [Payload:
        0x0000: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        0x0010: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        ...
        0x0270: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
        0x0280: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA]
00:31:37.477368 0e:00:00:03:10:11 (oui Unknown) > 0e:00:00:03:20:11 (oui Unknown), ethertype IPv4 (0x0800), length 64: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 48)
    192.168.253.16.12345 > fct.56496: sctp
        1) [SACK] [cum ack 1495769600] [a_rwnd 1046576] [#gap acks 0] [#dup tsns 0]
00:31:38.630397 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 54: (tos 0x2,ECT(0), ttl 64, id 47554, offset 0, flags [none], proto SCTP (132), length 40)
    fct.56496 > 192.168.253.16.12345: sctp
        1) [SHUTDOWN]
00:31:38.630537 0e:00:00:03:10:11 (oui Unknown) > 0e:00:00:03:20:11 (oui Unknown), ethertype IPv4 (0x0800), length 56: (tos 0x2,ECT(0), ttl 64, id 0, offset 0, flags [DF], proto SCTP (132), length 36)
    192.168.253.16.12345 > fct.56496: sctp
        1) [SHUTDOWN ACK]
00:31:38.630682 0e:00:00:03:20:11 (oui Unknown) > 0e:00:00:03:10:11 (oui Unknown), ethertype IPv4 (0x0800), length 50: (tos 0x2,ECT(0), ttl 64, id 47555, offset 0, flags [none], proto SCTP (132), length 36)
    fct.56496 > 192.168.253.16.12345: sctp
        1) [SHUTDOWN COMPLETE]

9 packets captured
9 packets received by filter
0 packets dropped by kernel

5. Summary:

The kernel's decision whether to offload an skb is based on
skb_shinfo(head)->gso_size (GSO_BY_FRAGS), which is not set correctly
after transport->pathmtu has been overwritten. As a result, the internal
SCTP gso variable is not set correctly, so the subsequent skb
configuration is not applied:

    if (gso) {
            memset(head->cb, 0, max(sizeof(struct inet_skb_parm),
                                    sizeof(struct inet6_skb_parm)));
            skb_shinfo(head)->gso_segs = pkt_count;
            skb_shinfo(head)->gso_size = GSO_BY_FRAGS;

As a consequence, the required offload is not performed and the kernel's
behavior changes.

6. We need to fix this because:

   1. this change of behavior (overwriting transport->pathmtu) violates
      the user configuration.
   2.
      once the non-offloaded frame has been passed from the sender to the
      router, the router has to fragment it to deliver it to the receiver,
      or the frame might be dropped.

7. The proposed change is to make the transport->pathmtu update dependent
on spp_flags. If SPP_PMTUD_DISABLE is set for the transport,
sctp_transport_pmtu_check() returns true without updating
transport->pathmtu.

Jacek Szafraniec (1):
  sctp: do not update t->pathmtu when PMTUD is disabled

 include/net/sctp/sctp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.10.2
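
For illustration only, the check described in section 7 can be modelled in user space roughly as follows. This is a sketch, not the actual patch: struct model_transport, model_pmtu_check() and the MODEL_* flag values are hypothetical stand-ins for struct sctp_transport, sctp_transport_pmtu_check() and the SPP_* flags, and dst_mtu stands in for the MTU reported by the route.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values only; the real SPP_* flags come from the SCTP headers. */
#define MODEL_SPP_PMTUD_ENABLE  (1u << 3)
#define MODEL_SPP_PMTUD_DISABLE (1u << 4)

/* Hypothetical, cut-down stand-in for struct sctp_transport. */
struct model_transport {
    uint32_t pathmtu;     /* path MTU currently used for this transport */
    uint32_t dst_mtu;     /* MTU reported by the route (dst) */
    uint32_t param_flags; /* user-configured SPP_* flags */
};

/*
 * Returns true when pathmtu needs no update.
 * Returns false after overwriting pathmtu with the route MTU.
 * With PMTUD disabled the user-configured pathmtu is left untouched.
 */
static bool model_pmtu_check(struct model_transport *t)
{
    assert(t != NULL);

    if (t->param_flags & MODEL_SPP_PMTUD_DISABLE)
        return true;

    if (t->pathmtu == t->dst_mtu)
        return true;

    t->pathmtu = t->dst_mtu;
    return false;
}
```

With the values from the reproduction steps (pathmtu 1392, route MTU 4098, PMTUD disabled), the model keeps pathmtu at 1392. With PMTUD enabled, the first call overwrites it with 4098.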