On 2022-10-31 09:38, Florian Westphal wrote:
> sriram.yagnaraman@xxxxxxxx <sriram.yagnaraman@xxxxxxxx> wrote:
>> From: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
>>
>> This patch introduces a new proc entry to disable source port
>> randomization for SCTP connections.
>
> Hmm. Can you elaborate? The sport is never randomized, unless either
> 1. User explicitly requested it via "random" flag passed to snat rule, or
> 2. there is an existing connection, using the *same* sport:saddr -> daddr:dport
>    quadruple as the new request.
>
> In 2), this new toggle prevents communication. So I wonder why ...

Thank you so much for the detailed review comments.

My use case for this flag originates from a deployment of SCTP client
endpoints in docker/kubernetes environments, where there are typically
SNAT rules for the endpoints on egress. The *user* in this case is the
CNI plugin that configures the SNAT rules, and some of the most common
plugins use --random-fully regardless of the protocol.

Consider an SCTP association A -> B, which has two paths, one via NAT A
and one via NAT B:

A: 1.2.3.4:12345
B: 5.6.7.8:42 and 5.6.7.9:42
NAT A: 1.2.31.4 (used for path towards 5.6.7.8)
NAT B: 1.2.32.4 (used for path towards 5.6.7.9)

             ┌───────┐   ┌───┐
          ┌──► NAT A ├───►   │
┌─────┐   │  └───────┘   │   │
│  A  ├───┤              │ B │
└─────┘   │  ┌───────┐   │   │
          └──► NAT B ├───►   │
             └───────┘   └───┘

Let us assume that in NAT A (1.2.31.4), the connection is set up as
ORIGINAL TUPLE                  REPLY TUPLE
1.2.3.4:12345 -> 5.6.7.8:42,    5.6.7.8:42 -> 1.2.31.4:33333

Let us assume that in NAT B (1.2.32.4), the connection is set up as
ORIGINAL TUPLE                  REPLY TUPLE
1.2.3.4:12345 -> 5.6.7.9:42,    5.6.7.9:42 -> 1.2.32.4:44444

Since the source port numbers differ when viewed from B, the association
will not become multihomed, and only the primary path will be active.
Moreover, on a NAT/middlebox restart, we will end up getting new ports.

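
To make the two-path example above concrete, here is a minimal sketch in
Python. It is only a toy model of what B observes, not netfilter code;
the names Endpoint, NatEntry and single_port_for_peer are made up for
this illustration, and the ports are the ones assumed in the example.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    ip: str
    port: int


class NatEntry(NamedTuple):
    original_src: Endpoint    # source as sent by A
    original_dst: Endpoint    # destination address on B
    translated_src: Endpoint  # source as seen by B after SNAT


def single_port_for_peer(entries):
    """True if every path presents the same source port to the peer."""
    return len({e.translated_src.port for e in entries}) == 1


A = Endpoint("1.2.3.4", 12345)

# Randomized case: ports 33333 and 44444 are taken from the example above.
randomized = [
    NatEntry(A, Endpoint("5.6.7.8", 42), Endpoint("1.2.31.4", 33333)),
    NatEntry(A, Endpoint("5.6.7.9", 42), Endpoint("1.2.32.4", 44444)),
]

# Same paths, assuming both NATs keep the original source port.
preserved = [
    NatEntry(A, Endpoint("5.6.7.8", 42), Endpoint("1.2.31.4", A.port)),
    NatEntry(A, Endpoint("5.6.7.9", 42), Endpoint("1.2.32.4", A.port)),
]

print(single_port_for_peer(randomized))  # False
print(single_port_for_peer(preserved))   # True
```

Only the second case gives B one port for all of A's transport addresses,
which is what the multihomed association needs.
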
I understand this is a problem in the way the SNAT rules are configured.
My proposal was to have this flag as a means of preventing such a problem
even if the user requested port randomization.

>> As specified in RFC9260 all transport addresses used by an SCTP endpoint
>> MUST use the same port number but can use multiple IP addresses. That
>> means that all paths taken within an SCTP association should have the
>> same port even if they pass through different NAT/middleboxes in the
>> network.
>
> ... the rfc mandates this, especially given the fact that endpoints have
> 0 control on middlebox behaviour.
>
> This flag will completely prevent communication in case another
> middlebox does sport randomization, so I wonder why it's needed -- I see
> no advantages but I see a downside.

Since the flag is optional, the idea is to enable it only on hosts that
are part of docker/kubernetes environments and use NAT in their datapath.

>> Disabling source port randomization provides a deterministic source port
>> for the remote SCTP endpoint for all paths used in the SCTP association.
>> On NAT/middlebox restarts we will always end up with the same port after
>> the restart, and the SCTP endpoints involved in the SCTP association can
>> remain transparent to restarts.
>
> Can you elaborate? If we're the middlebox and we restarted, we have no
> record of the "old" incarnation so there is no sport reallocation.
>
>> Of course, there is a downside as this makes it impossible to have
>> multiple SCTP endpoints behind NAT that use the same source port.
>
> Hmm? Not following.
>
> 1.2.3.4:12345 -> 5.6.7.8:42
> 1.2.3.4:12345 -> 5.6.7.8:43
>
> ... should work fine. Same for
> 1.2.3.4:12345 -> 5.6.7.8:42
> 1.2.3.4:12345 -> 5.6.7.9:42

I meant to note the limitation you rightly pointed out above: when there
is an existing connection using the *same* sport:saddr -> dport:daddr
after translation, the new connection attempt will be dropped. For example:

1.2.3.41:12345 -> 5.6.7.8:42 (Existing connection)
1.2.3.42:12345 -> 5.6.7.8:42 (This connection request will fail)

Both will be translated to the same, conflicting reply tuple:
1.2.3.40:12345 <- 5.6.7.8:42
1.2.3.40:12345 <- 5.6.7.8:42

>> But, this is a lesser of a problem than losing an existing association
>> altogether.
>
> Can you elaborate? How is an existing association lost?
> For example, what sequence of events is needed to result in loss of
> an existing association?

Consider two SCTP associations, A -> C and B -> C:

A: 1.2.3.41:12345
B: 1.2.3.42:12345
C: 5.6.7.8:42
NAT public IP: 1.2.3.40

┌─────┐
│  A  ├───┐
└─────┘   │    ┌─────┐   ┌─────┐
          ├────► NAT ├───►  C  │
┌─────┐   │    └─────┘   └─────┘
│  B  ├───┘
└─────┘

Let us assume that in the NAT (1.2.3.40), the connections are set up as
ORIGINAL TUPLE                  REPLY TUPLE
1.2.3.41:12345 -> 5.6.7.8:42,   5.6.7.8:42 -> 1.2.3.40:12345
1.2.3.42:12345 -> 5.6.7.8:42,   5.6.7.8:42 -> 1.2.3.40:44444

After a restart, there is a chance that the connections will be set up as
ORIGINAL TUPLE                  REPLY TUPLE
1.2.3.41:12345 -> 5.6.7.8:42,   5.6.7.8:42 -> 1.2.3.40:55555
1.2.3.42:12345 -> 5.6.7.8:42,   5.6.7.8:42 -> 1.2.3.40:12345

From C's point of view, the port numbers for the two associations will be
different after the restart, hence there can be no communication using
the existing associations.
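
The restart scenario above can be sketched with a toy NAT table in
Python. This is a simplified model for illustration and not how
netfilter is implemented: the ToyNat class, its allow_port_rewrite
switch and the spare_ports list (a stand-in for the real port allocator)
are all hypothetical. It also assumes that after the restart B's traffic
reaches the NAT before A's.

```python
class ToyNat:
    """Toy SNAT table keyed by reply tuple; not the netfilter implementation."""

    def __init__(self, public_ip, allow_port_rewrite, spare_ports=()):
        self.public_ip = public_ip
        self.allow_port_rewrite = allow_port_rewrite
        self.spare_ports = list(spare_ports)  # stand-in for a port allocator
        self.reply_tuples = {}  # (dst, (public_ip, port)) -> original source

    def connect(self, src, dst):
        """Return the source port the peer will see, or None if dropped."""
        port = src[1]
        if (dst, (self.public_ip, port)) in self.reply_tuples:
            if not self.allow_port_rewrite:
                return None  # reply tuple clash and no rewrite allowed
            free = [p for p in self.spare_ports
                    if (dst, (self.public_ip, p)) not in self.reply_tuples]
            if not free:
                return None  # nothing left to hand out in this toy model
            port = free[0]
        self.reply_tuples[(dst, (self.public_ip, port))] = src
        return port


A, B, C = ("1.2.3.41", 12345), ("1.2.3.42", 12345), ("5.6.7.8", 42)

# Before the restart: A connects first and keeps 12345, B is rewritten.
before = ToyNat("1.2.3.40", allow_port_rewrite=True, spare_ports=[44444])
ports_before = {"A": before.connect(A, C), "B": before.connect(B, C)}

# After the restart the table is empty; assume B's traffic arrives first.
after = ToyNat("1.2.3.40", allow_port_rewrite=True, spare_ports=[55555])
ports_after = {"B": after.connect(B, C), "A": after.connect(A, C)}

# With port rewriting disabled, the clashing second connection is dropped.
strict = ToyNat("1.2.3.40", allow_port_rewrite=False)
ports_strict = {"A": strict.connect(A, C), "B": strict.connect(B, C)}

print(ports_before)  # {'A': 12345, 'B': 44444}
print(ports_after)   # {'B': 12345, 'A': 55555}
print(ports_strict)  # {'A': 12345, 'B': None}
```

In this model the ports C sees for A and B change across the restart
when rewriting is allowed, while with rewriting disabled A keeps 12345
and B's clashing attempt is dropped, which is the trade-off described
above.
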