> -----Original Message-----
> From: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> Sent: Friday, 13 January 2023 10:04
> To: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal <fw@xxxxxxxxx>;
> Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>; Long Xin
> <lxin@xxxxxxxxxx>; Claudio Porfiri <claudio.porfiri@xxxxxxxxxxxx>
> Subject: RE: [RFC PATCH] netfilter: conntrack: simplify sctp state machine
>
> > -----Original Message-----
> > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > Sent: Thursday, 12 January 2023 12:50
> > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal <fw@xxxxxxxxx>;
> > Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>; Long Xin
> > <lxin@xxxxxxxxxx>; Claudio Porfiri <claudio.porfiri@xxxxxxxxxxxx>
> > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp state
> > machine
> >
> > On Wed, Jan 11, 2023 at 09:36:38AM +0000, Sriram Yagnaraman wrote:
> > > > -----Original Message-----
> > > > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > > > Sent: Friday, 6 January 2023 01:50
> > > > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > > > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal
> > > > <fw@xxxxxxxxx>; Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>;
> > > > Long Xin <lxin@xxxxxxxxxx>; Claudio Porfiri
> > > > <claudio.porfiri@xxxxxxxxxxxx>
> > > > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp state
> > > > machine
> > > >
> > > > On Thu, Jan 05, 2023 at 12:11:44PM +0000, Sriram Yagnaraman wrote:
> > > > > > -----Original Message-----
> > > > > > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > > > > > Sent: Thursday, 5 January 2023 12:54
> > > > > > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > > > > > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal
> > > > > > <fw@xxxxxxxxx>; Marcelo Ricardo Leitner <mleitner@xxxxxxxxxx>;
> > > > > > Long Xin <lxin@xxxxxxxxxx>; Claudio Porfiri
> > > > > > <claudio.porfiri@xxxxxxxxxxxx>
> > > > > > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify sctp
> > > > > > state machine
> > > > > >
> > > > > > On Thu, Jan 05, 2023 at 11:41:13AM +0000, Sriram Yagnaraman wrote:
> > > > > > > > -----Original Message-----
> > > > > > > > From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> > > > > > > > Sent: Wednesday, 4 January 2023 16:02
> > > > > > > > To: Sriram Yagnaraman <sriram.yagnaraman@xxxxxxxx>
> > > > > > > > Cc: netfilter-devel@xxxxxxxxxxxxxxx; Florian Westphal
> > > > > > > > <fw@xxxxxxxxx>; Marcelo Ricardo Leitner
> > > > > > > > <mleitner@xxxxxxxxxx>; Long Xin <lxin@xxxxxxxxxx>
> > > > > > > > Subject: Re: [RFC PATCH] netfilter: conntrack: simplify
> > > > > > > > sctp state machine
> > > > > > > >
> > > > > > > > On Wed, Jan 04, 2023 at 12:31:43PM +0100, Sriram Yagnaraman wrote:
> > > > > > > > > All the paths in an SCTP connection are kept alive
> > > > > > > > > either by actual DATA/SACK running through the
> > > > > > > > > connection or by HEARTBEAT.
> > > > > > > > > This patch proposes a simple state machine with only two
> > > > > > > > > states, OPEN_WAIT and ESTABLISHED (similar to UDP). The
> > > > > > > > > reason for this change is that a fully stateful approach
> > > > > > > > > to SCTP is difficult when the association is multihomed,
> > > > > > > > > since the endpoints could use different paths in the
> > > > > > > > > network during the lifetime of an association.
> > > > > > > >
> > > > > > > > Do you mean the router/firewall might not see all packets
> > > > > > > > when the association is multihomed?
> > > > > > > >
> > > > > > > > Could you please provide an example?
> > > > > > >
> > > > > > > Let's say the primary and alternate/secondary paths between
> > > > > > > the SCTP endpoints traverse different middleboxes.
> > > > > > > If an SCTP endpoint detects network failure on the primary
> > > > > > > path, it will switch to using the secondary path and all
> > > > > > > subsequent packets will not be seen by the middlebox on the
> > > > > > > primary path, including SHUTDOWN sequences if they happen at
> > > > > > > that time.
> > > > > >
> > > > > > OK, then on the primary middlebox the SCTP flow will just time out?
> > > > > > (because no more packets are seen).
> > > > >
> > > > > Yes, the entry will time out unless the primary path comes up
> > > > > before the SHUTDOWN sequence. And the default timeout for an
> > > > > ESTABLISHED SCTP connection is 5 days, which is a "long" time
> > > > > to clean up this entry.
> > > >
> > > > Does the middlebox have a chance to see any packet that provides
> > > > a hint to shorten this timeout? Are no HEARTBEAT packets seen in
> > > > this case on the former primary path?
> > >
> > > There will be HEARTBEAT as soon as a path becomes unreachable from
> > > the SCTP endpoints. But depending on the location of the network
> > > failure, the middlebox may or may not see the HEARTBEAT.
> >
> > Conntrack assumes you have seen all traffic that belongs to the flow
> > for other protocols too.
> >
> > > Also, HEARTBEAT is sent when there is no data to be transmitted or
> > > if the path is unreachable/unconfirmed, so I think there is no
> > > deterministic way of finding out when to shorten the timeout. Of
> > > course, a user has the option of setting the ESTABLISHED state
> > > timeout to a more reasonable value, e.g., the same as the
> > > HEARTBEAT_ACKED state timeout (210 sec), OR we could reduce the
> > > default timeout of ESTABLISHED to 210 sec.
> >
> > Then just set up a short ESTABLISHED timeout when multihoming is in
> > place from the beginning.
> >
> > > > What I am missing is a more detailed list of issues with the
> > > > existing approach.
> > > > Your patch description says "SCTP tracking with
> > > > multihoming is difficult"; probably a list of scenarios would help
> > > > to understand the motivation to simplify the state machine.
> > >
> > > Thank you for reviewing and asking these questions, it made me step
> > > back and think. I list some background below:
> > > - I want to simplify the state machine, because it is possible to
> > > track an SCTP connection with fewer states, e.g., SCTP with UDP
> > > encapsulation uses UDP conntrack with just UNREPLIED/REPLIED states
> > > and it works perfectly fine
> >
> > I think it would be preferable to add some configuration via ruleset
> > to track SCTP over UDP, rather than deranking SCTP to become almost
> > stateless.
>
> Okay 😊
>
> > > - My stakeholders, at whose behest I am proposing these changes,
> > > hit some problems running SCTP client endpoints behind NAT
> > > (inside Kubernetes pods) towards multihomed SCTP server endpoints;
> > > see (1a-g) and (2a-c) below
> > > - Some upcoming SCTP protocol changes in IETF (if
> > > approved/implemented) will make it hard to read beyond the SCTP
> > > common header, e.g., DTLS over SCTP
> > > https://datatracker.ietf.org/doc/draft-ietf-tsvwg-dtls-over-sctp-bis/
> > > proposes to encrypt all SCTP chunks, so conntrack will only be able
> > > to see the SCTP common header. This patch will hopefully make it
> > > easier to adapt to such changes in the SCTP protocol
> > > - While at it, I also made some other "improvements"
> >
> > For this DTLS case it should be possible to fall back to the SCTP
> > "stateless" approach.
> >
> > > a) Avoid multiple walk-throughs of SCTP chunks in sctp_new(),
> > > sctp_basic_checks() and nf_conntrack_sctp_packet(), and parse them
> > > only once
> > > b) SCTP conntrack has the same state regardless of whether it is a
> > > primary or a secondary path
> > >
> > > Let's say there are two SCTP endpoints A and B, with addresses A'
> > > and B', B'' respectively.
> > > Primary path is A' <----> B' that traverses middlebox C, and
> > > secondary path is A' <----> B'' that traverses middlebox D.
> > > 1) SHUTDOWN sent on secondary path
> > > 1a) SCTP endpoint A sets up an association towards SCTP endpoint B
> > > 1b) Middlebox C sees INIT sequence and creates "primary" conntrack
> > > entry (5 days)
> > > 1c) Middlebox D sees HEARTBEAT sequence and creates "secondary"
> > > conntrack entry (210 seconds)
> > > 1d) Path failure between A and C, and SCTP endpoint A switches to
> > > secondary path and continues sending data on the association
> > > 1e) SCTP endpoint A decides to SHUTDOWN the connection
> > > 1f) Middlebox C is in ESTABLISHED state, doesn't see any SHUTDOWN
> > > sequence or HEARTBEAT, waits for timeout (5 days)
> > > 1g) Middlebox D is in HEARTBEAT_ACKED state, doesn't care about
> > > SHUTDOWN sequence, waits for timeout (210 seconds)
> >
> > I guess a similar problem will occur with MP-TCP, and I am not sure
> > making TCP more stateless is the way to address this.
>
> Ok, I am a newbie to this area and am most probably mistaken, so forgive my
> naive question below.
> Shouldn't conntrack understand as little as possible about the protocol, and
> parse the bare minimum from the packet to detect an active connection?
> For packet filtering/firewall, I understand we will need deep packet inspection,
> but is conntrack the place to do that?
>
> > > 2) Recently fixed by bff3d0534804 ("netfilter: conntrack: add sctp
> > > DATA_SENT state")
> > > 2a) SCTP endpoint A sets up an association towards SCTP endpoint B
> > > 2b) Middlebox C sees INIT sequence and creates "primary" conntrack
> > > entry (5 days)
> > > 2c) Middlebox D sees DATA/SACK, and DROPS packets until HEARTBEAT is
> > > seen to set up "secondary" conntrack entry (210 seconds)
> >
> > I assume this is already fixed.
> >
> > Another possibility would be to introduce this alternative
> > state machine and use it for multihoming?
>
> Or I could unify the established states for connections that saw an
> INIT/INIT_ACK sequence and connections that saw a HEARTBEAT/HEARTBEAT_ACK
> sequence, and use the HEARTBEAT_ACKED state timeout for both. That way,
> there is no difference from a conntrack perspective between "primary" and
> "secondary" connections. I can send another patch if the group here thinks
> this is a good idea.

Here it is: https://lore.kernel.org/netfilter-devel/20230116093556.9437-1-sriram.yagnaraman@xxxxxxxx/T/#t
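
To illustrate the unification idea, here is a minimal Python sketch (not kernel code). The two timeout values are the ones quoted in this thread (5 days and 210 seconds). The names Handshake, current_timeout and unified_timeout are made up for this example and do not correspond to anything in nf_conntrack.

```python
from enum import Enum


class Handshake(Enum):
    """Sequence a middlebox saw when it created the entry (hypothetical names)."""
    INIT = "INIT/INIT_ACK"
    HEARTBEAT = "HEARTBEAT/HEARTBEAT_ACK"


# Timeouts in seconds, as quoted in the thread.
ESTABLISHED_TIMEOUT = 5 * 24 * 60 * 60  # 5 days, "primary" entry
HEARTBEAT_ACKED_TIMEOUT = 210           # "secondary" entry


def current_timeout(seen: Handshake) -> int:
    """Scenario 1 (1b, 1c): the timeout depends on which sequence was seen."""
    if seen is Handshake.INIT:
        return ESTABLISHED_TIMEOUT
    return HEARTBEAT_ACKED_TIMEOUT


def unified_timeout(seen: Handshake) -> int:
    """Proposal: a single established state with the HEARTBEAT_ACKED timeout."""
    return HEARTBEAT_ACKED_TIMEOUT


if __name__ == "__main__":
    for seen in Handshake:
        print(f"{seen.value}: current={current_timeout(seen)}s, "
              f"unified={unified_timeout(seen)}s")
```

With the unified variant, middlebox C in scenario 1 would drop its stale entry after 210 seconds of silence instead of 5 days, at the cost of depending on HEARTBEAT or data traffic to keep idle associations tracked.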